v3.0 Pre-Release
🚨 Breaking Changes
⚠️ This release introduces breaking changes in preparation for DeepEval v3.0.
Please review carefully and adjust your code as needed.
The `evaluate()` function now has "configs"
Previously, the `evaluate()` function had 13+ arguments to control display, async behavior, caching, etc., and it was growing out of control. We've now abstracted these into "configs" instead:
```python
from deepeval.evaluate.configs import AsyncConfig
from deepeval import evaluate

# Async behavior now lives in AsyncConfig rather than standalone keyword arguments
evaluate(..., async_config=AsyncConfig(max_concurrent=20))
```
Full docs here: https://www.deepeval.com/docs/evaluation-running-llm-evals#configs-for-evaluate
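To make the ellipsis above concrete, here is a minimal, self-contained sketch of a full `evaluate()` call in the new config style. The test case and metric are illustrative placeholders; only `async_config=AsyncConfig(max_concurrent=20)` comes from the snippet above.

```python
from deepeval import evaluate
from deepeval.evaluate.configs import AsyncConfig
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

# Illustrative test case and metric; swap in your own
test_case = LLMTestCase(
    input="What does DeepEval do?",
    actual_output="DeepEval is an open-source framework for evaluating LLM applications.",
)

evaluate(
    test_cases=[test_case],
    metrics=[AnswerRelevancyMetric()],
    async_config=AsyncConfig(max_concurrent=20),  # concurrency cap from the example above
)
```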
Red Teaming Officially Migrated to DeepTeam
This shouldn't be a surprise, but DeepTeam now takes care of everything red-teaming related for the foreseeable future. Docs here: https://trydeepteam.com
🥳 New Feature
Dynamic Evaluations for Nested Components
Nested components are a mess to evaluate. In this version, in preparation for v3.0, we've introduced dynamic evals, which let you apply a different set of metrics to different components in your LLM application:
```python
import openai

from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.tracing import observe, update_current_span_test_case

# Attach metrics to this component; they run on the test case set for this span
@observe(metrics=[AnswerRelevancyMetric()])
def complete(query: str):
    response = openai.ChatCompletion.create(
        model="gpt-4o", messages=[{"role": "user", "content": query}]
    ).choices[0].message["content"]
    # Set this span's test case so the attached metrics can score it
    update_current_span_test_case(
        test_case=LLMTestCase(input=query, actual_output=response)
    )
    return response
```
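Because dynamic evals are per-component, the same pattern composes across nested components, each carrying its own metric set. The sketch below is illustrative: the `retrieve`/`rag_pipeline` functions, their placeholder logic, and the choice of ContextualRelevancyMetric for the retriever span are assumptions, not part of the release.

```python
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric, ContextualRelevancyMetric
from deepeval.tracing import observe, update_current_span_test_case

# Inner component: scored on how relevant the retrieved context is to the query
@observe(metrics=[ContextualRelevancyMetric()])
def retrieve(query: str) -> list[str]:
    chunks = ["DeepEval is an open-source LLM evaluation framework."]  # stand-in retriever
    update_current_span_test_case(
        test_case=LLMTestCase(input=query, actual_output="\n".join(chunks), retrieval_context=chunks)
    )
    return chunks

# Outer component: scored only on answer relevancy
@observe(metrics=[AnswerRelevancyMetric()])
def rag_pipeline(query: str) -> str:
    context = retrieve(query)
    answer = f"Based on what I found: {context[0]}"  # stand-in for an LLM call
    update_current_span_test_case(test_case=LLMTestCase(input=query, actual_output=answer))
    return answer
```

Each `@observe` span gets its own test case, so the metrics attached to a component score only that component's inputs and outputs.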
Full docs here: https://www.deepeval.com/docs/evaluation-running-llm-evals#setup-tracing-highly-recommended