LLM - Evaluation
Evaluation
Overall
Sample
LlamaIndex
```python
from deepeval.integrations.llama_index import (
    DeepEvalAnswerRelevancyEvaluator,
    DeepEvalFaithfulnessEvaluator,
    DeepEvalContextualRelevancyEvaluator,
    DeepEvalSummarizationEvaluator,
    DeepEvalBiasEvaluator,
    DeepEvalToxicityEvaluator,
)
```
Evaluating Response Faithfulness (i.e. Hallucination)
- The `FaithfulnessEvaluator` evaluates whether the answer is faithful to the retrieved contexts (in other words, whether there is hallucination).
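The intuition behind a faithfulness check can be sketched with a toy heuristic. The real evaluators ask an LLM judge whether each claim is supported by the contexts; the word-overlap scoring below is purely illustrative and not how DeepEval or LlamaIndex implement it:

```python
def toy_faithfulness_score(answer: str, contexts: list[str]) -> float:
    """Illustrative only: fraction of answer sentences whose words all
    appear somewhere in the retrieved contexts. Production evaluators
    use an LLM judge instead of lexical overlap."""
    context_words = set(" ".join(contexts).lower().split())
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    supported = 0
    for sentence in sentences:
        words = set(sentence.lower().split())
        if words <= context_words:  # every word is grounded in the contexts
            supported += 1
    return supported / len(sentences)

contexts = ["paris is the capital of france"]
print(toy_faithfulness_score("paris is the capital of france", contexts))  # 1.0
print(toy_faithfulness_score("paris is the capital of spain", contexts))   # 0.0
```

An unsupported sentence ("spain") drags the score down, which mirrors how a judge-based faithfulness metric penalizes hallucinated claims.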
Evaluating Query + Response Relevancy
- The `RelevancyEvaluator` evaluates whether the retrieved context and the answer are relevant and consistent for the given query.
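A relevancy check can likewise be sketched with a toy heuristic. Again, this term-overlap score is only an illustration of the idea (does the response address the query?); the actual evaluators delegate that judgment to an LLM:

```python
def toy_relevancy_score(query: str, answer: str, contexts: list[str]) -> float:
    """Illustrative only: fraction of query terms that appear in the
    combined answer + retrieved contexts. Production relevancy
    evaluators use an LLM judge, not term overlap."""
    query_terms = set(query.lower().split())
    response_terms = set((answer + " " + " ".join(contexts)).lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & response_terms) / len(query_terms)

print(toy_relevancy_score(
    "capital of france",
    "paris is the capital of france",
    ["the capital city of france is paris"],
))  # 1.0
```

A response that ignores the query shares few terms with it and scores near zero, the same failure mode a judge-based relevancy metric is designed to flag.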
This post is licensed under CC BY 4.0 by the author.