
Evaluating LLM Responses

In this session, we cover the evaluations that are useful for reducing hallucination and improving the retrieval quality of LLM applications.
Nov 2023

LLMs should be considered hallucinatory until proven otherwise! Many of us have turned to augmenting LLMs with a knowledge store (such as Zilliz) to solve this problem. But this retrieval-augmented generation (RAG) setup can still hallucinate, for example when it retrieves irrelevant context or too little context.

TruLens is built to solve this problem. TruLens sits as the evaluation layer for the LLM stack, allowing you to shorten the feedback loop and iterate on your LLM app faster. We'll also talk about the different metrics you can use for evaluation and why you should consider LLM-based evals when building your app.
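To make the idea of an evaluation metric concrete, here is a minimal sketch of two checks aimed at the failure modes described above: whether the retrieved context is relevant to the question, and whether the answer is supported by that context. It uses plain token overlap as a stand-in scorer. This is not how TruLens implements its feedback functions; the function names `context_relevance` and `groundedness` and the example strings are hypothetical and chosen only for illustration. An LLM-based eval would keep the same shape (inputs in, score between 0 and 1 out) but replace the overlap computation with a model call.

```python
import re


def _tokens(text: str) -> set:
    """Lowercase word tokens, dropping words of two characters or fewer."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 2}


def context_relevance(question: str, context: str) -> float:
    """Fraction of question tokens that also appear in the retrieved context.

    Toy lexical proxy (assumption): a low score suggests irrelevant retrieval.
    """
    q = _tokens(question)
    if not q:
        return 0.0
    return len(q & _tokens(context)) / len(q)


def groundedness(answer: str, context: str) -> float:
    """Fraction of answer tokens that also appear in the retrieved context.

    Toy lexical proxy (assumption): a low score suggests the answer is not
    supported by the context and may be hallucinated.
    """
    a = _tokens(answer)
    if not a:
        return 0.0
    return len(a & _tokens(context)) / len(a)


# Hypothetical example data, not taken from the session.
question = "What is the capital of France?"
good_context = "Paris is the capital of France."
bad_context = "Bananas are rich in potassium."
answer = "The capital of France is Paris."

scores = {
    "relevant_context": context_relevance(question, good_context),
    "irrelevant_context": context_relevance(question, bad_context),
    "grounded_answer": groundedness(answer, good_context),
    "ungrounded_answer": groundedness(answer, bad_context),
}
print(scores)
```

Token overlap is cheap and deterministic, but it misses paraphrases and can reward answers that merely echo the context, which is one reason to consider LLM-based evals for real applications.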

Key Takeaways:

  • Learn about common failure modes for LLM apps
  • Learn the different evaluations that are useful for reducing hallucination, improving retrieval quality, and more
  • Learn how to evaluate LLM apps with TruLens

Additional Resources

TruLens Documentation

TruLens GitHub

The prompts used for LLM-based feedback functions are available in TruLens' open-source GitHub repository.

[SKILL TRACK] AI Fundamentals

[COURSE] Working with the OpenAI API

[TUTORIAL] How to Build LLM Applications with LangChain
