

Retrieval is more than pure chance

Coldstarting RAG Evaluations

Without a method to evaluate the quality of your RAG application, you might as well be leaving its performance to pure chance. In this article, we'll walk through a simple example to demonstrate how easy it is to get started.

We'll start by using Instructor to generate synthetic data. We'll then chunk and embed some Paul Graham essays using LanceDB. Next, we'll showcase two useful metrics for tracking the performance of our retrieval, before concluding with some interesting ways to iteratively generate harder evaluation datasets.

Most importantly, the code used in this article is available in the /code/synthethic-evals folder. We've also included some Paul Graham essays in the same folder for easy use.

Let's start by installing the necessary libraries:

pip install instructor openai scikit-learn rich lancedb tqdm

Generating Evaluation Data

Set your OPENAI_API_KEY

Before proceeding with the rest of this tutorial, make sure to set your OPENAI_API_KEY inside your shell. You can do so using the command

export OPENAI_API_KEY=<api key>

Given a text chunk, we can use Instructor to generate a corresponding question from the content of that chunk. This means that when we make a query using that question, our text chunk should ideally be the first source returned by our retrieval algorithm.

We can represent this desired result using a simple Pydantic BaseModel.

Defining a Data Model

from pydantic import BaseModel, Field


class QuestionAnswerPair(BaseModel):
    """
    This model represents a question generated from a text chunk, its
    corresponding answer, and the chain of thought leading to the answer.
    The chain of thought provides insight into how the answer was derived
    from the question.
    """

    chain_of_thought: str = Field(
        ...,
        description="The reasoning process leading to the answer.",
        exclude=True,  # Excluded when serializing, so it never leaks into the dataset.
    )
    question: str = Field(
        ..., description="The generated question from the text chunk."
    )
    answer: str = Field(..., description="The answer to the generated question.")
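
With the model defined, generating a question-answer pair for a chunk is a single structured-output call. Here's a minimal sketch: the generate_question helper, the model name, and the sample chunk are illustrative choices, and it assumes a version of Instructor that exposes instructor.from_openai.

import instructor
from openai import OpenAI

# Patch the OpenAI client so responses are parsed and validated
# against our Pydantic model.
client = instructor.from_openai(OpenAI())


def generate_question(chunk: str) -> QuestionAnswerPair:
    # Ask the model for a question that the chunk itself answers.
    return client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any tool-calling capable model works here
        response_model=QuestionAnswerPair,
        messages=[
            {
                "role": "system",
                "content": (
                    "Generate a question that is answered by the following "
                    "text chunk, along with the answer to that question."
                ),
            },
            {"role": "user", "content": chunk},
        ],
    )


pair = generate_question(
    "Before college the two main things I worked on, "
    "outside of school, were writing and programming."
)
print(pair.question, pair.answer)

Because chain_of_thought is declared with exclude=True, it is dropped when you serialize the pair with pair.model_dump(), so the model gets room to reason while your evaluation dataset stays clean.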

