Skip to content

LLM

Retrieval is more than pure chance

Coldstarting Rag Evaluations

Without a method to evaluate the quality of your RAG application, we might as well be leaving its performance to pure chance. In this article, we'll walk you through a simple example to demonstrate how easy it is to get started.

We'll start by using Instructor to generate synthethic data. We'll then chunk and embed some Paul Graham Essays using lancedb. Next, we'll showcase two useful metrics that we can use to track the performance of our retrieval before concluding with some interesting improvements to iteratively generate harder evaluation datasets.

Most importantly, the code used in this article is avaliable inside the /code/synthethic-evals folder. We've also included some Paul Graham essays in the same folder for easy use.

Let's start by first installing the necessary libraries

pip install instructor openai scikit-learn rich lancedb tqdm

Generating Evaluation Data

Set your OPENAI_API_KEY

Before proceeding with the rest of this tutorial, make sure to set your OPENAI_API_KEY inside your shell. You can do so using the command

>> export OPENAI_API_KEY=<api key>

Given a text-chunk, we can use Instructor to generate a corresponding question using the content of the question. This means that when we make a query using that question, our text chunk is ideally going to be the first source returned by our retrieval algorithm.

We can represent this desired result using a simple pydantic BaseModel.

Defining a Data Model

class QuestionAnswerPair(BaseModel):
    """
    This model represents a pair of a question generated from a text chunk, its corresponding answer,
    and the chain of thought leading to the answer. The chain of thought provides insight into how the answer
    was derived from the question.
    """

    chain_of_thought: str = Field(
        ..., description="The reasoning process leading to the answer.", exclude=True
    )
    question: str = Field(
        ..., description="The generated question from the text chunk."
    )
    answer: str = Field(..., description="The answer to the generated question.")

Unraveling the History of Technological Skepticism

Technological advancements have always been met with a mix of skepticism and fear. From the telephone disrupting face-to-face communication to calculators diminishing mental arithmetic skills, each new technology has faced resistance. Even the written word was once believed to weaken human memory.

Technology Perceived Threat
Telephone Disrupting face-to-face communication
Calculators Diminishing mental arithmetic skills
Typewriter Degrading writing quality
Printing Press Threatening manual script work
Written Word Weakening human memory