Retrieval-augmented generation (RAG) enhances AI models by integrating external knowledge. However, traditional RAG pipelines often fragment documents into isolated chunks, losing crucial context and reducing retrieval accuracy.
Anthropic's contextual retrieval addresses this by adding concise, context-rich explanations to each document chunk before embedding. This significantly reduces retrieval errors, leading to improved downstream task performance. This article details contextual retrieval and its implementation.
Traditional RAG methods divide documents into smaller chunks for easier retrieval, but this can eliminate essential context. For instance, a chunk might state "Its more than 3.85 million inhabitants make it the European Union's most populous city" without specifying the city. This lack of context hinders accuracy.
Contextual retrieval solves this by prepending a short, context-specific summary to each chunk before embedding. The previous example would become:
<code>contextualized_chunk = """Berlin is the capital and largest city of Germany, known for being the EU's most populous city within its limits. Its more than 3.85 million inhabitants make it the European Union's most populous city, as measured by population within city limits. """</code>
Anthropic's internal testing across diverse datasets (codebases, scientific papers, fiction) shows that contextual retrieval, combining contextual embeddings with contextual BM25, reduces the retrieval failure rate by up to 49%.
This section outlines a step-by-step implementation using a sample document:
<code># Input text for the knowledge base
input_text = """Berlin is the capital and largest city of Germany, both by area and by population. Its more than 3.85 million inhabitants make it the European Union's most populous city, as measured by population within city limits. The city is also one of the states of Germany and is the third smallest state in the country in terms of area. Paris is the capital and most populous city of France. It is situated along the Seine River in the north-central part of the country. The city has a population of over 2.1 million residents within its administrative limits, making it one of Europe's major population centers."""</code>
Step 1: Chunk Creation
Divide the document into smaller, independent chunks (here, sentences):
<code># Splitting the input text into smaller chunks (one sentence per chunk)
test_chunks = [
    'Berlin is the capital and largest city of Germany, both by area and by population.',
    "\n\nIts more than 3.85 million inhabitants make it the European Union's most populous city, as measured by population within city limits.",
    '\n\nThe city is also one of the states of Germany and is the third smallest state in the country in terms of area.',
    '\n\nParis is the capital and most populous city of France.',
    '\n\nIt is situated along the Seine River in the north-central part of the country.',
    "\n\nThe city has a population of over 2.1 million residents within its administrative limits, making it one of Europe's major population centers."
]</code>
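For a real document you would normally not hand-write the chunks. As a minimal sketch, LangChain's RecursiveCharacterTextSplitter can produce them instead; the chunk size, overlap, and separators below are illustrative assumptions, not values from the original article:
<code>from langchain_text_splitters import RecursiveCharacterTextSplitter

# Illustrative splitter settings -- tune chunk_size/chunk_overlap for your data
splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=0,
    separators=["\n\n", ". ", " "],
)
test_chunks = splitter.split_text(input_text)</code>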
Step 2: Prompt Template Definition
Define the prompt for context generation (Anthropic's template is used):
<code>from langchain.prompts import ChatPromptTemplate, PromptTemplate, HumanMessagePromptTemplate

# Define the prompt for generating contextual information
anthropic_contextual_retrieval_system_prompt = """<document>
{WHOLE_DOCUMENT}
</document>
Here is the chunk we want to situate within the whole document
<chunk>
{CHUNK_CONTENT}
</chunk>
Please give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk. Answer only with the succinct context and nothing else."""
# ... (rest of the prompt template code remains the same)</code>
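The elided portion wraps this prompt string into the chat prompt object that Step 4 pipes into the LLM. One plausible way to build <code>anthropic_contextual_retrieval_final_prompt</code> is sketched below; this construction is an assumption for illustration, not the article's original code:
<code># Sketch: wrap the prompt string into a ChatPromptTemplate (assumed construction)
anthropic_contextual_retrieval_final_prompt = ChatPromptTemplate.from_messages([
    HumanMessagePromptTemplate(
        prompt=PromptTemplate(
            input_variables=["WHOLE_DOCUMENT", "CHUNK_CONTENT"],
            template=anthropic_contextual_retrieval_system_prompt,
        )
    )
])</code>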
Step 3: LLM Initialization
Choose an LLM (OpenAI's GPT-4o is used here):
<code>import os
from langchain_openai import ChatOpenAI

# Set the OpenAI API key
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

# Initialize the model instance
llm_model_instance = ChatOpenAI(
    model="gpt-4o",
)</code>
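Since the technique comes from Anthropic, a Claude model works just as well for generating the contexts. A minimal sketch using the langchain-anthropic integration is shown below; the model name is illustrative and an ANTHROPIC_API_KEY environment variable is assumed:
<code>from langchain_anthropic import ChatAnthropic

# Alternative: use a Claude model instead of GPT-4o (model name is illustrative)
# Assumes ANTHROPIC_API_KEY is set in the environment
llm_model_instance = ChatAnthropic(
    model="claude-3-5-sonnet-latest",
)</code>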
Step 4: Chain Creation
Connect the prompt and LLM:
<code>from langchain_core.output_parsers import StrOutputParser

# Chain the prompt with the model instance and parse the output to a string
contextual_chunk_creation = anthropic_contextual_retrieval_final_prompt | llm_model_instance | StrOutputParser()</code>
Step 5: Chunk Processing
Generate context for each chunk:
<code># Process each chunk and generate contextual information
for test_chunk in test_chunks:
    res = contextual_chunk_creation.invoke({
        "WHOLE_DOCUMENT": input_text,
        "CHUNK_CONTENT": test_chunk
    })
    print(res)
    print('-----')</code>
Running the loop prints a short context for each chunk, situating it within the full document.
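To actually use these contexts, each generated summary is prepended to its chunk before indexing. The sketch below is an illustration (not part of the original article): it collects the contextualized chunks and builds a simple lexical index over them with the rank_bm25 package, which is one way to approximate the "contextual BM25" component; the embedding index would be built over the same contextualized strings.
<code>from rank_bm25 import BM25Okapi

# Prepend each generated context to its chunk (illustrative helper, not from the article)
contextualized_chunks = []
for test_chunk in test_chunks:
    context = contextual_chunk_creation.invoke({
        "WHOLE_DOCUMENT": input_text,
        "CHUNK_CONTENT": test_chunk
    })
    contextualized_chunks.append(f"{context} {test_chunk.strip()}")

# Build a BM25 index over the contextualized chunks ("contextual BM25")
tokenized_corpus = [chunk.lower().split() for chunk in contextualized_chunks]
bm25_index = BM25Okapi(tokenized_corpus)

# Lexical retrieval: score a query against the contextualized chunks
query = "How many people live in the EU's most populous city?"
scores = bm25_index.get_scores(query.lower().split())</code>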
Reranking further refines retrieval by prioritizing the most relevant chunks, which improves accuracy and can reduce downstream costs, since fewer chunks need to be passed to the model. In Anthropic's tests, adding reranking on top of contextual embeddings and contextual BM25 cut the retrieval failure rate from 5.7% to 1.9%, a 67% improvement.
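The original article does not include reranking code, and Anthropic used a dedicated reranking API; as a minimal illustration only, a cross-encoder from the sentence-transformers library can rescore the top candidates against the query:
<code>from sentence_transformers import CrossEncoder

# Rescore the top candidates from hybrid retrieval with a cross-encoder (model name is illustrative)
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How many people live in the EU's most populous city?"
candidates = contextualized_chunks  # in practice: the top-N chunks from embedding + BM25 retrieval
pairs = [(query, chunk) for chunk in candidates]
rerank_scores = reranker.predict(pairs)

# Keep the highest-scoring chunks for the final prompt
ranked = sorted(zip(rerank_scores, candidates), key=lambda p: p[0], reverse=True)
top_chunks = [chunk for _, chunk in ranked][:3]</code>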
For smaller knowledge bases (<200,000 tokens), including the entire knowledge base directly in the prompt might be more efficient than using retrieval systems. Also, utilizing prompt caching (available with Claude) can significantly reduce costs and improve response times.
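A minimal sketch of that approach with the Anthropic Python SDK is shown below, assuming the cache_control content-block parameter for prompt caching; the model name and question are illustrative:
<code>import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Cache the whole knowledge base in the system prompt so repeated queries can reuse it
response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=512,
    system=[
        {
            "type": "text",
            "text": f"Answer questions using this knowledge base:\n\n{input_text}",
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Which is the EU's most populous city?"}],
)
print(response.content[0].text)</code>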
Anthropic's contextual retrieval offers a straightforward yet powerful way to improve RAG systems. The combination of contextual embeddings, contextual BM25, and reranking substantially improves retrieval accuracy. Further exploration of other retrieval techniques is also worthwhile.