Enhancing RAG: Beyond Vanilla Approaches
Retrieval-Augmented Generation (RAG) significantly boosts language models by integrating external information retrieval. Standard RAG, while improving response relevance, often falters in complex retrieval situations. This article examines the shortcomings of basic RAG and presents advanced methods to improve accuracy and efficiency.
Limitations of Basic RAG
Consider a simple scenario: retrieving relevant information from several documents. Our dataset includes:
- A primary document detailing healthy, productive lifestyle practices.
- Two unrelated documents containing some overlapping keywords, but in different contexts.
<code>main_document_text = """
Morning Routine (5:30 AM - 9:00 AM)
✅ Wake Up Early - Aim for 6-8 hours of sleep to feel well-rested.
✅ Hydrate First - Drink a glass of water to rehydrate your body.
✅ Morning Stretch or Light Exercise - Do 5-10 minutes of stretching or a short workout to activate your body.
✅ Mindfulness or Meditation - Spend 5-10 minutes practicing mindfulness or deep breathing.
✅ Healthy Breakfast - Eat a balanced meal with protein, healthy fats, and fiber.
✅ Plan Your Day - Set goals, review your schedule, and prioritize tasks.
...
"""</code>
A basic RAG system, when queried with:
- How can I improve my health and productivity?
- What are the best strategies for a healthy and productive lifestyle?
may struggle to consistently retrieve the primary document due to the presence of similar words in unrelated documents.
Helper Functions: Streamlining the RAG Pipeline
To improve retrieval accuracy and simplify query processing, we introduce helper functions. These functions handle tasks such as querying the ChatGPT API, calculating document embeddings, and determining similarity scores. This creates a more efficient RAG pipeline.
Here are the helper functions:
<code># Imports
import os
import json
import openai
import numpy as np
from scipy.spatial.distance import cosine
from google.colab import userdata

# Set up the OpenAI API key and client
os.environ["OPENAI_API_KEY"] = userdata.get('AiTeam')
client = openai.OpenAI()  # used by the helper functions below</code>
<code>def query_chatgpt(prompt, model="gpt-4o", response_format=openai.NOT_GIVEN):
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.0,  # Adjust for more or less creativity
            response_format=response_format
        )
        return response.choices[0].message.content.strip()
    except Exception as e:
        return f"Error: {e}"</code>
<code>def get_embedding(text, model="text-embedding-3-large"):  # alternative: "text-embedding-ada-002"
    """Fetches the embedding for a given text using OpenAI's API."""
    response = client.embeddings.create(
        input=[text],
        model=model
    )
    return response.data[0].embedding</code>
<code>def compute_similarity_metrics(embed1, embed2):
    """Computes the cosine similarity between two embeddings."""
    cosine_sim = 1 - cosine(embed1, embed2)  # Cosine similarity
    return cosine_sim</code>
<code>def fetch_similar_docs(query, docs, threshold=0.55, top=1):
    """Returns the documents whose embeddings are most similar to the query."""
    query_em = get_embedding(query)
    data = []
    for d in docs:
        # Compute the similarity between the document and the query
        similarity_results = compute_similarity_metrics(d["embedding"], query_em)
        if similarity_results >= threshold:
            data.append({"id": d["id"], "ref_doc": d.get("ref_doc", ""), "score": similarity_results})
    # Sort by score in descending order and keep the top matches
    sorted_data = sorted(data, key=lambda x: x["score"], reverse=True)
    sorted_data = sorted_data[:min(top, len(sorted_data))]
    return sorted_data</code>
Evaluating Basic RAG
We test the basic RAG using predefined queries to assess its ability to retrieve the most relevant document based on semantic similarity. This highlights its limitations.
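The test below assumes a <code>docs</code> list in which every entry already carries a precomputed embedding. The setup code for that list is not shown in the article; a minimal sketch, assuming this structure and using placeholder text for the two unrelated documents, might look like this:
<code># Hypothetical setup for the docs collection used by fetch_similar_docs.
# The two "unrelated" texts are placeholders standing in for the real distractor documents.
unrelated_text_1 = "Team productivity tips for software projects ..."
unrelated_text_2 = "Healthy recipes and meal planning ideas ..."

raw_docs = [
    {"id": "main", "ref_doc": main_document_text},
    {"id": "unrelated_1", "ref_doc": unrelated_text_1},
    {"id": "unrelated_2", "ref_doc": unrelated_text_2},
]

# Precompute one embedding per document so retrieval only embeds the query at run time.
docs = [{**d, "embedding": get_embedding(d["ref_doc"])} for d in raw_docs]</code>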
<code>"""# **Testing Vanilla RAG**""" query = "what should I do to stay healthy and productive?" r = fetch_similar_docs(query, docs) print("query = ", query) print("documents = ", r) query = "what are the best practices to stay healthy and productive ?" r = fetch_similar_docs(query, docs) print("query = ", query) print("documents = ", r)</code>
Advanced Techniques for Enhanced RAG
To improve the retrieval process, we introduce functions that generate structured information to enhance document retrieval and query processing.
Three key enhancements are implemented:
1. Generating FAQs
Creating FAQs from the document expands the query matching possibilities. These FAQs are generated once and stored, enriching the search space without recurring costs.
<code>def generate_faq(text):
    prompt = f'''
    given the following text: """{text}"""
    Ask relevant simple atomic questions ONLY (don't answer them) to cover all subjects covered by the text.
    Return the result as a json list example [q1, q2, q3...]
    '''
    return query_chatgpt(prompt, response_format={"type": "json_object"})</code>
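The article does not show how the generated questions are folded back into the collection. One plausible sketch, assuming the model returns its questions as a JSON object wrapping a list, is to embed each question as its own entry that points back to the source document:
<code># Hypothetical enrichment step: index each generated FAQ question as its own entry.
faq_json = generate_faq(main_document_text)
faq_questions = json.loads(faq_json)

# The exact JSON shape depends on the model; handle both a bare list and a wrapped object.
if isinstance(faq_questions, dict):
    faq_questions = next(iter(faq_questions.values()))

faq_docs = [
    {"id": f"main_faq_{i}", "ref_doc": main_document_text, "embedding": get_embedding(q)}
    for i, q in enumerate(faq_questions)
]</code>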
2. Creating an Overview
A concise summary captures the document's core ideas, improving retrieval effectiveness. The overview's embedding is added to the document collection.
<code>def generate_overview(text):
    prompt = f'''
    given the following text: """{text}"""
    Generate an abstract for it that tells in maximum 3 lines what is it about and use high level terms that will capture the main points,
    Use terms and words that will be most likely used by average person.
    '''
    return query_chatgpt(prompt)</code>
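Likewise, the overview can be embedded and stored as one more entry pointing back to the full document. A minimal sketch, building on the hypothetical <code>docs</code> and <code>faq_docs</code> lists above:
<code># Hypothetical enrichment step: index the overview alongside the original documents and FAQs.
overview_text = generate_overview(main_document_text)
overview_doc = {"id": "main_overview", "ref_doc": main_document_text, "embedding": get_embedding(overview_text)}

# The enriched collection now holds the raw documents, the FAQ entries, and the overview.
enriched_docs = docs + faq_docs + [overview_doc]</code>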
3. Query Decomposition
Broad queries are broken down into smaller, more precise sub-queries. These sub-queries are compared against the enhanced document collection (original document, FAQs, and overview). Results are merged for improved relevance.
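The decomposition step itself is not shown in the article; a simple version can reuse <code>query_chatgpt</code> to ask the model for sub-queries, as in the sketch below (the function name and prompt wording are assumptions):
<code>def decompose_query(query):
    """Hypothetical helper: asks the model to split a broad query into atomic sub-queries."""
    prompt = f'''
    given the following query: """{query}"""
    Break it down into simple atomic sub-questions, each covering one aspect of the query.
    Return the result as a json object of the form {{"queries": ["q1", "q2", ...]}}
    '''
    try:
        result = query_chatgpt(prompt, response_format={"type": "json_object"})
        return json.loads(result).get("queries", [query])
    except Exception:
        # Fall back to the original query if the response cannot be parsed.
        return [query]</code>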
Evaluating the Enhanced RAG
Re-running the initial queries with these enhancements shows significant improvement. Query decomposition generates multiple sub-queries, leading to successful retrieval from both the FAQs and the original document.
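A plausible way to re-run the evaluation over the enriched collection, using the sketches above, is to decompose each query, retrieve per sub-query, then merge and deduplicate the hits by document id:
<code># Hypothetical end-to-end check of the enhanced pipeline.
for query in [
    "what should I do to stay healthy and productive?",
    "what are the best practices to stay healthy and productive ?",
]:
    sub_queries = decompose_query(query)
    merged = {}
    for sq in sub_queries:
        for hit in fetch_similar_docs(sq, enriched_docs, top=2):
            # Keep the best score seen for each underlying document.
            if hit["id"] not in merged or hit["score"] > merged[hit["id"]]["score"]:
                merged[hit["id"]] = hit
    results = sorted(merged.values(), key=lambda x: x["score"], reverse=True)
    print("query = ", query)
    print("documents = ", results)</code>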
Cost-Benefit Analysis
While preprocessing (generating FAQs, overviews, and embeddings) adds an upfront cost, it is paid only once per document. In return, it avoids the ongoing costs of a poorly optimized RAG system: frustrated users and higher query costs from retrieving irrelevant information. For high-volume systems, preprocessing is a worthwhile investment.
Conclusion
Combining document preprocessing (FAQs and overviews) with query decomposition creates a more intelligent RAG system that balances accuracy and cost-effectiveness. This enhances retrieval quality, reduces irrelevant results, and improves the user experience. Future research can explore further optimizations like dynamic thresholding and reinforcement learning for query refinement.