Building a Robust RAG App using Portkey & MongoDB in 10 Minutes

Introduction

Retrieval-Augmented Generation (RAG) has become a go-to technique for enhancing LLMs with external data. In this tutorial, we'll build a production-ready RAG application in just 10 minutes using Portkey, MongoDB Atlas, Patronus AI, and LlamaIndex.

But first, what is RAG?

Retrieval-Augmented Generation (RAG) is a technique that enhances LLMs by supplementing their knowledge with external data. While LLMs can reason about a wide range of topics, their knowledge is limited to the data they were trained on up to a specific point in time.

RAG addresses this by retrieving relevant information from external sources and integrating it into the model's input, allowing AI applications to leverage both the LLM's reasoning capabilities and specific, up-to-date information for more accurate and relevant responses.

Our tools of choice:

  • Portkey: An end-to-end LLM Ops platform that accelerates teams from POC to production 10x faster. It provides:
    - An AI Gateway for managing LLM requests
    - A comprehensive observability suite
    - Robust guardrails for responsible AI development
  • MongoDB Atlas: A developer data platform offering cloud database and data services with native vector search capabilities. It provides:
    - A flexible and intuitive document model
    - Automated deployments and simple configuration changes
    - Continuous feature improvements
    - Native vector search capabilities
  • LlamaIndex: A data framework that simplifies the ingestion and indexing of custom data for LLMs.
  • Patronus AI: An evaluation platform to score and benchmark LLM performance on real-world scenarios, generate adversarial test cases at scale, monitor hallucinations and other unexpected and unsafe behavior, and more.
    Patronus excels in industry-specific guardrails for RAG workflows. Portkey integrates with multiple Patronus evaluators to help you enforce LLM behavior.

Setting up the Environment

Let's walk through the process of building a basic RAG app using these tools:

1. Set up MongoDB Atlas:

MongoDB offers a forever-free Atlas cluster in the public cloud. Follow this tutorial to set up a MongoDB account and create a new cluster, or register here.

2. Set up Portkey:

Install Portkey using pip:

pip install portkey-ai

Sign up for a Portkey account (Dev Plan is free forever) and get your API key.

Building the RAG Pipeline

1. Data Preparation

We'll use McDonald's SEC 10-K filing as our dataset. First, download the PDF, then use LlamaIndex's SimpleDirectoryReader to load and parse it:

💡
For demonstration purposes, we'll focus on McDonald's SEC 10-K filing, but the same approach can be applied to a broader set of documents.
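
Here's a minimal loading sketch, assuming the filing has been saved locally; data/mcdonalds_10k.pdf is a placeholder path for wherever you downloaded the PDF:

from llama_index.core import SimpleDirectoryReader

# Load and parse the downloaded 10-K PDF into LlamaIndex documents
documents = SimpleDirectoryReader(
    input_files=["data/mcdonalds_10k.pdf"]  # placeholder path
).load_data()
print(f"Loaded {len(documents)} document(s)")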

2. Configuring the Document Store

After you've set up a MongoDB cluster, connect to your instance and create a document store. By default, this generates a database named 'db_docstore'.
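
The snippet below is a sketch of that step, assuming the documents loaded earlier, the llama-index-storage-docstore-mongodb package, and an ATLAS_URI connection string taken from your cluster's "Connect" dialog:

import os

from llama_index.core.node_parser import SentenceSplitter
from llama_index.storage.docstore.mongodb import MongoDocumentStore

# Atlas connection string, e.g. "mongodb+srv://..." (assumed env var)
ATLAS_URI = os.environ["ATLAS_URI"]

# Split the loaded documents into nodes before storing them
nodes = SentenceSplitter().get_nodes_from_documents(documents)

# The default 'db_docstore' database is created on first write
docstore = MongoDocumentStore.from_uri(uri=ATLAS_URI)
docstore.add_documents(nodes)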

This is what your document looks like in MongoDB Atlas:

3. Configuring Portkey

Using the Portkey API key from earlier, configure the LlamaIndex LLM to route all requests through Portkey's AI Gateway:

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

# Point the LlamaIndex OpenAI client at Portkey's AI Gateway
llm = OpenAI(
    model="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="openai",
        api_key="YOUR_PORTKEY_API_KEY",
    ),
)

# Route all LlamaIndex LLM calls through the gateway-backed LLM
Settings.llm = llm

4. Querying the Retrieval System

Let's run some user queries against our newly constructed RAG system. First, you need to load a vector store index from the MongoDB documents:
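
A minimal sketch, assuming the nodes and ATLAS_URI from earlier, the llama-index-vector-stores-mongodb package, and an Atlas Vector Search index already created on the collection; the database, collection, and index names below are placeholders:

import pymongo
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

# Store embeddings in Atlas alongside the document store
mongodb_client = pymongo.MongoClient(ATLAS_URI)
vector_store = MongoDBAtlasVectorSearch(
    mongodb_client,
    db_name="db_docstore",             # placeholder database name
    collection_name="vectors",         # placeholder collection name
    vector_index_name="vector_index",  # placeholder Atlas Search index
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Embeds the nodes (using the default embedding model, which needs
# OPENAI_API_KEY set) and writes the vectors to MongoDB Atlas
index = VectorStoreIndex(nodes, storage_context=storage_context)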

Now you can run queries against your RAG system:
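
For example, with an illustrative question about the filing:

# Build a query engine over the index and ask a question
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query(
    "What were McDonald's total revenues in the latest fiscal year?"
)
print(response)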


The Challenges with RAG Apps

RAG has revolutionized how we interact with LLMs. However, RAG apps come with their own set of challenges:

1. Hallucinations: LLMs can sometimes generate false or inconsistent information, especially when the retrieved context is insufficient or irrelevant.

2. Lack of Observability: Without proper tracing and monitoring, it's difficult to understand how the RAG pipeline is performing and where improvements are needed.

Enhancing the RAG App with Portkey Guardrails

Portkey integrates with leading AI guardrail platforms like Patronus AI to give you full-stack observability over your RAG apps and to help you detect and tackle hallucinations.

1) On the Portkey "Integrations" page, add your Patronus API key, and select the "Retrieval Answer Relevance" guardrail from the Guardrail options.

2) Configure your Portkey config to use the above Patronus guardrail:

# Portkey config: retries, simple caching, and the Patronus guardrail
portkey_config = {
    "retry": {
        "attempts": 3
    },
    "cache": {
        "mode": "simple"
    },
    # Virtual key referencing the OpenAI credentials stored in Portkey
    "virtual_key": "openai_key_1",
    # Run the Patronus evaluator on every response
    "after_request_hooks": [{
        "id": "guardrail_id_patronus"
    }]
}

3) Update the LlamaIndex LLM with this configuration:

llm = OpenAI(
    model="gpt-3.5-turbo",
    api_key="YOUR_OPENAI_API_KEY",
    api_base=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="openai",
        api_key="YOUR_PORTKEY_API_KEY",
        # Pass the config dict itself, not a string literal, so the
        # retry, cache, and guardrail settings take effect
        config=portkey_config,
    ),
)

Leveraging Observability

All your requests, along with their guardrail verdicts and the actions taken, are now available on the Portkey dashboard.

Portkey's Dashboard

The guardrail verdict for Retrieval Answer Relevance will also be available for each of your requests:

Guardrail verdicts across your requests will help you significantly improve your RAG app's performance!

Next Steps for Improvement

  1. Implement robust error handling: Use Portkey's gateway features, such as fallbacks, retries, and load balancing, to mitigate failure scenarios, and implement graceful fallbacks for when the database is unavailable (see the config sketch after this list).
  2. Optimize performance: Experiment with different embedding models for better vector representations, and implement caching strategies for frequent queries.
  3. Consider scaling strategies: Implement connection pooling for MongoDB, and use asynchronous programming to handle multiple requests.
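
As a starting point for the first item, here is a sketch of a Portkey config with a fallback strategy; the virtual keys below are placeholders for provider credentials stored in your Portkey account:

# Fall back to a second provider if the primary request fails
fallback_config = {
    "strategy": {
        "mode": "fallback"
    },
    "targets": [
        {"virtual_key": "openai_key_1"},     # primary (placeholder)
        {"virtual_key": "anthropic_key_1"}   # fallback (placeholder)
    ],
    "retry": {
        "attempts": 3
    }
}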

Conclusion

In this tutorial, we've built a robust RAG application using Portkey, MongoDB Atlas, Patronus AI, and LlamaIndex in just 10 minutes. By leveraging Portkey's AI Gateway, observability suite, and guardrails, we've created a production-ready system that can reliably answer questions based on custom data.

The combination of Portkey's LLM management capabilities and MongoDB's flexible document model with vector search provides a powerful foundation for building advanced AI applications.

Ready to take your RAG applications to the next level? Join our community of AI practitioners to share experiences, get support, and stay updated on the latest developments in RAG.