DSPy
Integrate DSPy with Portkey for production-ready LLM pipelines
DSPy is a framework for algorithmically optimizing language model prompts and weights.
Portkey’s integration with DSPy makes your DSPy pipelines production-ready with detailed insights on costs & performance metrics for each run, and also makes your existing DSPy code work across 250+ LLMs.
Getting Started
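Installation
Install the DSPy and Portkey Python packages, published on PyPI as dspy-ai and portkey-ai, using pip.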
Setting up
Portkey extends the existing OpenAI client in DSPy so that it works with 250+ LLMs and gives you detailed cost insights. Just change api_base and add the Portkey-related headers in the default_headers param.
Grab your Portkey API key from the Portkey dashboard.
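A minimal sketch of this setup, assuming a DSPy release that still ships the dspy.OpenAI client described above (newer releases configure models differently) and Portkey's x-portkey-* request headers. The helper names portkey_headers and build_lm, the gpt-4o model, and the key arguments are illustrative assumptions. Portkey's Python SDK also provides a createHeaders helper that produces equivalent headers.

```python
# Portkey's hosted, OpenAI-compatible gateway endpoint.
PORTKEY_GATEWAY_URL = "https://api.portkey.ai/v1"


def portkey_headers(portkey_api_key: str, provider: str = "openai") -> dict:
    """Build the Portkey headers passed as default_headers (hypothetical helper)."""
    return {
        "x-portkey-api-key": portkey_api_key,  # authenticates you with Portkey
        "x-portkey-provider": provider,        # tells Portkey which LLM provider to call
    }


def build_lm(openai_api_key: str, portkey_api_key: str):
    """Create a DSPy OpenAI client whose traffic is routed through Portkey."""
    # Imported lazily so the header helper works without DSPy installed.
    # Assumption: the installed DSPy version exposes dspy.OpenAI and
    # accepts api_base and default_headers, as described in the text above.
    import dspy

    return dspy.OpenAI(
        model="gpt-4o",                      # example model, not prescribed by the docs
        api_base=PORTKEY_GATEWAY_URL + "/",  # send requests to Portkey instead of OpenAI
        api_key=openai_api_key,
        model_type="chat",
        default_headers=portkey_headers(portkey_api_key),
    )
```

Calling build_lm with your two keys returns a client that you can hand to DSPy in the usual way.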
🎉 Voilà! That’s all you need to do to integrate Portkey with DSPy. Let’s try making our first request.
Let’s make your first request
Here’s a simple Google Colab notebook that demonstrates DSPy with Portkey integration
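As a rough sketch of what a first request can look like outside the notebook, the function below runs a single question through a DSPy Predict module. It assumes that lm is a DSPy language model client already configured to go through Portkey. The name first_request and the question -> answer signature are illustrative choices, not taken from the notebook.

```python
# DSPy inline signature: one input field and one output field.
SIGNATURE = "question -> answer"


def first_request(lm, question: str) -> str:
    """Send one question through DSPy and return the model's answer."""
    import dspy  # imported lazily; requires DSPy to be installed

    dspy.settings.configure(lm=lm)  # make the Portkey-backed client the default LM
    qa = dspy.Predict(SIGNATURE)    # simplest DSPy module: a single LLM call
    return qa(question=question).answer
```

Each call to first_request should then show up as a request in the Portkey dashboard, unless DSPy serves it from its cache.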
When you make a request using Portkey with DSPy, you can view detailed information about the request in the Portkey dashboard. Here’s what you’ll see:
- Request Details: Information about the specific request, including the model used, input, and output.
- Metrics: Performance metrics such as latency, token usage, and cost.
- Logs: Detailed logs of the request, including any errors or warnings.
- Traces: A visual representation of the request flow, especially useful for complex DSPy modules.
Portkey Features with DSPy
1. Interoperability
Portkey’s Unified API enables you to easily switch between 250+ language models, including LLMs that are not natively integrated with DSPy. To switch from GPT-4 to Claude, for example, you change the model and provider settings in your DSPy setup.
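A small sketch of that switch, assuming the header-based setup described earlier. Only the model name and the x-portkey-provider header change between the two variants. The helper provider_settings is hypothetical, and the model names are example values.

```python
def provider_settings(provider: str, model: str, portkey_api_key: str) -> dict:
    """Return the DSPy client arguments that differ per provider (hypothetical helper)."""
    return {
        "model": model,
        "default_headers": {
            "x-portkey-api-key": portkey_api_key,
            "x-portkey-provider": provider,
        },
    }


# Same code path, two different providers.
gpt4 = provider_settings("openai", "gpt-4", "YOUR_PORTKEY_API_KEY")
claude = provider_settings("anthropic", "claude-3-opus-20240229", "YOUR_PORTKEY_API_KEY")

# Assumed usage with the DSPy OpenAI client (not executed here):
#   dspy.OpenAI(api_base="https://api.portkey.ai/v1/", api_key=..., model_type="chat", **claude)
```

The rest of the DSPy program (signatures, modules, optimizers) stays unchanged.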
2. Logs and Traces
Portkey provides detailed tracing for each request. This is especially useful for complex DSPy modules with multiple LLM calls. You can view these traces in the Portkey dashboard to understand the flow of your DSPy application.
3. Metrics
Portkey’s Observability suite helps you track key metrics like cost and token usage, which matters because DSPy pipelines and optimizers can make many LLM calls per run. The observability dashboard tracks 40+ key metrics, giving you detailed insights into each DSPy run.
4. Caching
Caching can significantly reduce LLM costs by storing and reusing responses to frequent requests. While DSPy has simple built-in caching, Portkey also offers advanced semantic caching to help you save more time and money.
Just add the cache settings to your Portkey config and pass it with the config key in the default_headers param.
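A minimal sketch of such a config, assuming Portkey's cache setting with mode set to semantic and the x-portkey-config header, which carries the config as a JSON string. The helper name cache_headers is hypothetical.

```python
import json


def cache_headers(portkey_api_key: str, mode: str = "semantic") -> dict:
    """Build Portkey headers that enable caching (hypothetical helper)."""
    # Portkey config with caching switched on; "simple" is the other cache mode.
    config = {"cache": {"mode": mode}}
    return {
        "x-portkey-api-key": portkey_api_key,
        "x-portkey-config": json.dumps(config),  # config travels as a JSON string
    }


headers = cache_headers("YOUR_PORTKEY_API_KEY")
# Assumed usage: pass `headers` (plus a provider or virtual key header)
# as default_headers when creating the DSPy client.
```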
5. Reliability
Portkey offers built-in fallbacks between different LLMs or providers, load balancing across multiple instances or API keys, automatic retries, and request timeouts. This makes your DSPy pipelines more reliable and resilient.
As in the caching example above, just define your config and pass it with the config key in the default_headers param.
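One possible reliability config, sketched under the assumption that Portkey configs use the strategy, targets, retry, and request_timeout keys. The virtual key names and the fallback model are placeholders, and reliability_headers is a hypothetical helper.

```python
import json

# Example config: try the first target, fall back to the second on failure,
# retry failed requests, and time out requests that take too long.
RELIABILITY_CONFIG = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "OPENAI_VIRTUAL_KEY"},  # placeholder virtual key
        {
            "virtual_key": "ANTHROPIC_VIRTUAL_KEY",  # placeholder virtual key
            # The fallback provider needs its own model name.
            "override_params": {"model": "claude-3-opus-20240229"},
        },
    ],
    "retry": {"attempts": 3},
    "request_timeout": 10000,  # milliseconds
}


def reliability_headers(portkey_api_key: str, config: dict) -> dict:
    """Serialize a Portkey config into request headers (hypothetical helper)."""
    return {
        "x-portkey-api-key": portkey_api_key,
        "x-portkey-config": json.dumps(config),
    }


headers = reliability_headers("YOUR_PORTKEY_API_KEY", RELIABILITY_CONFIG)
```

For load balancing, the same structure should apply with the strategy mode set to loadbalance and a weight on each target.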
6. Virtual Keys
Securely store your LLM API keys in the Portkey vault and get a disposable virtual key with custom budget limits.
Add your API key in the Portkey UI to get a virtual key, then pass that virtual key with your requests.
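A short sketch of passing a virtual key, assuming Portkey's x-portkey-virtual-key header. The helper name virtual_key_headers and the key values are placeholders.

```python
def virtual_key_headers(portkey_api_key: str, virtual_key: str) -> dict:
    """Build Portkey headers that reference a stored provider key (hypothetical helper)."""
    return {
        "x-portkey-api-key": portkey_api_key,
        # The virtual key stands in for the real provider API key kept in the
        # Portkey vault, so no provider header is set here.
        "x-portkey-virtual-key": virtual_key,
    }


headers = virtual_key_headers("YOUR_PORTKEY_API_KEY", "YOUR_VIRTUAL_KEY")
# Assumed usage: pass `headers` as default_headers when creating the DSPy client.
```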
Advanced Examples
Retrieval-Augmented Generation (RAG) system
Make your RAG prompts better with Portkey x DSPy
Troubleshoot - Missing LLM Calls in Traces
DSPy uses caching for LLM calls by default, which means repeated identical requests won’t generate new API calls or new traces in Portkey. To ensure you capture every LLM call, follow these steps:
- Disable Caching: For full tracing during debugging, turn off DSPy’s caching. Check the DSPy documentation for detailed instructions on how to disable caching.
- Use Unique Inputs: To test effectively, make sure each run uses different inputs to avoid triggering the cache.
- Clear the Cache: If you need to test the same inputs again, clear DSPy’s cache between runs to ensure fresh API requests.
- Verify Configuration: Confirm that your DSPy setup is correctly configured to use the intended LLM provider.
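The "Use Unique Inputs" step can be sketched with a small helper that tags each input with a random run ID, so that no two runs produce an identical prompt. The helper name with_cache_buster and the tag format are illustrative. Because the tag slightly changes the prompt, this is meant for debugging only.

```python
import uuid


def with_cache_buster(question: str) -> str:
    """Append a random tag so the input never matches a previously cached one."""
    run_id = uuid.uuid4().hex[:8]  # short random identifier, new on every call
    return f"{question} [run-id: {run_id}]"


# Two runs of the same question now produce different inputs,
# so each one should result in a fresh API call and a new trace.
first = with_cache_buster("What is DSPy?")
second = with_cache_buster("What is DSPy?")
```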
If you still face issues after following these steps, please reach out to our support team for additional help.
Remember to manage caching wisely in production to strike the right balance between thorough tracing and performance efficiency.