Bring Your Own LLM
Integrate your privately hosted LLMs with Portkey for unified management, observability, and reliability.
Portkey’s Bring Your Own LLM feature allows you to seamlessly integrate privately hosted language models into your AI infrastructure. This powerful capability enables unified management of both private and commercial LLMs through a consistent interface while leveraging Portkey’s comprehensive suite of observability and reliability features.
Key Benefits
- Unified API Access: Manage private and commercial LLMs through a single, consistent interface
- Enhanced Reliability: Leverage Portkey’s fallbacks, retries, and load balancing for your private deployments
- Comprehensive Monitoring: Track performance, usage, and costs alongside your commercial LLM usage
- Simplified Access Control: Manage team-specific permissions and usage limits
- Secure Credential Management: Protect sensitive authentication details through Portkey’s secure vault
Integration Options
Prerequisites
Your private LLM must implement an API specification compatible with one of Portkey’s supported providers (e.g., OpenAI’s `/chat/completions`, Anthropic’s `/messages`).
Portkey offers two primary methods to integrate your private LLMs:
- Using Virtual Keys: Store your deployment details securely in Portkey’s vault
- Direct Integration: Pass deployment details in your requests without storing them
Option 1: Using Virtual Keys
Step 1: Add Your Deployment Details
Navigate to the Virtual Keys section in your Portkey dashboard and create a new Virtual Key.
Adding a private LLM as a Virtual Key
- Click “Add Key” and enable the “Local/Privately hosted provider” toggle
- Configure your deployment:
  - Select the matching provider API specification (typically `OpenAI`)
  - Enter your model’s base URL in the `Custom Host` field
  - Add required authentication headers and their values
- Click “Create” to generate your virtual key
Step 2: Use Your Virtual Key in Requests
After creating your virtual key, you can use it in your applications:
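The snippet below is a minimal sketch using the Portkey Python SDK; the virtual key slug and model name are placeholders for your own values.

```python
from portkey_ai import Portkey

# Initialize the client with your Portkey API key and the virtual key
# created in Step 1 (both values below are placeholders)
portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="PRIVATE_LLM_VIRTUAL_KEY",
)

# The request is routed through Portkey's gateway to your private deployment
response = portkey.chat.completions.create(
    model="your-model-name",  # the model served by your deployment
    messages=[{"role": "user", "content": "Hello from my private LLM!"}],
)

print(response.choices[0].message.content)
```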
Option 2: Direct Integration Without Virtual Keys
If you prefer not to store your private LLM details in Portkey’s vault, you can pass them directly in your API requests:
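The snippet below is a minimal sketch using the Python SDK, assuming an OpenAI-compatible deployment; the base URL and bearer token are placeholders.

```python
from portkey_ai import Portkey

# Deployment details are passed with the client instead of being stored in Portkey
portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="openai",                              # the API spec your deployment implements
    custom_host="https://your-llm.example.com/v1",  # placeholder base URL, including the version path
    Authorization="Bearer YOUR_PRIVATE_LLM_TOKEN",  # sent to your deployment as the Authorization header
)

response = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
)
```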
The `custom_host` must include the API version path (e.g., `/v1/`). Portkey will automatically append the endpoint path (`/chat/completions`, `/completions`, or `/embeddings`).
Securely Forwarding Sensitive Headers
For headers containing sensitive information that shouldn’t be logged or processed by Portkey, use the `forward_headers` parameter to pass them directly to your private LLM:
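The snippet below is a sketch using the Python SDK; `x_api_key` and `x_organization_id` are placeholder header names (see the SDK naming notes that follow).

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="openai",
    custom_host="https://your-llm.example.com/v1",  # placeholder base URL
    # Sensitive values to send to your deployment (placeholder names and values)
    x_api_key="YOUR_PRIVATE_API_KEY",
    x_organization_id="YOUR_ORG_ID",
    # Headers listed here are forwarded to your LLM without being logged by Portkey
    forward_headers=["x_api_key", "x_organization_id"],
)
```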
In the JavaScript SDK, convert header names to camelCase. For example, `X-My-Custom-Header` becomes `xMyCustomHeader`.
In the Python SDK, convert header names to snake_case. For example, `X-My-Custom-Header` becomes `x_my_custom_header`.
Using Forward Headers in Gateway Configs
You can also specify `forward_headers` in your Gateway Config for consistent header forwarding:
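For illustration, the config below is a sketch expressed as a Python dict passed to the client’s `config` parameter; the host and header names are placeholders.

```python
from portkey_ai import Portkey

# Gateway Config with forward_headers on the target (placeholder values throughout)
config = {
    "strategy": {"mode": "single"},
    "targets": [
        {
            "provider": "openai",
            "custom_host": "https://your-llm.example.com/v1",
            "forward_headers": ["X-API-Key", "X-Organization-ID"],
        }
    ],
}

portkey = Portkey(api_key="PORTKEY_API_KEY", config=config)
```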
Advanced Features
Using Private LLMs with Gateway Configs
Private LLMs work seamlessly with all Portkey Gateway features. Some common use cases:
- Load Balancing: Distribute traffic across multiple private LLM instances
- Fallbacks: Set up automatic failover between private and commercial LLMs (see the sketch below)
- Conditional Routing: Route requests to different LLMs based on metadata
Learn more about Gateway Configs.
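As one example, the sketch below defines a fallback config that tries a private deployment first and fails over to a commercial provider; both virtual key slugs are placeholders.

```python
from portkey_ai import Portkey

# Fallback: try the private deployment first, then a commercial provider
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "private-llm-virtual-key"},  # placeholder slug
        {"virtual_key": "openai-virtual-key"},       # placeholder slug
    ],
}

portkey = Portkey(api_key="PORTKEY_API_KEY", config=config)

response = portkey.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
)
```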
Monitoring and Analytics
Portkey provides comprehensive observability for your private LLM deployments, just like it does for commercial providers:
- Log Analysis: View detailed request and response logs
- Performance Metrics: Track latency, token usage, and error rates
- User Attribution: Associate requests with specific users via metadata (see the sketch below)
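As a sketch, the snippet below attaches metadata to a request with the Python SDK’s `with_options` helper; the metadata keys and values are placeholders.

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="PRIVATE_LLM_VIRTUAL_KEY",  # placeholder
)

# Metadata attached here appears alongside the request in Portkey's analytics
response = portkey.with_options(
    metadata={"_user": "user_123", "team": "search"}  # placeholder keys/values
).chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
)
```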
Portkey Analytics Dashboard for Private LLMs
Troubleshooting
| Issue | Possible Causes | Solutions |
|---|---|---|
| Connection Errors | Incorrect URL, network issues, firewall rules | Verify URL format, check network connectivity, confirm firewall allows traffic |
| Authentication Failures | Invalid credentials, incorrect header format | Check credentials, ensure headers are correctly formatted and forwarded |
| Timeout Errors | LLM server overloaded, request too complex | Adjust timeout settings, implement load balancing, simplify requests |
| Inconsistent Responses | Different model versions, configuration differences | Standardize model versions, document expected behavior differences |
Next Steps
Explore these related resources to get the most out of your private LLM integration: