Executive Summary
- Guardrails GA Release: Production-ready guardrails that enforce LLM behavior in real time, with support for PII detection, content moderation, and more, are now generally available. (Docs)
- Enterprise Momentum: Refreshed Portkey's enterprise offering with enhanced security features and support for AWS Assume Role authentication. Also onboarded one of the world's largest tech companies to Portkey.
- Provider Ecosystem: Added 7 new providers, including vLLM, Triton, Lambda Labs, and more.
- Image Generation: Added support for Stable Diffusion v3 and Google Imagen.
- Integrations: Added MindsDB, ToolJet, LibreChat, and OpenWebUI.
- Prompt Caching: Anthropic's prompt caching feature is now available directly in the prompt playground. (Docs)
- .NET Support: You can now integrate Portkey with your .NET apps.
- Agent Tooling Leadership: Portkey was recognized for providing 11 critical capabilities for production-grade AI agents, leading the Agent Ops tooling benchmark.
- Featured Coverage: Our DevOps for AI vision was featured in the People+AI Newsletter and the Pulse2 publication.
Features
- AWS Assume Role Support: Enhanced Bedrock authentication for enterprise security. (Docs)
- User Management API: New API to resend user invites (Docs). Also updated the API specs for the Prompt Completions API, Prompt Render API, and Insert Log API.
- New OpenAI Param: OpenAI's `max_completion_tokens` is now supported.
- Caching: Improved cost calculations for OpenAI and Azure OpenAI cached responses, and Anthropic's prompt caching feature is now available directly in the prompt playground.
- Gemini Updates: Added support for Gemini JSON mode and Controlled Generations, along with Pydantic support.
- Bedrock: Integrated the Converse API for `/chat/completions`. (Docs)
- Enterprise: Refreshed Portkey's enterprise offering with enhanced security features.
- C# (.NET) Support: You can now integrate Portkey in your .NET apps using the official OpenAI library. (Docs)
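To make the new parameter concrete, here is a minimal sketch of an OpenAI-style chat-completions request body that uses `max_completion_tokens` in place of the legacy `max_tokens` field. The helper function is purely illustrative and not part of any SDK.

```python
def build_chat_payload(model: str, messages: list, max_completion_tokens: int) -> dict:
    """Build an OpenAI-style chat-completions request body.

    Newer OpenAI models expect `max_completion_tokens` rather than the
    legacy `max_tokens` parameter, so only the new field is emitted.
    """
    return {
        "model": model,
        "messages": messages,
        "max_completion_tokens": max_completion_tokens,
    }

payload = build_chat_payload(
    "gpt-4o",
    [{"role": "user", "content": "Say hello"}],
    max_completion_tokens=64,
)
print(payload["max_completion_tokens"])  # → 64
```

The same body can be sent through the gateway exactly as it would go to OpenAI directly, since Portkey keeps the OpenAI request shape.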
Models & Providers
7 New Providers: Expanding your model hosting and deployment options.
2 Image Generation Models: Strengthening our multimodal capabilities with next-gen image models.

Stable Diffusion v3
Now available across Stability AI, Fireworks, AWS Bedrock, and Segmind

Imagen on Google Vertex
Official support for Google's Imagen model through Vertex AI

Llama 3.2
Now integrated with Fireworks, AWS Bedrock, Groq, and Together AI

Vertex Embeddings
Added support for both English and Multilingual embedding models from Google Vertex AI

Integrations
Model Management & Monitoring: Enhance your AI infrastructure with enterprise-grade observability.

LibreChat
You can now track costs per user on your LibreChat instance by forwarding unique user IDs from LibreChat to Portkey - thanks to Tim’s contribution!
OpenWebUI
Portkey is the only plugin you’ll need for model management, cost tracking, observability, metadata logging, and more for your Open WebUI instance.
MindsDB
Connect your databases, vector stores, and apps to 250+ LLMs with enterprise-grade monitoring and reliability built-in.
ToolJet
Add AI-powered capabilities such as chat completions and automations into your ToolJet apps easily.
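Per-user cost tracking, as in the LibreChat integration above, works by forwarding a user identifier as request metadata. As a sketch, assuming Portkey's documented `x-portkey-metadata` header and its reserved `_user` key (check the Portkey docs for the authoritative shape), the header could be built like this:

```python
import json

def metadata_header(user_id: str, **extra: str) -> dict:
    """Build an x-portkey-metadata header for per-user cost tracking.

    Portkey treats the `_user` metadata key as the user identifier,
    which is how integrations attribute costs to individual users.
    Any extra key-value pairs ride along as custom metadata.
    """
    meta = {"_user": user_id, **extra}
    return {"x-portkey-metadata": json.dumps(meta)}

headers = metadata_header("user-42", environment="prod")
print(headers["x-portkey-metadata"])
```

Attach these headers to any request routed through Portkey and the user ID shows up against each logged call.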
Guardrails
The guardrails feature is now generally available. It brings production-ready content filtering and response validation to your LLM apps.

Updated Content Safety Guardrails:

PII Detection
Detect sensitive personal information in user messages
Content Moderation
Automated content filtering and moderation
Language Detection
Automatically detect and validate response languages
Gibberish Detection
Filter out nonsensical or low-quality responses
Custom Webhooks
Metadata sent to the Portkey API will now be automatically forwarded to your custom webhook endpoint.
Lowercase Detection
Check whether a given string is entirely lowercase.
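To give a feel for what two of these checks do, here are simple local sketches of a PII check and the lowercase check. Portkey runs its guardrails server-side, so this is only an illustration of the idea, not Portkey's implementation (real PII detection covers far more than email addresses).

```python
import re

# A single, simple PII signal: an email-address pattern.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def detect_pii(text: str) -> bool:
    """Flag text that contains an email address."""
    return bool(EMAIL_RE.search(text))

def is_lowercase(text: str) -> bool:
    """Pass only if the string contains no uppercase characters."""
    return text == text.lower()

print(detect_pii("contact me at jane@example.com"))  # → True
print(is_lowercase("all quiet here"))                # → True
```

In Portkey, a failed check like either of these can block the request or response, or simply be logged, depending on how the guardrail is configured.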
Resources
Quick Implementation Guides:
- Guide to Prompt Caching: Learn how to optimize your LLM costs
- Production Apps with Vercel: Learn how to build prod-ready apps using Vercel AI SDK
- OpenAI Swarm Cheat Sheet: Learn how OpenAI’s new Swarm framework really works
OpenAI Swarm + Portkey
Build and secure multi-agent AI systems using OpenAI Swarm and Portkey
RAG with Observability
Enhanced version of Anthropic’s RAG Cookbook with unified API and monitoring
- Automated Prompt Engineering: Scale your prompt engineering workflow
- OpenAI’s Prompt Caching: Optimize costs and performance
- Complete Prompt Engineering Guide: Best practices and patterns
- OpenTelemetry Guide: Real-time observability for AI systems
Fixes
Model & Provider Enhancements

Fixed core provider issues and improved reliability:
- Enhanced streaming transformer for Perplexity
- Fixed response transformation for Ollama
- ⭐️ Added missing logprob mapping for Azure OpenAI (Thanks Avishkar!)
- Fixed token counting for Vertex embeddings (now using tokens instead of characters)
- Added support for Bedrock cross-region model IDs with pricing
- Fixed media file handling for Vertex AI & Gemini
- Fireworks: `accounts/fireworks/models/llama-v3p1-405b-instruct`
- Together AI: `meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo`
- Gemini: `gemini-1.5-pro`
- Added support for `anthropic-beta` and `anthropic-version` headers in the Portkey API
- In the Portkey SDK, the Portkey API key is now optional when you're calling the self-hosted Gateway
- Enhanced support for custom provider headers in the SDK
Community Updates
Upcoming Events

LLMs in Prod Dinner Singapore
Join top tech leaders for a closed-door dinner around OpenAI Dev Day. Register here
- Our DevOps for AI vision was featured in the People+AI Newsletter and Pulse2 publication.
- Portkey was recognized for providing 11 critical capabilities for production-grade AI agents.