Kicking off 2025 with major releases! 🎉

January marks a milestone for Portkey with our first industry report: we analyzed over 2 trillion tokens flowing through Portkey to uncover how teams are running LLMs in production.

We’re also expanding our platform capabilities with advanced PII redaction, JWT authentication, comprehensive audit logs, a unified files & batches API, and support for private LLMs. The latest models, including DeepSeek R1, OpenAI’s o3-mini, and Gemini’s thinking model, are also integrated with Portkey.

Plus, we are attending the AI Engineer Summit in New York in February, and hosting in-person meetups in Mumbai & NYC.

Let’s dive in!

Summary

Benchmark
  • Released the LLMs in Prod 2025 Report, analyzing 2T+ tokens
  • Key finding: multi-LLM deployment is now standard
  • Average prompt size up 4x, with up to 40% cost savings from caching

Security
  • Advanced PII redaction with automatic standardized identifiers
  • JWT authentication support for enterprise deployments
  • Comprehensive audit logs for all critical actions
  • Enforced metadata schemas for better governance
  • Default configs & metadata attachable to API keys
  • Granular workspace management controls

Platform
  • Unified API for files & batches across major providers
  • Support for private LLM deployments
  • Enhanced virtual keys with granular controls

New Models
  • DeepSeek R1 available across 7+ providers
  • Gemini thinking model added
  • Support for Perplexity Sonar models
  • o3-mini integration

Integrations
  • AWS Bedrock Guardrails support
  • Milvus DB & Replicate integrations
  • Expanded Open WebUI support
  • Guardrails for embedding requests

Community
  • A deep dive into MCP and event-driven architecture for agentic systems

LLMs in Prod 2025 Report

Our comprehensive analysis of 2T+ tokens processed through Portkey’s Gateway reveals fascinating insights into how teams deploy LLMs in production. Here are the key findings:

Multi-LLM is the New Normal

Despite OpenAI’s dominance (over 50% of production traffic), teams are actively adopting multi-LLM strategies for reliability and for specialized use cases.

Prompts are Getting Complex

Average prompt size has increased 4x over the last year, pointing to more sophisticated prompting techniques and more complex workloads.

Caching is Critical

Proper caching strategies deliver up to 40% cost savings, making caching a must-have for production deployments.
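
If you’re not caching yet, turning it on in Portkey is a config-level change. Here’s a quick sketch using the portkey-ai Python SDK; the virtual key is a placeholder, and the exact cache fields are worth checking against the docs:

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # placeholder virtual key
    config={
        # "simple" caches exact-match requests; "semantic" also matches
        # near-identical prompts. max_age is the cache TTL in seconds.
        "cache": {"mode": "semantic", "max_age": 3600}
    },
)

# Repeated or near-identical prompts are served from cache.
response = portkey.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
)
print(response.choices[0].message.content)
```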

Read the full LLMs in Prod 2025 Report →


Platform

Advanced PII Redaction

We’ve significantly enhanced Portkey’s Guardrails with request mutation capabilities.

When sensitive data (such as an email address, phone number, or SSN) is detected in a user request, our PII redaction automatically replaces it with standardized identifiers before the request reaches the LLM. This works seamlessly across our entire guardrails ecosystem, including AWS Bedrock Guardrails, Patronus AI, Promptfoo, Pangea, and more.
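
Here’s a sketch of what attaching a redaction guardrail looks like with the portkey-ai SDK. The guardrail ID below is a placeholder for one created in the Portkey UI with PII detection and redaction enabled, and the config field names assume the current guardrails config shape:

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # placeholder virtual key
    # "pii-guardrail-xxx" is a placeholder guardrail ID from the UI.
    config={"input_guardrails": ["pii-guardrail-xxx"]},
)

# The email address below is replaced with a standardized identifier
# before the request ever reaches the LLM.
response = portkey.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Draft a reply to jane.doe@acme.com about ticket #4521.",
    }],
)
```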

Unified Files & Batches API

Managing file uploads and batch processing across multiple LLM providers is now dramatically simpler. Instead of building provider-specific integrations, you can:

  • Upload once, use everywhere: test your data across different foundation models
  • Run A/B tests seamlessly across providers: choose between native provider batching or Portkey’s custom batch API
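
Here’s what that looks like in practice: a quick sketch assuming the portkey-ai SDK and Portkey’s OpenAI-compatible files/batches shape. The virtual key and file name are placeholders:

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="bedrock-virtual-key",  # placeholder; any supported provider
)

# Upload once; the same file works across providers behind the gateway.
batch_file = portkey.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

# Kick off a batch against the provider behind the virtual key.
batch = portkey.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```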

Integrate Private LLMs

You can now add your privately hosted LLMs to Portkey’s virtual keys. Simply:

  • Configure your model’s base URL
  • Set required authentication headers
  • Start routing requests through our unified API

This means you can use your private deployments alongside commercial providers, with the same monitoring, reliability, and management features.
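
Here’s the same flow sketched in code, assuming an OpenAI-compatible private deployment. The URL, model name, and config fields below are placeholders to check against the docs:

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    config={
        "provider": "openai",  # your deployment speaks an OpenAI-compatible API
        "custom_host": "https://llm.internal.example.com/v1",  # placeholder URL
        # Forward the auth headers your private deployment expects.
        "forward_headers": ["authorization"],
    },
)

response = portkey.chat.completions.create(
    model="llama-3-70b-instruct",  # placeholder model name on your host
    messages=[{"role": "user", "content": "ping"}],
)
```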

API Keys with Default Configs & Metadata

You can now attach a default Portkey Config and Metadata to any API key you create.

  • Automatically monitor how a service or user consumes the Portkey API by enforcing metadata
  • Apply Guardrails to requests automatically by adding them to a Config and attaching that Config to the key
  • Set default fallbacks for outgoing requests
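
Once defaults are attached, client code stays minimal because the key itself carries the config and metadata. A quick sketch, assuming a key created in the dashboard with defaults attached:

```python
from portkey_ai import Portkey

# Placeholder key, created in the dashboard with a default config
# (fallbacks, guardrails) and metadata attached.
portkey = Portkey(api_key="SERVICE_TEAM_API_KEY")

# No config passed here: the fallbacks, guardrails, and metadata
# attached to the key apply automatically.
response = portkey.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
```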

Enterprise

Running AI at scale requires robust security, visibility, and control. This month, we’ve launched a comprehensive set of enterprise features to enable that:

Authentication & Access Control

  • JWT Authentication: Secure API access with JWT tokens, with support for JWKS URLs and custom claims validation (see the validation sketch below)
  • Workspace Management: Control workspace access and who can view logs or create API keys from the Admin dashboard
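
To make the JWKS flow concrete, here is a generic sketch of what validating a JWT against a JWKS URL involves, using the PyJWT library. The URL and claims are placeholders, and this illustrates the mechanism rather than Portkey’s internal implementation:

```python
import jwt
from jwt import PyJWKClient

JWKS_URL = "https://auth.example.com/.well-known/jwks.json"  # placeholder

def validate_token(token: str) -> dict:
    # Fetch the signing key matching the token's "kid" header from the JWKS.
    jwks_client = PyJWKClient(JWKS_URL)
    signing_key = jwks_client.get_signing_key_from_jwt(token)

    # Verify the signature, expiry, and any custom claims you enforce.
    claims = jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience="portkey",  # placeholder custom-claim check
    )
    return claims
```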

Governance & Compliance

  • Metadata Schemas: Enforce standardized request metadata across teams, crucial for governance and cost allocation (see the example after this list)
  • Audit Logging: Track every critical action across both the Portkey app and Admin API, with detailed user attribution
  • Security Settings: Expanded settings for managing log visibility and API key creation
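
With an enforced schema in place, every request must carry the required metadata fields. A sketch of sending metadata with the portkey-ai SDK, assuming the client accepts metadata at construction; the custom field names are placeholders for whatever your schema enforces, while _user is Portkey’s reserved user-attribution key:

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # placeholder virtual key
    # Requests missing fields required by the workspace schema can be rejected.
    metadata={
        "_user": "user_123",      # reserved key for user attribution
        "team": "ml-platform",    # placeholder custom field
        "cost_center": "cc-4821", # placeholder custom field
    },
)

response = portkey.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
```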

Customer Love

After evaluating 17 different platforms, this AI team replaced 2+ years of homegrown tooling with Portkey Prompts.

They were able to do this because of three things:

  • They could build reusable prompts with our partial templates
  • Our versioning let them confidently roll out changes
  • And they didn’t have to refactor anything thanks to our OpenAI-compatible APIs

Integrations

Models & Providers

DeepSeek R1

Access DeepSeek’s latest reasoning model through multiple providers: the direct DeepSeek API, Fireworks AI, Together AI, OpenRouter, Groq, AWS Bedrock, Azure AI Inference, and more.
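
Switching providers is just a virtual-key swap. A sketch, assuming a Fireworks AI virtual key; the model slug varies by provider and is a placeholder here:

```python
from portkey_ai import Portkey

# Point the same code at any provider hosting R1 by swapping the virtual key.
portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="fireworks-virtual-key",  # or Together, Groq, Bedrock, ...
)

response = portkey.chat.completions.create(
    model="accounts/fireworks/models/deepseek-r1",  # provider-specific slug
    messages=[{"role": "user", "content": "How many r's are in 'strawberry'?"}],
)
print(response.choices[0].message.content)
```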

Libraries & Tools

Guardrails

Inverse Guardrails

All eligible checks now have an Inverse option in the UI, which flips the result: the Guardrail returns a TRUE verdict when the underlying check fails.

AWS Bedrock Guardrails

Native support for AWS Bedrock’s guardrail capabilities

Promptfoo

Enhanced prompt testing capabilities with Promptfoo’s suite of evals

Guardrails on Embedding Requests

Portkey Guardrails now work on your embedding input requests!
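
A quick sketch of what this looks like, assuming an input guardrail attached via config (the guardrail and virtual key IDs are placeholders):

```python
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="openai-virtual-key",  # placeholder virtual key
    # The same guardrail now also screens embedding inputs.
    config={"input_guardrails": ["pii-guardrail-xxx"]},
)

embedding = portkey.embeddings.create(
    model="text-embedding-3-small",
    input=["Customer note: reach me at jane.doe@acme.com"],
)
```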

Community

We are attending the AI Engineer Summit in NYC this February and have some extra event passes to share! Reach out to us on Discord to ask for a pass.

We are also hosting small meetups in NYC and Mumbai this month to meet with local engineering leaders and ML/AI platform leads.

Resources

EDA for Agents

Last month we hosted an inspiring AI practitioners meetup with Ojasvi Yadav and Anudeep Yegireddi to discuss the role of Event-Driven Architecture in building multi-agent systems with MCP.

Read the event report here →

Improvements

  • Token counting is now more accurate for Anthropic streams
  • Added logprobs for Vertex AI
  • Improved usage object mapping for Perplexity
  • Fixed some tricky cache behaviors
  • Error handling is more robust across all SDKs

Support