This was an eventful month at Portkey. We hosted and joined incredible events, from community gatherings like LLMs in Prod in San Francisco, to enterprise conversations with Palo Alto Networks and PG&E. A highlight was seeing Syngenta’s teams use Portkey in their Devcon 2025 hackathon, experimenting with new AI workflows. Each of these moments reinforced how fast enterprise AI is evolving, and how important it is to build responsibly at scale. Alongside the events, we shipped some of our most anticipated product updates yet, including the MCP Gateway and a series of new guardrails, routing improvements, and provider integrations, all designed to make it easier for teams to run AI in production with security, governance, and flexibility built in.

Summary

| Area | Key Highlights |
| --- | --- |
| Platform | • MCP Gateway beta • Unified token counting endpoint |
| Guardrails | • Metadata-based model access guardrail • Regex Replace Guardrail |
| Gateway & Providers | • Unified finish_reason • Conditional routing enhancement • Provider updates and improvements |
| Community & Events | • Palo Alto Networks webinar • LLMs in Prod, San Francisco • PG&E Immersion Day • Syngenta Devcon 2025 • MCP Salon |

Platform

MCP Gateway (beta)

Enterprises can now bring MCP servers and tools into production with governance, observability, and access controls built-in. The Gateway helps teams avoid authentication sprawl, shadow tool usage, and fragmented monitoring—giving enterprises a central way to manage how agents interact with external tools.
Key benefits include:
  • Centralized MCP tool access
  • Unified authentication and credentials handling
  • Complete visibility into agent-tool interactions
  • Access controls and usage limits by team
👉 To try out the beta, send us an email or book a demo.

Unified token counting endpoint

Token counting is now unified across AWS Bedrock, Vertex AI, and Anthropic. With a single endpoint, you can:
  • Estimate token usage before sending a request (avoid context limit errors)
  • Standardize cost/usage logic across providers
  • Enforce routing and quota rules directly inside your app
This goes beyond dashboard reporting, so developers can make smarter runtime decisions with accurate, provider-agnostic token counts. Read more here: Unified Token Counting Endpoint.
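For example, a pre-flight check against the context window could look like the sketch below. This is a minimal sketch, not the documented contract: the endpoint path and `input_tokens` response field assume an Anthropic-style count-tokens shape, and the model name, provider slug, and context limit are placeholders; check the linked docs for the exact request format for your provider.

```python
import requests

PORTKEY_API_KEY = "PORTKEY_API_KEY"  # placeholder credential

# Assumption: the unified endpoint mirrors Anthropic's count-tokens route;
# see the Unified Token Counting Endpoint docs for the exact path and payload.
resp = requests.post(
    "https://api.portkey.ai/v1/messages/count_tokens",
    headers={
        "x-portkey-api-key": PORTKEY_API_KEY,
        "x-portkey-provider": "anthropic",  # or a Bedrock / Vertex AI provider slug
        "Content-Type": "application/json",
    },
    json={
        "model": "claude-sonnet-4-5",  # placeholder model name
        "messages": [{"role": "user", "content": "Summarize our Q3 incident report."}],
    },
)
tokens = resp.json().get("input_tokens")

# Pre-flight check: fail fast instead of hitting a context-limit error
CONTEXT_LIMIT = 200_000  # illustrative limit
if tokens is not None and tokens > CONTEXT_LIMIT:
    raise ValueError(f"Prompt is {tokens} tokens, over the {CONTEXT_LIMIT}-token limit")
```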

Guardrails

Metadata-based Model Access guardrail

We introduced a new guardrail that lets you restrict model access based on metadata key-value pairs at runtime. By evaluating metadata dynamically on every request, this guardrail provides granular, per-request governance, without requiring changes to your app logic.
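On the request side, the guardrail keys off the metadata your client already sends. Here is a minimal sketch, assuming a guardrail configured in the Portkey dashboard that only allows a model for a specific team; the provider slug, metadata keys, and model name are illustrative, not prescribed names.

```python
import json
from openai import OpenAI  # any OpenAI-compatible client works through the gateway

# Each request carries metadata that the guardrail evaluates at runtime.
# Assumption: a guardrail is configured to allow gpt-5 only when the
# "team" metadata key matches an approved value.
client = OpenAI(
    api_key="PORTKEY_API_KEY",  # placeholder
    base_url="https://api.portkey.ai/v1",
    default_headers={
        "x-portkey-provider": "@openai-prod",  # hypothetical provider slug
        "x-portkey-metadata": json.dumps({"_user": "alice", "team": "ml-platform"}),
    },
)

# If the metadata doesn't satisfy the guardrail's key-value rules,
# the gateway can deny the request before it ever reaches the provider.
resp = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "hello"}],
)
```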

Regex-replace guardrail

Our PII guardrail already covers common fields like emails, phone numbers, and credit cards. But many customers asked for a way to redact custom, org-specific patterns. With the Regex Replace Guardrail, you can now:
  • Define your own regex patterns
  • Replace matches with a chosen string (e.g., [masked_user])
  • Enforce masking rules at runtime, before data reaches the model
This is particularly useful for internal IDs, employee codes, or project references that shouldn’t leave your environment. Read more here.
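Before wiring a pattern into the guardrail, it's worth sanity-checking it locally; the same pattern and replacement string then go into the guardrail config. A quick sketch, using a hypothetical employee-code format as the org-specific pattern:

```python
import re

# Hypothetical org-specific pattern: employee codes like EMP-4821
pattern = r"\bEMP-\d{4}\b"
replacement = "[masked_user]"

prompt = "Ticket raised by EMP-4821; please summarize their last three requests."
masked = re.sub(pattern, replacement, prompt)

print(masked)
# -> "Ticket raised by [masked_user]; please summarize their last three requests."
```

The guardrail applies the same substitution at the gateway, so the raw identifier never reaches the model provider.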

Gateway and Providers

Unified finish_reason parameter

We standardized the finish_reason field across all providers. By default, values are mapped to OpenAI-compatible outputs, ensuring consistent handling across multi-provider deployments. If you prefer to keep the original provider-returned value, set x-portkey-strict-openai-compliance = false.
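For instance, a client that wants Anthropic's native stop reasons rather than the OpenAI-compatible mapping could set the header like this (a minimal sketch; the model name and provider slug are placeholders):

```python
from openai import OpenAI

client = OpenAI(
    api_key="PORTKEY_API_KEY",  # placeholder
    base_url="https://api.portkey.ai/v1",
    default_headers={
        # "false" passes through the provider's native finish_reason
        # instead of the default OpenAI-compatible mapping.
        "x-portkey-strict-openai-compliance": "false",
        "x-portkey-provider": "anthropic",
    },
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-5",  # placeholder model name
    messages=[{"role": "user", "content": "hi"}],
)
print(resp.choices[0].finish_reason)  # e.g. Anthropic's "end_turn" rather than "stop"
```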

Conditional router enhancement

Conditional routing now supports parameter-based routing in addition to metadata. Parameter-based routing enables dynamic, per-request optimizations, giving you better performance, cost efficiency, and control over user experience. Read more about this here.
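As a sketch, a parameter-based condition might send long-context requests to a different target than the default. The params.* query keys, provider slugs, and target names below are illustrative assumptions, not the documented syntax; see the docs linked above for the exact condition format.

```python
from portkey_ai import Portkey

# A minimal sketch of a parameter-based conditional router.
# Assumption: conditions can query request parameters (here, max_tokens)
# the same way metadata-based conditions query metadata.
config = {
    "strategy": {"mode": "conditional"},
    "conditions": [
        # Route long-context requests to a larger-context target
        {"query": {"params.max_tokens": {"$gt": 4096}}, "then": "large-context"},
    ],
    "default": "fast-cheap",
    "targets": [
        {"name": "large-context", "provider": "@anthropic-prod"},  # hypothetical slugs
        {"name": "fast-cheap", "provider": "@openai-prod"},
    ],
}

client = Portkey(api_key="PORTKEY_API_KEY", config=config)
resp = client.chat.completions.create(
    model="gpt-5-mini",  # placeholder model name
    max_tokens=8192,  # matches the condition above, so routes to "large-context"
    messages=[{"role": "user", "content": "Summarize this 50-page document..."}],
)
```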


New Models

Claude Sonnet 4.5

Anthropic’s latest model, now available via Portkey

GPT-5 Codex

OpenAI’s specialized coding model, supported with full observability

New Providers

Meshy

Specialized 3D generation and design workflows provider

Tripo3D

Next-generation 3D modeling and visualization

Cerebras

High-performance AI inference provider offering up to 70x faster speeds than GPU-based solutions

Nextbit256

New provider focused on efficient inference for specialized workloads

Improvements and Fixes

  • DashScope → Updated supported parameters
  • Vertex AI:
    • Added timeRangeFilter support for Google Search tool
    • Added support for Mistral models
    • Added support for task_type and dimensions parameters in Vertex AI batch embeddings
    • Handle empty responses returned by the provider
    • Support for global region
  • Fireworks → Better handling of non-ASCII characters in file uploads, plus removed unnecessary response transforms for faster performance
  • OpenAI & Azure OpenAI:
    • Added new parameters for GPT-5 compatibility
    • Updated the tokenizer to support streaming request token calculation for latest gpt-5 models
  • AWS Bedrock:
    • Added video support in chat completions
    • Added support for APAC cross region inference profiles
    • Added support for the performance_config parameter, which is passed through to the provider as performanceConfig (see the sketch after this list)
    • Support Inference Profiles when uploading files for batches & finetuning
    • KMS Support for file uploads to AWS
  • Azure Foundry and GitHub → Updated the parameter mapping to support all the latest OpenAI-compatible chat completions parameters
  • Adding new models to Azure OpenAI → Now much simpler. You just need to add the target URI, and the model will be fetched and added directly to your integration.
  • Azure Foundry → You can now enable multiple models in a single Azure Foundry integration. This simplifies management and makes it easier to experiment with different models under the same integration, without repetitive setup.
  • Added support for custom scopes in Entra auth, for use with deprecated Azure serverless models
  • Custom header support for OTel export of analytics data
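As referenced above, here's a minimal sketch of passing performance_config through the gateway. The {"latency": "optimized"} shape follows Bedrock's latency-optimized inference option, and the model ID and provider slug are placeholders; treat the exact parameter shape as an assumption to verify against the docs.

```python
from openai import OpenAI

client = OpenAI(
    api_key="PORTKEY_API_KEY",  # placeholder
    base_url="https://api.portkey.ai/v1",
    default_headers={"x-portkey-provider": "bedrock"},  # hypothetical slug
)

# performance_config is forwarded as-is to Bedrock as performanceConfig;
# extra_body is how the OpenAI SDK sends non-OpenAI parameters.
resp = client.chat.completions.create(
    model="us.anthropic.claude-3-5-haiku-20241022-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": "hello"}],
    extra_body={"performance_config": {"latency": "optimized"}},
)
```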
🚨 Vertex AI deprecation → Llama 3.1 + 3.2 models will be retired on Jan 15, 2026. Recommended migration: Llama 3.3 or Llama 4.

Community & Events

Protecting your AI platform with Palo Alto Networks and Portkey

We partnered with Palo Alto Networks for a joint webinar on securing enterprise AI platforms with guardrails + gateway. The session highlighted the most pressing AI security risks (prompt injections, data leakage, compliance gaps) and how combining a gateway with guardrails can address them without slowing down scale.

LLMs in Prod, San Francisco

PG&E Immersion Day

We had the privilege of joining Pacific Gas & Electric for their internal Immersion Day. It was a hands-on session with their teams, exploring how enterprises can adopt AI responsibly and scale it across critical operations.

Syngenta Devcon 2025

Our customer Syngenta hosted Devcon 2025, centered on AI. Syngenta’s teams ran a hackathon using Portkey + n8n, building creative workflows and experimenting with how MCP servers and digital apps can transform grower experiences. It was inspiring to see Portkey embedded directly into their innovation process, powering hands-on experimentation and ideation at scale.

MCP Salon

We hosted some of the most illustrious MCP builders for a closed-door roundtable. The group went deep into the technical challenges of building MCP servers and clients, serving them in production, and solving real-world adoption hurdles. 👉 To stay updated on upcoming events, subscribe to our event calendar.

Resources

Community Contributors

A special thanks to our contributor this month:

Coming this month!

Webinar: LibreChat in Production. Register here →

Support
