Area | Key Updates |
---|---|
Platform | • Prompt CRUD APIs • Export logs to your internal stack • Budget limits and rate limits on workspace • n8n integration • OpenAI Codex CLI integration • New retry setting to determine wait times • Milvus for Semantic Cache • Plugins moved to org-level Settings • Virtual Key exhaustion alert includes workspace • Workspace control setup option |
Gateway & Providers | • OpenAI embeddings latency improvement (200ms) • Responses API for OpenAI & Azure OpenAI • Bedrock prompt caching via unified API • Virtual keys for self-hosted models • Tool calling support for Groq, OpenRouter, and Ollama • New providers: Dashscope, Recraft AI, Replicate, Azure AI Foundry • Enhanced parameter support: OpenRouter, Vertex AI, Perplexity, Bedrock • Claude’s anthropic_beta parameter for Computer use beta |
Technical Improvements | • Unified caching/logging of thinking responses • Strict metadata logging: Workspace > API Key > Request • Prompt render endpoint available on Gateway URL • API key default config now locked from overrides |
New Models & Integrations | • GPT-4.1 • Gemini 2.5 Pro and Flash • LLaMA 4 via Fireworks, Together, Groq • o1-pro • gpt-image-1 • Qwen 3 • Audio models via Groq |
Guardrails | • Azure AI Content Safety integration • Exa Online Search as a Guardrail |
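The strict metadata precedence listed above (`Workspace > API Key > Request`) can be sketched as a merge where higher-precedence defaults always win over request-supplied values. The function and field names below are illustrative, not Portkey's actual implementation:

```python
def resolve_metadata(workspace_defaults, api_key_defaults, request_metadata):
    """Merge metadata dicts with strict precedence:
    Workspace Default > API Key Default > Incoming Request.
    A request-level value survives only if no higher level sets the same key."""
    resolved = dict(request_metadata)    # lowest precedence
    resolved.update(api_key_defaults)    # overrides request values
    resolved.update(workspace_defaults)  # overrides everything
    return resolved

merged = resolve_metadata(
    workspace_defaults={"env": "prod"},
    api_key_defaults={"team": "ml"},
    request_metadata={"env": "dev", "trace": "abc123"},
)
# merged == {"env": "prod", "team": "ml", "trace": "abc123"}
```

Note how the request's `env: dev` is discarded because the workspace default sets `env`, while `trace`, which no higher level defines, is kept.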
`use_retry_after_header`: a new retry setting. When set to `true` and the provider returns a `retry-after` or `retry-after-ms` header, the Gateway uses those headers to determine retry wait times rather than applying the default exponential backoff for 429 responses.

`anthropic_beta`: pass this parameter in Bedrock’s Anthropic API via Portkey to enable Claude’s Computer use beta feature.

Metadata precedence: metadata is now resolved strictly as `Workspace Default > API Key Default > Incoming Request`. This gives org admins better control and ensures that values they set are not overridden.
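The retry behavior described above could be expressed in a gateway config along these lines. The `attempts` and `on_status_codes` fields follow Portkey's documented retry options; treat the exact schema here as a sketch rather than a definitive reference:

```python
import json

# Sketch of a gateway retry config. With use_retry_after_header enabled,
# the Gateway honors the provider's retry-after / retry-after-ms headers
# on 429 responses instead of its default exponential backoff.
config = {
    "retry": {
        "attempts": 3,
        "on_status_codes": [429],
        "use_retry_after_header": True,
    }
}

print(json.dumps(config))
```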