Enterprise Gateway - Portkey Docs

Schedule Call

Discuss how Portkey’s AI Gateway can enhance your organization’s AI infrastructure

1.15.8

2025-10-07

v1.15.8

Provider Updates

Anthropic:
- Fixed count_tokens endpoint mapping
AWS Bedrock:
- Added empty usage object in message_start event for unified messages endpoint stream response to ensure compliance with Anthropic and avoid Claude Code errors

Fixes and Improvements

Added handling for empty span ID in OTel log transformation
Added support for fetching base model from inference profile in unified batches output handler

1.15.7

2025-10-03

v1.15.7

Provider Updates

AWS Bedrock:
- Fixed cache_control parameter mapping for tools in /messages endpoint.
- Fixed streaming response handling for tool_use, thinking, and redacted_thinking content_block_start events in /messages endpoint.

1.15.6

2025-10-02

v1.15.6

Features

Support pricing for image editing models, currently supported for providers: openai, azure-openai, and azure-foundry
Support input_audio parameter for vertex provider.
Allow pass through parameters for bedrock batch create endpoint.

Fixes

Handle batch fetch errors gracefully for batch output endpoint.

1.15.5

2025-09-30

v1.15.5

OTEL Exporter Enhancements

Added support for http/protobuf transport protocol.
Set OTEL_EXPORTER_OTLP_PROTOCOL environment variable to http/protobuf to use this.

Fixes

Prisma AIRS Guardrail: Fixed an issue with guardrail registration during initialization.

1.15.4

2025-09-25

v1.15.4

Provider Updates

OpenAI and Azure-OpenAI:
- Fixed parallel_tool_calls parameter mapping for /responses API.
- Added support for additional parameters for /responses API:: max_tool_calls, safety_identifier and top_logprobs

1.15.3

2025-09-19

v1.15.3

Provider Updates

Vertex AI: Handle empty responses returned by the provider
AWS Bedrock:
- Added support for APAC cross region inference profiles
- Added support for performance_config parameter which will be passed as-is to the provider as performanceConfig parameter
Azure Foundry and Github: Updated the parameter mapping to support all the latest OpenAI compatible chat completions parameters
OpenAI and Azure-OpenAI: Updated the tokenizer to support streaming request token calculation for latest gpt-5 models

1.15.2

2025-09-16

v1.15.2

New Features

KMS Support for file uploads to bedrock/AWS.
Support custom scope for entra auth to use with deprecated azure serverless models.
Custom Header support for OTEL Export of analytics data

Improvements and Fixes

Support Inference Profiles when uploading files to Bedrock for batches & finetuning.
Vertex global region support.
Cache cleanup for azure entra and managed identity authentication modes.
Wait for upstream websocket to be connected for Realtime APIs.

1.15.1

2025-09-09

v1.15.1

Improvements and Fixes

Fixed incorrect Portkey 429 error for token-based rate limiting when used with passthrough requests

1.15.0

2025-09-04

v1.15.0

Conditional Router Enhancements

Conditional router config strategy now supports conditions on request path
Documentation Link

Unified finish_reason

Unified finish_reason across all providers. By default, the value is mapped to an OpenAI-compatible value. If x-portkey-strict-openai-compliance is set to false, the original provider-returned value is retained

Gemini 2.5 Flash Image Model

Gemini 2.5 Flash Image model is now supported
Documentation Link

Unified Count Tokens Endpoint

Introduced unified endpoint for counting tokens across AWS Bedrock, Vertex AI, and Anthropic

Metadata-Based Model Access Guardrail

Introduced a new guardrail to restrict model access based on metadata key-value pairs

New Base Providers

Meshy
Tripo3D

Provider Updates

Vertex AI:
- Added support for Mistral models
- Added support for task_type and dimensions parameters in Vertex AI batch embeddings
AWS Bedrock: Added video support in chat completions

1.14.4

2025-08-29

v1.14.4

Improvements and Fixes

Resolved Authorization header conflict for passthrough requests. This issue occurred when the Portkey API key was sent in the Authorization header instead of the x-portkey-api-key header
Minor bug fixes for the Azure Foundry provider (azure-ai) in Entra and managed auth modes. Note that this does not affect the Azure OpenAI provider (azure-openai)

1.14.3

2025-08-28

v1.14.3

Improvements and Fixes

Fixed backward compatibility issue for the models endpoint when used with default configs. The new models endpoint will only be used when the incoming request does not have a provider, virtual_key, or a config (default or explicitly sent) with any of these fields. Otherwise, the request will be proxied to the upstream provider endpoint (same as the old flow)

1.14.2

2025-08-26

v1.14.2

Regex Replace Guardrail

Added a new guardrail that can replace regex patterns with a specified string

1.14.1

2025-08-19

v1.14.1

Provider Updates

Anthropic: Handle tool index when multiple tools are returned in streaming response
OpenAI and Azure OpenAI: Updated stream handling to log newly introduced fields of the usage object
AWS Bedrock: Fixed token calculation error for Bedrock messages response when cache tokens were returned by the provider

Improvements and Fixes

Updated pricing configurations for multiple providers and models
Fixed an issue where metadata labels for Prometheus metrics were getting dropped

1.14.0

2025-08-15

v1.14.0

Performance Improvements

Multiple performance improvements including:
- Removed redundant JSON operations
- Upgraded Hono framework for better performance
- Removed redundant LLM cache key creation

Provider Updates

DashScope: Updated the supported parameters
Vertex AI: Added timeRangeFilter support for Google Search tool
Fireworks:
- Handle non-ASCII characters in file upload
- Removed unnecessary response transforms to reduce processing time
OpenAI and Azure OpenAI: Added new parameters for GPT-5 compatibility
OpenRouter: Return reasoning messages, if returned by the model

Improvements and Fixes

Return the correct timeout value in webhook guardrail response

1.13.3

2025-08-13

v1.13.3

Improvements and Fixes

Allow Portkey API key in Authorization header for the unified models endpoint

1.13.2

2025-08-12

v1.13.2

Unified Models Endpoint

Released unified models API which follows OpenAI API specification to list all available models that can be used through Portkey
Documentation Link

1.13.1

2025-08-06

v1.13.1

Analytics Enhancements

Added new analytics data point to capture granular processing time for requests

OpenAI Chat Completions Improvements

Improvements to eliminate extra processing time for OpenAI chat completions responses

1.13.0

2025-08-04

v1.13.0

Unified Messages API

Released unified messages API which follows Anthropic’s messages API specification
Available for AWS Bedrock, Anthropic, and Vertex AI models
Documentation Link

1.12.0...1.12.5

2025-08-02

v1.12.0…v1.12.5

NOTE:

All builds between v1.12.0 and v1.12.5 were part of the Model Catalog early rollout

Model Catalog

Released Model Catalog support along with the latest API specification updates
Documentation Link

Workspace Budget Limits

You can now enforce budget and rate limits at the workspace level
Documentation Link

Circuit Breaker

Introduced circuit breakers which can be added per-strategy in configurations
Documentation Link

Automatic User Attribution

Introduced automatic _user metadata attribution when User API keys are used

Provider Updates

DeepSeek: Added support for response_format parameter

New Base Providers

Qdrant
DashScope

Improvements and Fixes

Removed headers from webhook guardrail response to avoid returning sensitive details

1.11.11

2025-07-26

v1.11.11

Replication for Analytics Data

Added support for analytics data replication
This is only applicable for air-gapped deployments

1.11.10

2025-07-22

v1.11.10

Provider Updates

Fireworks: Added support for prompt_cache_max_len parameter

Improvements and Fixes

Enhancements for unified batch output handling

1.11.9

2025-07-10

v1.11.9

S3 Log Store Enhancements

Added support for Object Lock and Retention-enabled buckets
Required environment variable: LOG_STORE_OBJECT_LOCK_RETENTION_ENABLED="true"

Unified Finish Reason

Unified finish_reason values across Anthropic and Bedrock models
If x-portkey-strict-openai-compliance is set to false, the provider-returned value will be retained

Provider Updates

Vertex AI: Added support for cost calculation for fine-tuned models
AWS Bedrock:
- Added support for Computer Use tool for Anthropic models
- Handle backward compatibility for Titan G1 embeddings model encoding_format parameter
- Fixed token calculation for Bedrock cache read and write tokens
Cohere: Handled null values for embeddings encoding_format parameter
Azure AI: max_completion_tokens parameter will now be forwarded as-is instead of being mapped to max_tokens
OpenAI and Azure OpenAI: Added support for web_search_options parameter for chat completions endpoint

Improvements and Fixes

Better handling for control plane synchronization when multiple Gateways are deployed

1.11.8

2025-06-28

v1.11.8

Unified Batches Improvements

Optimizations to handle large batch output files
Use usage object returned by Anthropic during batch output processing

Unified Finish Reason

Unified finish_reason values across Anthropic and Bedrock models
If x-portkey-strict-openai-compliance is set to false, the provider-returned value will be retained

Provider Updates

AWS Bedrock: Return finish_reason in error response

Improvements and Fixes

Updated pricing configurations for multiple providers and models
Better error logging for Gateway exceptions

1.11.7

2025-06-27

v1.11.7

Provider Updates

Azure OpenAI: Added Azure Managed Identity support for Azure Containers

Improvements and Fixes

Fixed an issue with Bedrock Signature calculation for passthrough requests
Better error logging for Azure Managed Identity errors

1.11.6

2025-06-21

v1.11.6

Provider Updates

Groq: Added support for service_tier parameter in the Groq provider configuration
Anthropic: Added support for Anthropic’s prompt caching for tool results and tool use
Anthropic: Fixed multi turn tool calling when arguments to the tool call is empty

Improvements and Fixes

Fixed an issue with Auth enabled Aws Redis Cache with Password and cluster mode
Handled Webhook Guardrail errors and return verdict with the correct status and error

1.11.5

2025-06-18

v1.11.5

Guardrails

Added support for metadata keys plugin to enforce metadata keys from the request.

1.11.4

2025-06-17

v1.11.4

Provider Updates

Bedrock: Added support for AssumedRole for bedrock application inference profiles
Bedrock Multimodal Embeddings: Added support for multimodal embeddings for providers cohere and titan.
Azure Foundry: Added support for createTranscription,createTranslation, imageGeneration, batch and files endpoints.
Anthropic: Added Support for computer use tool.
Anthropic: Added support for file_url and mime_type for file content parts in Anthropic requests.
VertexAI: Added support for Gemini/Vertex Thinking mode.

Cache Improvements

Added support for Azure Redis with auth modes EntraID and ManagedIdentity

Fixes And Improvements

Improvements for Redis Cache
- Added support for separate username and password for Redis Cache. Use REDIS_USERNAME and REDIS_PASSWORD environment variables.
- Added support for Azure Redis Cache. Use CACHE_STORE with azure-redis as value.
- Added support for Managed Identity for Azure Managed Redis.
  - You can pass AZURE_REDIS_AUTH_MODE and AZURE_REDIS_MANAGED_CLIENT_ID for a different auth setup.
  - Defaults to AZURE_AUTH_MODE and AZURE_MANAGED_CLIENT_ID if not provided
- Added support for Entra ID for Azure Redis Cache.
  - You can pass AZURE_REDIS_AUTH_MODE and AZURE_REDIS_ENTRA_CLIENT_ID, AZURE_REDIS_ENTRA_CLIENT_SECRET, AZURE_REDIS_ENTRA_TENANT_ID for a different auth setup.
  - Defaults to AZURE_AUTH_MODE and AZURE_ENTRA_CLIENT_ID, AZURE_ENTRA_CLIENT_SECRET, AZURE_ENTRA_TENANT_ID if not provided
HTTPS Proxy
- Added HTTPS Proxy support for all the external calls.
- Pass HTTPS_PROXY environment variable to enable this feature.
Added support for virtual key inclusion for custom log if passed in headers.
Fixed issue with proxy calls not working with configs for some providers.

1.11.3

2025-06-06

v1.11.3

Observability

Prometheus Metrics are migrated to use endpoints instead of path for all the metrics

Fixes And Improvements

Added a global error handler for all the unhandled exceptions to prevent server crashes.
Updated JWT Plugin to validate iat field

1.11.2

2025-06-03

v1.11.2

Fixes And Improvements

Fixed IRSA Web Identity token handling issue for Log Store that was introduced in v1.11.1

1.11.1

2025-06-02

v1.11.1

Provider Updates

OpenAI: Added support for background and service_tier parameters
Azure: Added support for custom hosts and private links for Azure Plugins
Bedrock: Added native support for inference profiles Ref

OTEL Traces Collector endpoints

Added new endpoint /v1/otel/v1/traces to collect any OTEL traces as Portkey traces

Log Exports available on Data Plane

Log exports are now available on Data Plane
Export logs without them being sent to Control Plane Docs Link
Please note that, Dataservice is required for log exports to work via Data Plane.

Fixes And Improvements

Fixed cache bug where Bedrock and Vertex requests were getting responses from the wrong models
Added support for fetching environment variables from mounted paths

1.11.0

2025-05-17

v1.11.0

Provider Updates

Bedrock: Fixed cache token calculation for streaming requests

Fixes And Improvements

Added mTLS support for Gateway to internal services in air-gapped deployments

1.10.23

2025-05-15

v1.10.23

Provider Updates

Vertex: Added support for dimensions parameter in multi-modal embeddings

Plugins

JWT Plugin: Added JWT authentication for runtime validation

Fixes And Improvements

Added pricing support for gpt-image-1 model

1.10.22

2025-05-08

v1.10.22

Enhanced Anthropic PDF Support

Added transformation logic to support OpenAI-spec-compatible file content parts in request.
Introduced two new Portkey parameters for the file content parts: file_url and mime_type.
Applicable for Anthropic, Bedrock-Anthropic and VertexAI-Anthropic models.
Docs

OTEL Configurations

Added two new environment variables:
- OTEL_SERVICE_NAME: Sets the service.name resource attribute value.
- OTEL_RESOURCE_ATTRIBUTES: Comma-separated key=value pairs which will be sent as individual resource attributes.

New Providers

Ncompass: Supports chat completions endpoint.
Lepton: Supports chat completions, completions and transcriptions endpoints.
Snowflake Cortex: Supports chat completions endpoint.

Provider Updates

Groq: Handled an exception that occurred when stream_options was included in the request, because the response transformer was not handling usage chunk mapping as expected.
Workers AI: Added support for /images/generations route.
Openrouter: Added support for usage request parameter and response mapping.
Deepinfra: Handled ping event returned in stream and mapped usage field returned in the response.
VertexAI: Now returns 400 (instead of 500) for empty model validation errors.

Fixes And Improvements

For providers except OpenAI and Azure-OpenAI, updated the value of object field in chat completions response from chat_completion to chat.completion for OpenAI spec compliance.
Proxy (Passthrough) Requests: Fixed endpoint construction logic, which was affecting a few provider-route combinations.
Unified Batches: Fixed issue where embeddings batch output was being returned as empty.
Prometheus: Fixed issue where model label was set as N/A for Bedrock requests.

1.10.21

2025-04-30

v1.10.21

AWS Bedrock Prompt Caching

Added support for AWS Bedrock prompt caching.
Docs Link

VertexAI Gemini 2.5 Thinking Param Support

Added support for the thinking settings parameters for VertexAI.

General Purpose File Upload For VertexAI

Documentation coming soon.

Provider Updates

Azure-OpenAI and Azure Foundry: Added cost calculation support for OpenAI finetuned models.
Bedrock:
- Handled tool role messages with empty content to avoid validation errors.
- Added response_format support for Deepseek partner models
VertexAI:
- Handled the unsupported $schema property in tools properties JSON Schema.
- Inference support for fine-tuned Gemini models.
Groq: Added support for translations, transcriptions and speech endpoints.

Fixes And Improvements

Fixed blocklist handling for Azure Content Safety guardrail.
Fixed Fireworks dataset upload validation error.

1.10.20

2025-04-23

v1.10.20

OpenAI Embeddings Latency Improvements

Improved response handling for OpenAI Embeddings resulting in significant reduction in response processing latency.

Strict Metadata Enforcement

Updated the preference for metadata logging. The new order is Workspace Default Metadata > API Key Default Metadata > Incoming Request Metadata.
This provides better control to organisation and workspace admins. Values set by admins cannot be overridden by request level metadata fields.

Strict Default Config Enforcement

Added support to disable default config override for API keys. If config override is not allowed and user tries to send a new config in request as well, Gateway will throw a 400 error.

Provider Updates

VertexAI: If a batch record failed on the provider’s end, the error will be retained in the final batch output file.
AzureOpenAI: Fixed URL path construction logic for non-completions requests like batches and files where an extra /v1 was getting added in the final URL, causing request failures.

Fixes And Improvements

Fixed an edge case where Batches, Files and Fine-tune endpoint threw an error when the passed config had targets field with a single virtual key in it.

1.10.19

2025-04-18

v1.10.19

OTEL Metrics Push

Added support for pushing Portkey Clickhouse analytics (traces and spans) to OTEL collector.
The following environment variables will be used to configure OTEL collector:
- OTEL_PUSH_ENABLED
- OTEL_ENDPOINT

Milvus Vector Store for Semantic Caching

Added support for Milvus vector store for semantic caching.
The following Vector stores are now supported:
- pinecone
- milvus
The following environment variables will be used to configure the Milvus vector store:
- VECTOR_STORE
- VECTOR_STORE_ADDRESS
- VECTOR_STORE_API_KEY
- VECTOR_STORE_COLLECTION_NAME

Azure Guardrails Support

Added support for Azure Content Safety API
Added support for PII detection with Azure Language Service

Prompts Render Endpoint

Prompts Render endpoint is now a part of the Gateway. It is available at /v1/prompts/:id/render.

Provider Updates

Vertex AI: Added support for dimensions for embeddings

Minor Enhancements

Prometheus Metric: Added portkey_processing_time_excluding_last_byte_ms metric which provides Portkey processing time excluding the LLM last byte diff latency (llm_last_byte_diff_duration_milliseconds).

1.10.18

2025-04-16

v1.10.18 (Redacted)

Redaction notice

This release introduced a critical bug in budget enforcement. We are redacting this release and will be releasing a patch with out Workspace Budget and related changes.

Workspace Level Usage and Rate Limits

Organisations can now enforce usage limits for each workspace
Organisations can now enforce rate limits for each workspace

OTEL Metrics Push

Added support for pushing Portkey Clickhouse analytics (traces and spans) to OTEL collector.
The following environment variables will be used to configure OTEL collector:
- OTEL_PUSH_ENABLED
- OTEL_ENDPOINT

Milvus Vector Store for Semantic Caching

Added support for Milvus vector store for semantic caching.
The following Vector stores are now supported:
- pinecone
- milvus
The following environment variables will be used to configure the Milvus vector store:
- VECTOR_STORE
- VECTOR_STORE_ADDRESS
- VECTOR_STORE_API_KEY
- VECTOR_STORE_COLLECTION_NAME

Azure Guardrails Support

Added support for Azure Content Safety API
Added support for PII detection with Azure Language Service

Provider Updates

Vertex AI: Added support for dimensions for embeddings

1.10.17

2025-04-11

v1.10.17

OpenAI and Azure OpenAI Response API

Added end-to-end support for the Response API.
Implemented caching for stream requests.
Introduced cost calculation for tools like web_search, file_search and code_execution.

Azure AI Foundry Enhancements

Updated the existing Azure Inference integration to directly accept endpoints from the Azure Foundry dashboard.

Retry Enhancements

Introduced a new retry setting use_retry_after_header. When set to true, if the provider returns the x-retry-after or x-retry-after-ms headers, Gateway will use these headers for retry wait times instead of applying the default exponential backoff for 429 responses.

Configurable Default Cache TTL

Default max cache TTL can now be set at the organisation level.

Provider Updates

Azure OpenAI: Added support for logprobs and top_logprobs request parameters.
Perplexity: Added support for response_format and search_recency_filter request parameters.
AWS Bedrock: Handled empty assistant tools messages containing only newline characters (\n\n).

Improvements

Gateway will now populate the model field in responses for the /chat/completions API if the providers do not natively return this field, ensuring alignment with the OpenAI signature.

1.10.16

2025-04-09

v1.10.16

Improvements

File Upload:
- Handle file upload failures for Bedrock in some scenarios
Unified Batch API:
- Return error_file_id content in the batch output for failed file uploads for OpenAI and Azure OpenAI providers.

1.10.15

2025-04-02

v1.10.15

Improvements

File Upload:
- Support for uploading large files to Providers and Data Service
Allow users to pass custom mime-types in the request body. For example:

    {
    "model": "gemini-1.5-pro",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant!"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "<image-url>",
                        "mime_type": "image/jpeg"
                    }
                }
            ]
        }
    ]
    }

1.10.14

2025-03-28

v1.10.14

Enforce Organisation And Workspace Guardrails

It is now possible to enforce guardrails at organisation and workspace levels, which will be applied to all requests.
Documentation: Workspace-Level Guardrails, Organisation-Level Guardrails

Unified Finetuning APIs for Fireworks

Extended the existing unified finetuning APIs to support Fireworks provider.

Pricing Updates

Added support for calculating Perplexity search cost and Gemini grounding cost.

Updated Unified API Signature For Anthropic Extended Thinking

Updated the unified API signature for Extended thinking which was introduced in v1.10.12 to ensure that OpenAI compliant field of the response remain untouched regardless of strict_open_ai_compliance flag.
More Details:

Unified Batches API Improvements

custom_id will be preserved in the VertexAI batch output.
Fixed some issues with batches cost calculation.

Logging Updates

Non-OpenAI compliant fields like groundingMetadata (Gemini Grounding), citations (Perplexity Search) and extended thinking response will now be logged for stream responses. Previously, these fields were not logged specifically for streaming response.

Provider Updates

Fireworks: Added support for logprobs and top_logprobs parameters.

Fixes and Improvements

Added new environment variable (AWS_ENDPOINT_DOMAIN) which can be used to override the default value (amazonaws.com)
Fixed an edge case where before_request_hook failures were not getting flagged with 246 response status code for cached and non-cached stream responses.

1.10.13

2025-03-20

v1.10.13

Unified Batches APIs for VertexAI Embedding

Added support for batch processing of embeddings with Vertex AI.

Provider Updates

AWS Bedrock
- Multi-Turn Conversation With Tools:
  - Handled assistant messages where content is set as null and tool_calls are passed.
OpenAI
- Fixed an edge case (introduced in the previous version) which was causing issues in cost calculation of fine-tuned models.

Fixes and Improvements

Fixed batch pricing calculation issue for VertexAI and Anthropic Bedrock models.
Fixed an edge case where the x-portkey-retry-attempt-count response header was set to -1 even when no retries were configured.
Improved handling to skip stream mode detection for irrelevant request types. For example: stream mode detection should not happen for any GET requests as it is not supported.
Removed redundant AWS credential fetch failures at boot time.

1.10.12

2025-03-18

v1.10.12

Real-Time Model Pricing Sync

Model pricing configs are no longer coupled with gateway builds.
For hybrid deployments, model pricing configs will be fetched from the control plane.

Unified API Signature For Anthropic Thinking

Introduced a unified API signature to support single-turn and multi-turn conversations with Anthropic Extended Reasoning across Anthropic, AWS Bedrock and VertexAI.
More Details:

Prometheus Metric Updates

Added a new metric (llm_last_byte_diff_duration_milliseconds) to track LLM last byte latency for chunked JSON responses.
Added a new label (stream) for all metrics. Possible values: 0/1

Guardrails Updates

AWS Bedrock: Added handling to flag regex patterns returned by the guardrail.

Provider Updates

Azure OpenAI: Mapped the correct model name from multi-deployment virtual keys.

Fixes and Improvements

Portkey 500s are now logged in the console for debugging.

Internal POD to POD HTTPS Support

Added support for internal POD to POD HTTPS communication.
This can be enabled by mounting a volume with certificate and key.
TLS_KEY_PATH and TLS_CERT_PATH environment variables will be used to fetch the certificate and key from the volume.

1.10.11

2025-03-06

v1.10.11

Provider Updates

AWS Bedrock
- Added support for encryption key usage when uploading files to S3.
VertexAI:
- Minor updates to streamline the unified spec for batches and fine-tune APIs.
- Updated pricing for gemini-2.0-flash-lite models.
- Added support for webm mimeType.
Openrouter
- Mapped the usage object for streaming responses.
Azure Inference
- Replaced extra-parameters: ignore with extra-parameters: drop due to deprecation by Azure.
OpenAI and Azure OpenAI
- Update pricing for GPT 4.5 models

1.10.10

2025-02-27

v1.10.10

Unified Finetuning APIs for VertexAI

Extended the existing unified finetuning APIs to support VertexAI.
The File-upload and transformations will be done according to the provider requirements.

Body Params Support in Conditional Router

Added support for using params to specify body fields in conditional router queries. Previously, only metadata-based routing was supported.

Streaming Cache Responses Optimization

Increased stream chunk content size from 1 token to 125 tokens for cached responses. This reduces the number of chunks significantly (e.g., 2000 tokens now stream in ~16 chunks instead of 2000 chunks).
Improved last chunk delivery time.
In addition to latency improvements, this update reduces unnecessary network overhead caused by the large number of chunks.

AWS IRSA-based Authentication Updates

Switched from the default global STS endpoint to regional STS endpoints (for Bedrock and S3 requests) to ensure proper token generation when the global STS is unavailable from the instance.

Provider Updates

Anthropic:
- Better error handling for error type stream chunks returned by the provider.
- Pricing updates for Claude 3.7 models across Anthropic, Bedrock and VertexAI.

1.10.9

2025-02-20

v1.10.9

Redis Cache Optimization

Updated cache implementation to avoid redundant Redis calls to improve overall performance.

VertexAI Service Account Token Caching

Implemented caching for Vertex service account token. Previously, tokens were being regenerated on every request despite having 1-hour validity.
This will reduce VertexAI request latency by 50-100ms per request.

Provider Updates

Google and VertexAI
- Handled tool call response parsing when there is one part tool call and one part text.
- Made the default/empty usage object compliant with OpenAI for streaming response.

1.10.8

2025-02-13

v1.10.8

Mutator Webhooks

The existing webhook plugin now has mutation capability.
This can be used for use-cases like BYO-PII redaction guardrail.

Configurable Timeouts for Guardrails

It is now possible to set timeout values for Guardrail execution. The current default value is 5 seconds.
timeout parameter can be used for all the guardrails that make a fetch call internally.
It is also possible to store this timeout value in control plane while creating/updating a Guardrail on UI.

Provider Updates

AzureOpenAI: Added support for stream_options parameter.

1.10.7

2025-02-10

v1.10.7

Fixes and Enhancements

Fix: Allow empty body in POST and PUT requests. Gateway was adding empty object as a default body for POST and PUT requests. This caused issues for APIs like POST assistants cancel or POST batches cancel where the upstream provider does not accept body at all.

1.10.6

2025-02-07

v1.10.6

Unified Batches APIs for AzureOpenAI

Extended the unified batches APIs to support AzureOpenAI batching.

Provider Updates

Deepseek Models: Added support for Deepseek models across multiple inference providers like Fireworks, Groq and Together.

Fixes and Enhancements

Chore: Allow budget exhausted user API keys to view logs. Control plane uses user API keys to fetch UI logs from the Gateway. Budget exhaustion of these keys should not have blocked logs view.

1.10.5

2025-02-06

v1.10.5

JWT Auth

Added support for JWT based authentication and authorization.
Customers can configure their JWKS endpoint or the JWKS JSON.

Unified Batches APIs for VertexAI

Extended the unified batches APIs to support VertexAI batching.

Provider Updates

Google and VertexAI: Updated the Grounding implementation to support their new API signatures. Docs Link
AWS Bedrock: Handle edge cases for AWS Bedrock file uploads.

Fixes and Enhancements

Logging: Added exception details like cause and name in logs for provider level fetch failures.
Caching: Enabled caching even when the debug flag is set to false.

1.10.4

2025-01-29

v1.10.4

PII Redaction Guardrails

Added PII Redaction Guardrails through multiple guardrail providers:
- Portkey Managed
- AWS Bedrock
- Pangea
- Patronus
- Promptfoo
If any entities were redacted from request/response, the guardrail result object in the final response will contain a flag named transformed set to true.

Request Metadata Logging Updates

Workspace metadata will now logged on individual request level.

New Providers

Replicate: Now supported for proxy (passthrough) requests.

Fixes and Enhancements

Guardrails: Added ability to override default guardrail credentials (stored in control plane) with custom credentials at runtime.

1.10.3

2025-01-23

v1.10.3

AzureOpenAI Unified Finetuning Support

Extended the unified finetuning APIs to support AzureOpenAI provider.

AWS Bedrock Guardrails

AWS Bedrock Guardrails are now supported for request/response checks.
Here is a short document which can be used to setup this with Portkey.

Virtual Keys for Custom Models/Providers

It is now possible to configure custom host and custom headers directly in the virtual keys.
If your custom model’s API signature matches any of our existing providers, you can create a virtual key with your custom settings.
While this functionality was already available, it has now been integrated directly into virtual keys for more streamlined configuration.

Prometheus Metric Updates

Updated the units for LLM request duration histogram metrics to milliseconds. The label has been renamed from llm_request_duration_seconds to llm_request_duration_milliseconds
Added a new metric named portkey_request_duration_milliseconds to track Portkey’s processing latency.

New Providers

Milvus DB: Supported as a passthrough provider.

Provider Updates

VertexAI and Google Improvements
- Added logprobs support compatible with OpenAI format via logprobs and top_logprobs parameters
- Added support for experimental Gemini Thinking Models.
- Added tool parameters JSON schema handling to ignore/skip fields which are not compatible with these 2 providers.
Anthropic: Added total_tokens in stream response to make it compliant with OpenAI spec.

1.10.2

2025-01-14

v1.10.2

Provider Updates

VertexAI: VertexAI requests that sent the virtual key and config headers separately were failing with a provider 401 error. This was happening specifically for VertexAI requests where the virtual key was sent as a separate header along with a config header.

1.10.1

2025-01-13

v1.10.1

Unified Finetune APIs

Added unified finetune APIs for OpenAI, AzureOpenAI, Bedrock and Fireworks.

Fixes and Enhancements

Code Detection Guardrail Updates: Added checks for verbose identifiers to detect python and js markdown code blocks. Example: check for python and javascript along with py and js identifiers.

1.10.0

2025-01-03

v1.10.0

Unified Batches and Files API

Added unified batching APIs for OpenAI, AWS Bedrock and Cohere
Docs Link

Improved Batch Management for Analytics Data Inserts

Improved Clickhouse batch management to prevent log drops.
Notable reduction in memory usage growth and spikes compared to previous builds.
We also recommend changing ANALYTICS_STORE env to control_plane (for hybrid deployments) so that batching/retries can managed by Portkey.

Gateway Docker Image Size Reduction:

Made some updates to the image build process, reducing the size (compressed) from ~275MB to ~75MB.

VertexAI Self-Deployed Models (a.k.a Endpoints in Vertex):

You can now use self-deployed models from VertexAI. This update also supports Vertex-Huggingface models.

Shorthand Format For Guardrails In Config:

Added input_guardrails and output_guardrails fields in config which accept array of guardrail slugs.

Guardrail Output Explanation

Guardrails responses now include an explanation property to clarify why checks passed or failed.
This property is currently only available for default checks.

OpenAI `developer` Role Support Across All Providers:

For OpenAI and AzureOpenAI, the role will be mapped as expected.
For other providers, the developer role is mapped to the system role (or its equivalent).

New Partner Guardrails

Mistral (mistral.moderateContent): Guard against different type of contents like hate_and_discrimination, violence_and_threats, etc.
Pangea (pange.textGuard): Guard against malicious content and other undesirable data.

Provider Updates

Cohere: Removed unsupported stream parameter from the Bedrock-Cohere integration.

Fixes and Enhancements

Image Cost Calculation: Updated the image calculation logic to handle different quality, size, etc. combinations.
ValidURL Guardrail: Updated the URL extraction logic to handle more edge cases.
Prompt Render Error Message: Prompt render API (/render) is a control plane API. Added detailed message to highlight this in case a user tries to use this API on their deployed Gateway.

1.9.5

2024-12-17

v1.9.5

Gemini Grounding Mode Support

Added Gemini grounding mode support in OpenAI compatible tools format.
Docs Link

Provider Updates

Groq: Fixed finish_reason mapping for streaming response.
AWS Bedrock: fixed the index mapping for tool call streaming response.
VertexAI: fixed final model param mapping for VertexAI Meta partner models.

Fixes and Enhancements

Proxy (Passthrough) Requests: fixed audio/* content-type passthrough request handling.

1.9.4

2024-12-11

v1.9.4

Enhanced Request/Response Logging

Added comprehensive logging for all request/response phases:
- Original request
- Transformed request
- Original response
- Transformed response

Prometheus Metrics Standardization

Standardized all Prometheus metric labels to use a consistent set:
- method
- route
- code
- custom_labels
- provider
- model
- source

Provider Updates

Ollama and Groq
- Added support for tools.

1.9.3

2024-12-06

v1.9.3

Allow All S3-compatible Log Stores

Added a new LOG_STORE type named S3_CUSTOM which can be used to integrate any S3-compatible storage service for request logging.
The custom host for the storage provider can be set in LOG_STORE_BASEPATH.

New Provider - AWS Sagemaker

AWS Sagemaker models can now be used through Gateway as passthrough requests.
Unified API signature is not yet possible because Sagemaker inherits the request body structure from the underlying model.
Docs Link

1.9.2

2024-11-29

v1.9.2

Proxy (Passthrough) Request Enhancements

Added streamlined support for virtual keys and configs in proxy (passthrough) requests.

Prompt Labels

Added support for labelled prompt cache invalidation whenever an update happens on control plane side.
NOTE: Prompt labels is a control plane change and has no major updates in Gateway apart from cache key invalidation for labelled prompt keys.
Docs Link

S3 Integration Enhancements

Allow sub-paths in bucket name for logs.

Provider Updates

Perplexity: Allow citations in response if strict_open_ai_compliance flag is set to false.
AWS Bedrock
- Stringify the response tool arguments to make it OpenAI compliant.
- Merge successive user messages to avoid Bedrock errors.
Openrouter: Handle cost calculation when input model is openrouter/auto.
Google: Fix the mapping for code in error response.

1.9.1

2024-11-25

v1.9.1

Provider Updates

OpenAI and AzureOpenAI
- For Realtime APIs, the socket close event now retains the original close reason returned by the provider.
- Added support for newly released prediction, store, metadata, audio and modalities parameters.
AWS Bedrock: Fixed an issue where an extra newline character was being returned in the AWS Bedrock response.

1.9.0

2024-11-20

v1.9.0

Dynamic Budgets and Auto Expiry for API Keys and Virtual Keys

Introduced support for setting dynamic budgets and auto-expiry for API keys and virtual keys.

Realtime API Integration

Added Realtime APIs integration for OpenAI and AzureOpenAI.
Docs Link

Provider Updates

VertexAI: Fixed structured outputs integration for VertexAI when using JS SDK. The SDK was adding extra fields in the JSON schema that were incompatible with Vertex’s API requirements.

1.8.4

2024-11-13

v1.8.4

Provider Updates

Azure OpenAI: Added encoding_format and dimensions as supported params.

Fixes & Enhancements

Updated the default behaviour to use IMDS/Service account role for Bedrock and S3.

1.8.3

2024-11-12

v1.8.3

Fixes & Enhancements

Fixed implementation conflicts of existing AWS AssumeRole implementation with the newly released IRSA (IAM Roles for Service Accounts) Assume Role and IMDS (Instance Metadata Service) Assume Role auth approaches.

1.8.2

2024-10-31

v1.8.2

Fixes and Enhancements

Added a new Prometheus metric to track LLM-only latency. Label name: llm_request_duration_seconds

1.8.1

2024-10-30

v1.8.1

Control Plane Log Store

Added a new log and analytics store named control_plane.
Setting LOG_STORE and ANALYTICS_STORE environment variables as control_plane will route all logs and analytics to the control plane and will eliminate the need of having Clickhouse connection on Gateway.

1.8.0

2024-10-25

v1.8.0

Bedrock Converse API integration

Bedrock’s /chat/completions have been updated to use Bedrock converse API.
This enables features like tool calls, vision, etc. for many bedrock models.
This also removes the hassle of maintaining chat templating logic for llama and mistral models.

VertexAI Image Generation

Added support for Vertex Imagen models.

Stable Diffusion v2 Models

StabilityAI introduced v2 models with a new API signature. Gateway now supports both v1 and v2 models, with internal transformations for different API signatures.
Supported for both stability-ai and bedrock providers.
New models: Stable Image Ultra, Core, 3.0 and 3.5.

Pydantic SDK Integration for Structured Outputs

Done for GoogleAI and VertexAI (follows OpenAI)
We previously added support for structured outputs through REST API. However, SDKs using Pydantic were not supported due to extra fields in the JSON schema.
Added a dereferencing function that converts JSON schemas from the library to Google-compatible schemas.

OpenAI and AzureOpenAI Prompt Cache Pricing

Added support for handling prompt caching pricing for required models.

New Providers

Lambda (lambda): Supports chat completions and completions.

Provider Updates

Perplexity: Added the missing [DONE] chunk for stream calls to comply with OpenAI’s spec.
VertexAI: Fixed provider name extraction logic for meta models, so users can send it like other partner models (e.g., meta.<model-name>).
Google: Added structured outputs support (similar to Vertex-ai).

Fixes & Enhancements:

Exclude files, batches, threads, etc. (all passthrough) from llm_cost_sum prometheus metric to avoid unnecessary labels.

Monthly Summary

Enterprise Releases

Product Releases

SDK Releases

Schedule Call

​v1.15.8

​Provider Updates

​Fixes and Improvements

​v1.15.7

​Provider Updates

​v1.15.6

​Features

​Fixes

​v1.15.5

​OTEL Exporter Enhancements

​Fixes

​v1.15.4

​Provider Updates

​v1.15.3

​Provider Updates

​v1.15.2

​New Features

​Improvements and Fixes

​v1.15.1

​Improvements and Fixes

​v1.15.0

​Conditional Router Enhancements

​Unified finish_reason

​Gemini 2.5 Flash Image Model

​Unified Count Tokens Endpoint

​Metadata-Based Model Access Guardrail

​New Base Providers

​Provider Updates

​v1.14.4

​Improvements and Fixes

​v1.14.3

​Improvements and Fixes

​v1.14.2

​Regex Replace Guardrail

​v1.14.1

​Provider Updates

​Improvements and Fixes

​v1.14.0

​Performance Improvements

​Provider Updates

​Improvements and Fixes

​v1.13.3

​Improvements and Fixes

​v1.13.2

​Unified Models Endpoint

​v1.13.1

​Analytics Enhancements

​OpenAI Chat Completions Improvements

​v1.13.0

​Unified Messages API

​v1.12.0…v1.12.5

​NOTE:

​Model Catalog

​Workspace Budget Limits

​Circuit Breaker

​Automatic User Attribution

​Provider Updates

​New Base Providers

​Improvements and Fixes

​v1.11.11

​Replication for Analytics Data

​v1.11.10

​Provider Updates

​Improvements and Fixes

​v1.11.9

​S3 Log Store Enhancements

​Unified Finish Reason

​Provider Updates

​Improvements and Fixes

​v1.11.8

​Unified Batches Improvements

​Unified Finish Reason

​Provider Updates

​Improvements and Fixes

​v1.11.7

v1.15.8

Provider Updates

Fixes and Improvements

v1.15.7

Provider Updates

v1.15.6

Features

Fixes

v1.15.5

OTEL Exporter Enhancements

Fixes

v1.15.4

Provider Updates

v1.15.3

Provider Updates

v1.15.2

New Features

Improvements and Fixes

v1.15.1

Improvements and Fixes

v1.15.0

Conditional Router Enhancements

Unified finish_reason

Gemini 2.5 Flash Image Model

Unified Count Tokens Endpoint

Metadata-Based Model Access Guardrail

New Base Providers

Provider Updates

v1.14.4

Improvements and Fixes

v1.14.3

Improvements and Fixes

v1.14.2

Regex Replace Guardrail

v1.14.1

Provider Updates

Improvements and Fixes

v1.14.0

Performance Improvements

Provider Updates

Improvements and Fixes

v1.13.3

Improvements and Fixes

v1.13.2

Unified Models Endpoint

v1.13.1

Analytics Enhancements

OpenAI Chat Completions Improvements

v1.13.0

Unified Messages API

v1.12.0…v1.12.5

NOTE:

Model Catalog

Workspace Budget Limits

Circuit Breaker

Automatic User Attribution

Provider Updates

New Base Providers

Improvements and Fixes

v1.11.11

Replication for Analytics Data

v1.11.10

Provider Updates

Improvements and Fixes

v1.11.9

S3 Log Store Enhancements

Unified Finish Reason

Provider Updates

Improvements and Fixes

v1.11.8

Unified Batches Improvements

Unified Finish Reason

Provider Updates

Improvements and Fixes

v1.11.7

Provider Updates

Improvements and Fixes

v1.11.6

Provider Updates

Improvements and Fixes