Skip to main content

Features

Universal API

Use any of the supported models with a universal API (REST and SDKs)

Cache (Simple & Semantic)

Save costs and decrease latencies by using a cache

MCP Support

Connect to Remote MCP severs, allowing you to connect external tools and data sources.

Fallbacks

Fallback between providers and models for resilience

Conditional Routing

Route to different targets based on custom conditional checks

Multimodality

Use vision, audio, image generation, and more models

Automatic Retries

Setup automatic retry strategies

Circuit Breaker

Configure per-strategy circuit protection and failure handling

Load Balancing

Load balance between various API Keys to counter rate-limits

Canary Testing

Canary test new models in production

gRPC (Beta)

Use gRPC transport for lower latency and efficient binary serialization

Request Timeout

Easily handle unresponsive LLM requests

Budget Limits

Set usage limits based on costs incurred or tokens used

Rate Limits

Set hourly, daily, or per minute rate limits on requests or tokens sent

Using the Gateway

The various gateway strategies are implemented using Gateway configs. You can read more about configs below.

Configs

Open Source

We’ve open sourced our battle-tested AI gateway to the community. You can run it locally with a single command:
npx @portkey-ai/gateway
Contribute here. While you’re here, why not give us a star? It helps us a lot!
You can also self-host the gateway and then connect it to Portkey. Please reach out on [email protected] and we’ll help you set this up!
Last modified on February 13, 2026