Run large‑scale inference jobs through one consistent endpoint
Mode | When to pick it | Works with |
---|---|---|
Provider Batch API | Cheapest for overnight or offline jobs. Uses the provider’s native batch endpoint and limits. | OpenAI, Azure OpenAI, Bedrock, Vertex AI, Fireworks |
Portkey Batch API | Fastest and provider‑agnostic. Batches at the Gateway layer; ideal when a provider has no native batch support or you need cross‑provider jobs. | Any provider supported by Portkey |
Portkey File (input_file_id) - required only when using the Portkey Batch API (Mode #2). See Files to upload one.
🔗 Full schema: see the OpenAPI reference.
Provider | Endpoints |
---|---|
OpenAI | completions, chat completions, embeddings |
Azure OpenAI | completions, chat completions, embeddings |
Bedrock | chat completions |
Vertex AI | chat completions, embeddings |
Fireworks | completions |
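
The support matrix above can be checked in code before a native batch job is submitted. The following is a minimal sketch, not part of any Portkey SDK: the dictionary simply transcribes the table, and the names `NATIVE_BATCH_ENDPOINTS` and `supports_native_batch` are hypothetical.

```python
# Provider -> endpoints with native batch support, transcribed from the table above.
# The dictionary and helper names are illustrative, not a Portkey API.
NATIVE_BATCH_ENDPOINTS = {
    "OpenAI": {"completions", "chat completions", "embeddings"},
    "Azure OpenAI": {"completions", "chat completions", "embeddings"},
    "Bedrock": {"chat completions"},
    "Vertex AI": {"chat completions", "embeddings"},
    "Fireworks": {"completions"},
}


def supports_native_batch(provider: str, endpoint: str) -> bool:
    """Return True if the table lists `endpoint` for `provider`."""
    return endpoint in NATIVE_BATCH_ENDPOINTS.get(provider, set())


print(supports_native_batch("Bedrock", "chat completions"))  # True
print(supports_native_batch("Bedrock", "embeddings"))        # False
```

When the check fails, the mode table at the top of this page suggests the Portkey Batch API as the fallback, since it works with any provider Portkey supports.
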
Property | Default | Notes |
---|---|---|
completion_window | 24h | Set by provider (cannot be shorter). |
Provider quota | Per provider | e.g., OpenAI ≤ 50k jobs/day. |
Retries | Provider‑defined | Portkey surfaces job status; no Gateway retry. |
Set completion_window to immediate and Portkey aggregates your requests in memory, then sends them to the target provider in fixed-size buckets.
Gateway default | Value |
---|---|
Batch size | 25 requests |
Batch interval | 5 s between flushes |
Retries | 3 per request (configurable via x-portkey-config) |
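
To make the defaults concrete, here is a rough sketch of how requests might be split into buckets of 25 with 5 s between flushes. It only models the arithmetic and is not the Gateway's implementation: it assumes the first flush happens at t = 0 and that buckets are filled in submission order, and `plan_flushes` and the `custom_id` field are illustrative names.

```python
from typing import Iterator, List, Tuple

BATCH_SIZE = 25       # Gateway default from the table above
BATCH_INTERVAL_S = 5  # seconds between flushes, also from the table


def plan_flushes(
    requests: List[dict],
    batch_size: int = BATCH_SIZE,
    interval_s: int = BATCH_INTERVAL_S,
) -> Iterator[Tuple[int, List[dict]]]:
    """Yield (offset_seconds, bucket) pairs, one per flush.

    Assumption: the first bucket is flushed at t=0 and each later bucket
    follows `interval_s` seconds after the previous one.
    """
    for start in range(0, len(requests), batch_size):
        yield (start // batch_size) * interval_s, requests[start:start + batch_size]


# 60 placeholder requests; `custom_id` is only an illustrative field name.
requests = [{"custom_id": f"req-{n}"} for n in range(60)]
schedule = list(plan_flushes(requests))
for offset, bucket in schedule:
    print(f"t+{offset}s: {len(bucket)} requests")
# t+0s: 25 requests
# t+5s: 25 requests
# t+10s: 10 requests
```

Under these assumptions a job of N requests needs ceil(N / 25) flushes, so the last bucket is dispatched roughly 5 × (ceil(N / 25) − 1) seconds after the first.
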
batch_size, batch_interval, and max_retries.
provider_job_id is absent and cost is computed from individual calls.
GET /batches/<batch_id>/output endpoint

Layer | What Portkey does | How to override |
---|---|---|
Gateway (Portkey Batch) | Retries 3× on network/429/5xx | x-portkey-config: {"retry": {"max_attempts": 5}} |
Provider (native batch) | Provider rules | Not configurable via Portkey |
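
The Gateway override in the table is a JSON value carried in the x-portkey-config header. The sketch below builds that header and mirrors the retry conditions listed above; the helper names are hypothetical, and the status check covers only HTTP 429 and 5xx responses, since network failures surface as exceptions rather than status codes.

```python
import json


def retry_config_header(max_attempts: int) -> dict:
    """Build the x-portkey-config header shown in the table above."""
    return {"x-portkey-config": json.dumps({"retry": {"max_attempts": max_attempts}})}


def is_retryable(status_code: int) -> bool:
    """Mirror the table: retry on 429 and on any 5xx response."""
    return status_code == 429 or 500 <= status_code <= 599


headers = retry_config_header(5)
print(headers["x-portkey-config"])  # {"retry": {"max_attempts": 5}}
print(is_retryable(503), is_retryable(400))  # True False
```

This applies only to the Portkey Batch layer; per the table, retries for native provider batches follow provider rules and cannot be configured through Portkey.
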
batch.read).

Term | Meaning |
---|---|
Batch Job | A collection of completion requests executed asynchronously. |
Portkey File (input_file_id) | A file uploaded to Portkey that is automatically uploaded to the provider for batch processing. Useful for reusing the same file across multiple batch jobs. |
Virtual Key | A logical provider credential stored in Portkey; referenced by its ID, not by the secret itself. |
Completion Window | Time frame in which the job must finish. immediate → handled by Portkey; 24h → delegated to provider. |
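
The Completion Window row amounts to a simple routing rule, which the following sketch expresses in code. It is an illustration of the glossary entry only: `batch_mode` is a hypothetical helper, and it rejects any value other than the two windows named on this page.

```python
def batch_mode(completion_window: str) -> str:
    """Map a completion_window value to the layer that handles the job."""
    if completion_window == "immediate":
        return "portkey"   # batched at the Gateway layer
    if completion_window == "24h":
        return "provider"  # delegated to the provider's native batch API
    raise ValueError(f"unsupported completion_window: {completion_window!r}")


print(batch_mode("immediate"))  # portkey
print(batch_mode("24h"))        # provider
```
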
batch_size, batch_interval, max_retries (Q3 2025)