Batches
Run batch inference with Portkey
Portkey supports batching requests in two ways:
- Batching requests with the Provider's batch API, using the unified API and a provider file
- Custom batching with Portkey's batch API
1. Batching requests directly with the Provider's batch API using the unified API
Portkey supports batching requests directly with the Provider's batch API through a unified API structure. The unified API structure is similar to OpenAI's batch API. Please refer to the individual provider's documentation for more details.
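For illustration, here is a minimal sketch of creating such a batch with a raw HTTP call to Portkey's gateway, assuming an OpenAI-style request body. The base URL, header values, endpoint path, and file ID are placeholder assumptions, not values taken from this page.

```python
# Hedged sketch: create a batch through Portkey's unified (OpenAI-style) batch API.
# Base URL, header values, endpoint path, and ids are illustrative assumptions.
import requests

PORTKEY_BASE_URL = "https://api.portkey.ai/v1"        # assumed gateway base URL
headers = {
    "x-portkey-api-key": "PORTKEY_API_KEY",           # your Portkey API key
    "x-portkey-virtual-key": "openai-virtual_key",    # provider credentials (virtual key)
    "Content-Type": "application/json",
}

# The payload mirrors OpenAI's batch API; input_file_id here is a file
# already uploaded to the provider (a provider file id).
payload = {
    "input_file_id": "file-abc123",                   # provider file id (hypothetical)
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",
}

resp = requests.post(f"{PORTKEY_BASE_URL}/batches", headers=headers, json=payload)
print(resp.json())
```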
2. Custom batching with Portkey’s batch API
Portkey supports custom batching with Portkey's batch API in two ways:
- Batching requests with the Provider's batch API using Portkey's file.
- Batching requests directly with Portkey's gateway using Portkey's file.
This is controlled by the `completion_window` parameter in the request:
- When `completion_window` is set to `24h`, Portkey will batch requests with the Provider's batch API using Portkey's file.
- When `completion_window` is set to `immediate`, Portkey will batch requests directly with Portkey's gateway.
Along with this, you have to set `portkey_options`, which tells Portkey how to route the batch requests to the Provider's batch API or the gateway.
- This is achieved by passing provider-specific headers in `portkey_options`.
- For example, if you want to use OpenAI's batch API, you can set `portkey_options` to `{"x-portkey-virtual-key": "openai-virtual_key"}`, as illustrated in the sketch below.
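As a rough illustration, a request body combining `completion_window` and `portkey_options` might look like the following; the exact field names and the placement of `portkey_options` in the body are assumptions for this sketch.

```python
# Hypothetical request body for Portkey's batch API; field placement is an
# assumption for illustration, not a confirmed schema.
batch_request = {
    "input_file_id": "portkey-file-id",                # a Portkey file id (hypothetical)
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h",                        # "24h" -> Provider's batch API, "immediate" -> gateway
    "portkey_options": {
        "x-portkey-virtual-key": "openai-virtual_key"  # provider-specific routing headers
    },
}
```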
2.1 Batching requests with the Provider's batch API using Portkey's file
`completion_window` needs to be set to `24h` and `input_file_id` needs to be a Portkey file ID. Please refer to Portkey's files for more details.
- Using Portkey's file, you can upload your data to Portkey once and reuse the content across your requests.
- Portkey will automatically upload the file to the Provider and reuse the content in your batch requests.
- Portkey will also automatically check the batch progress and perform post-batch analysis, including token and cost calculations.
- You can also get the batch output using the `GET /batches/<batch_id>/output` endpoint, as in the sketch below.
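A minimal sketch of this flow follows, assuming raw HTTP calls to Portkey's gateway. The base URL, header values, and IDs are illustrative assumptions; only the `GET /batches/<batch_id>/output` endpoint is taken from this page.

```python
# Hedged sketch: provider-backed batching (completion_window = "24h") with a Portkey file.
import requests

PORTKEY_BASE_URL = "https://api.portkey.ai/v1"        # assumed gateway base URL
headers = {
    "x-portkey-api-key": "PORTKEY_API_KEY",
    "Content-Type": "application/json",
}

# Create the batch; Portkey uploads the referenced Portkey file to the provider for you.
create = requests.post(
    f"{PORTKEY_BASE_URL}/batches",
    headers=headers,
    json={
        "input_file_id": "portkey-file-id",           # Portkey file id (hypothetical)
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
        "portkey_options": {"x-portkey-virtual-key": "openai-virtual_key"},
    },
)
batch_id = create.json()["id"]

# Once the batch completes, fetch its output via the endpoint named on this page.
output = requests.get(f"{PORTKEY_BASE_URL}/batches/{batch_id}/output", headers=headers)
print(output.text)
```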
2.2 Batching requests directly with Portkey's gateway
`completion_window` needs to be set to `immediate` and `input_file_id` needs to be a Portkey file ID. Please refer to Portkey's files for more details.
Using this method, you can batch requests to any provider, whether or not it supports batching natively (see the sketch at the end of this section).
- Portkey will batch requests directly with Portkey's gateway.
- Please note the following:
  - Currently, the batch size is set to 25 by default.
  - The batch interval / reset is set to 5 seconds (the next batch runs after 5 seconds).
  - Requests are retried 3 times by default (retries can be extended using the `x-portkey-config` header with a retry configuration).
  - Support for custom batch size, batch interval, and retry count is planned for the future.
- Portkey will also automatically check the batch progress and perform post-batch analysis, including token and cost calculations.
`immediate` batching doesn't support returning the batch output in the response, as all the requests are already processed by Portkey's gateway.
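A similar sketch for gateway batching, under the same illustrative assumptions about the base URL, headers, and IDs; no output file is expected back in the response for `immediate` batching.

```python
# Hedged sketch: gateway batching (completion_window = "immediate") with a Portkey file.
# Works even for providers without a native batch API; ids and headers are illustrative.
import requests

PORTKEY_BASE_URL = "https://api.portkey.ai/v1"        # assumed gateway base URL
headers = {
    "x-portkey-api-key": "PORTKEY_API_KEY",
    "Content-Type": "application/json",
}

resp = requests.post(
    f"{PORTKEY_BASE_URL}/batches",
    headers=headers,
    json={
        "input_file_id": "portkey-file-id",           # Portkey file id (hypothetical)
        "endpoint": "/v1/chat/completions",
        "completion_window": "immediate",             # process via Portkey's gateway
        "portkey_options": {"x-portkey-virtual-key": "provider-virtual_key"},
    },
)

# Output is not returned in the response for immediate batching; track progress,
# token usage, and cost from the batch status instead.
print(resp.json())
```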