List Runs
Path Parameters
The ID of the thread the run belongs to.
Query Parameters
A limit on the number of objects to be returned. Limit can range between 1 and 100, and the default is 20.
Sort order by the created_at timestamp of the objects: asc for ascending order and desc for descending order.
Available options: asc, desc
A cursor for use in pagination. after is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, ending with obj_foo, your subsequent call can include after=obj_foo in order to fetch the next page of the list.
A cursor for use in pagination. before is an object ID that defines your place in the list. For instance, if you make a list request and receive 100 objects, starting with obj_foo, your subsequent call can include before=obj_foo in order to fetch the previous page of the list.
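The after cursor can be driven by a small loop. The following is a minimal sketch, not part of any SDK: fetch_page is a hypothetical callable standing in for whatever HTTP client issues the list request, and the response is assumed to carry its objects in a data list whose items each have an id. The loop stops when a page comes back with fewer objects than the requested limit.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Assumed response shape: {"data": [{"id": "..."}, ...]} (not confirmed by this page).
Page = Dict[str, List[dict]]


def iter_objects(
    fetch_page: Callable[[Optional[str], int], Page],
    limit: int = 20,
) -> Iterator[dict]:
    """Yield every object by following the `after` cursor page by page."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    after: Optional[str] = None
    while True:
        items = fetch_page(after, limit).get("data", [])
        for item in items:
            yield item
        # A short (or empty) page means there is nothing left to fetch.
        if len(items) < limit:
            return
        # The last object ID on this page becomes the next `after` cursor.
        after = items[-1]["id"]


def fake_fetch_page(after: Optional[str], limit: int) -> Page:
    """Hypothetical in-memory stand-in for the real list request."""
    ids = [f"run_{i}" for i in range(45)]
    start = 0 if after is None else ids.index(after) + 1
    return {"data": [{"id": run_id} for run_id in ids[start:start + limit]]}


all_runs = list(iter_objects(fake_fetch_page, limit=20))
```

With the in-memory stand-in above, the loop requests pages of 20, 20, and 5 objects and then stops. In real use, fetch_page would wrap the actual list request and pass after and limit as query parameters.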
Response
The Unix timestamp (in seconds) for when the run was cancelled.
The Unix timestamp (in seconds) for when the run was completed.
The Unix timestamp (in seconds) for when the run was created.
The Unix timestamp (in seconds) for when the run will expire.
The Unix timestamp (in seconds) for when the run failed.
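Because these fields are Unix timestamps in seconds, they usually need converting before display or comparison. A minimal sketch using only the Python standard library follows; it assumes a timestamp may be null when the corresponding event has not happened, which this page does not state explicitly.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(ts: Optional[int]) -> Optional[datetime]:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    if ts is None:
        # Assumption: a missing event is reported as null.
        return None
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def elapsed_seconds(start: Optional[int], end: Optional[int]) -> Optional[int]:
    """Return end - start in seconds, or None if either timestamp is missing."""
    if start is None or end is None:
        return None
    return end - start
```

Keeping the result timezone-aware avoids silently interpreting the value in the local timezone of the machine running the code.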
The identifier, which can be referenced in API endpoints.
Details on why the run is incomplete. Will be null if the run is not incomplete.
The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run.
Available options: max_completion_tokens, max_prompt_tokens
The last error associated with this run. Will be null if there are no errors.
The maximum number of completion tokens that may be used over the course of the run, as specified for the run.
x > 256
The maximum number of prompt tokens that may be used over the course of the run, as specified for the run.
x > 256
Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.
The object type, which is always thread.run.
Available options: thread.run
Whether to enable parallel function calling during tool use.
Details on the action required to continue the run. Will be null if no action is required.
Details on the tool outputs needed for this run to continue.
A list of the relevant tool calls.
The function definition.
The ID of the tool call. This ID must be referenced when you submit the tool outputs using the Submit tool outputs to run endpoint.
The type of tool call the output is required for. For now, this is always function.
Available options: function
For now, this is always submit_tool_outputs.
Available options: submit_tool_outputs
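When a run is waiting on tool outputs, each output has to reference the ID of the tool call it answers. The sketch below shows one way to build that list from the required-action object. Every key name used here (type, submit_tool_outputs, tool_calls, id, function, name, arguments, tool_call_id, output) is an assumption based on the descriptions above, since this page does not list the field names, and handlers is a hypothetical mapping from function name to a local Python callable.

```python
import json
from typing import Any, Callable, Dict, List


def build_tool_outputs(
    required_action: Dict[str, Any],
    handlers: Dict[str, Callable[..., Any]],
) -> List[Dict[str, str]]:
    """Run each requested function locally and pair its result with the tool call ID."""
    if required_action.get("type") != "submit_tool_outputs":
        raise ValueError("unexpected required action type")
    outputs = []
    for call in required_action["submit_tool_outputs"]["tool_calls"]:
        if call["type"] != "function":
            raise ValueError(f"unsupported tool call type: {call['type']}")
        fn = call["function"]
        # Assumption: arguments arrive as a JSON-encoded string.
        args = json.loads(fn.get("arguments") or "{}")
        result = handlers[fn["name"]](**args)
        outputs.append({"tool_call_id": call["id"], "output": json.dumps(result)})
    return outputs


# Hypothetical example payload, shaped according to the assumptions above.
example_action = {
    "type": "submit_tool_outputs",
    "submit_tool_outputs": {
        "tool_calls": [
            {
                "id": "call_abc",
                "type": "function",
                "function": {"name": "add", "arguments": '{"a": 1, "b": 2}'},
            }
        ]
    },
}
example_outputs = build_tool_outputs(example_action, {"add": lambda a, b: a + b})
```

The resulting list is what would be sent to the Submit tool outputs to run endpoint; the request itself is left out because its exact shape is not described on this page.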
Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.
Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.
Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
Available options: none, auto
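The truncation caveat above suggests checking for a cut-off message before parsing JSON mode output. A minimal sketch follows; the helper name is hypothetical, and how content and finish_reason are obtained depends on the client in use.

```python
import json
from typing import Any, Optional

# The JSON mode setting quoted in the description above.
JSON_MODE = {"type": "json_object"}


def parse_json_mode_content(content: str, finish_reason: Optional[str] = None) -> Any:
    """Parse JSON mode output, refusing content that was cut off at a token limit."""
    if finish_reason == "length":
        # A truncated message is likely to be incomplete JSON, so fail early and clearly.
        raise ValueError("generation was cut off; content may be incomplete JSON")
    return json.loads(content)


parsed = parse_json_mode_content('{"city": "Paris", "temp_c": 21}', finish_reason="stop")
```

Failing early on finish_reason="length" gives a clearer error than the JSON decoding error that a half-written object would otherwise produce.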
The Unix timestamp (in seconds) for when the run was started.
The status of the run, which can be queued, in_progress, requires_action, cancelling, cancelled, failed, completed, incomplete, or expired.
Available options: queued, in_progress, requires_action, cancelling, cancelled, failed, completed, incomplete, expired
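Client code that polls a run typically needs to know whether a status is final. The sketch below groups the statuses listed above into terminal and non-terminal sets. The grouping is an interpretation rather than something this page states: only in_progress and queued are explicitly called non-terminal here, so treating requires_action and cancelling as non-terminal is an assumption.

```python
# Assumed grouping of the documented status values.
TERMINAL_STATUSES = frozenset({"cancelled", "failed", "completed", "incomplete", "expired"})
NON_TERMINAL_STATUSES = frozenset({"queued", "in_progress", "requires_action", "cancelling"})


def is_terminal(status: str) -> bool:
    """Return True if the run can no longer change state under the assumed grouping."""
    if status in TERMINAL_STATUSES:
        return True
    if status in NON_TERMINAL_STATUSES:
        return False
    # Reject unknown values instead of guessing, so new statuses surface quickly.
    raise ValueError(f"unknown run status: {status!r}")


def needs_tool_outputs(status: str) -> bool:
    """Return True if the run is paused waiting for the caller to act."""
    return status == "requires_action"
```

Raising on unknown values is a deliberate choice: if a status is added later, a polling loop built on this helper fails loudly instead of spinning forever.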
Controls which (if any) tool is called by the model.
none means the model will not call any tools and instead generates a message.
auto is the default value and means the model can pick between generating a message or calling one or more tools.
required means the model must call one or more tools before responding to the user.
Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
Available options: none, auto, required
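The two shapes this value can take, a plain string or an object naming a specific tool, can be checked before a request is sent. The helpers below are a hypothetical sketch that only encodes the forms shown above; the API may accept other tool types that this page does not list.

```python
from typing import Any, Dict, Union

ToolChoice = Union[str, Dict[str, Any]]
STRING_CHOICES = frozenset({"none", "auto", "required"})


def force_function(name: str) -> Dict[str, Any]:
    """Build the object form that forces a call to one named function."""
    return {"type": "function", "function": {"name": name}}


def validate_tool_choice(choice: ToolChoice) -> ToolChoice:
    """Return the choice unchanged if it matches a documented form, else raise."""
    if isinstance(choice, str):
        if choice not in STRING_CHOICES:
            raise ValueError(f"unknown tool choice: {choice!r}")
        return choice
    if not isinstance(choice, dict) or "type" not in choice:
        raise ValueError("object form must include a 'type' key")
    if choice["type"] == "function" and not choice.get("function", {}).get("name"):
        raise ValueError("function tool choice must name the function")
    return choice


forced = validate_tool_choice(force_function("my_function"))
```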
Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.
The truncation strategy to use for the thread. The default is auto. If set to last_messages, the thread will be truncated to the n most recent messages in the thread. When set to auto, messages in the middle of the thread will be dropped to fit the context length of the model, max_prompt_tokens.
Available options: auto, last_messages
The number of most recent messages from the thread to include when constructing the context for the run.
x > 1
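The effect of last_messages can be pictured as keeping only the tail of the thread. The sketch below illustrates that behavior locally on a plain list. It is an illustration of the description above, not the server's implementation, and it only rejects non-positive values rather than enforcing the exact lower bound given in this reference.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def keep_last_messages(messages: Sequence[T], n: int) -> List[T]:
    """Return the n most recent messages, assuming the sequence is ordered oldest first."""
    if n < 1:
        # A slice of [-0:] would return everything, so guard against it explicitly.
        raise ValueError("n must be a positive integer")
    return list(messages[-n:])


thread = ["m1", "m2", "m3", "m4", "m5"]
context = keep_last_messages(thread, 2)
```

If n is larger than the thread, the whole thread is kept unchanged.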
Usage statistics related to the run. This value will be null if the run is not in a terminal state (e.g. in_progress, queued, etc.).
The sampling temperature used for this run. If not set, defaults to 1.
The nucleus sampling value used for this run. If not set, defaults to 1.