Cancel Run

Cancels a run that is `in_progress`.

POST https://api.portkey.ai/v1/threads/{thread_id}/runs/{run_id}/cancel
Path parameters
thread_id*string

The ID of the thread to which this run belongs.

run_id*string

The ID of the run to cancel.
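As a minimal sketch, the request can be assembled in Python like this. The `x-portkey-api-key` header name and the placeholder IDs are assumptions; the request is built but not sent:

```python
from urllib.request import Request

def cancel_run_request(thread_id: str, run_id: str, api_key: str) -> Request:
    # Build (but do not send) the POST that cancels a run.
    url = f"https://api.portkey.ai/v1/threads/{thread_id}/runs/{run_id}/cancel"
    return Request(url, method="POST", headers={"x-portkey-api-key": api_key})

req = cancel_run_request("thread_abc123", "run_abc123", "PORTKEY_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the built request to `urllib.request.urlopen` (or swapping in your HTTP client of choice) would perform the actual call.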

Response

OK

Body
id*string

The identifier, which can be referenced in API endpoints.

object*enum

The object type, which is always thread.run.

thread.run
created_at*integer

The Unix timestamp (in seconds) for when the run was created.

thread_id*string

The ID of the thread that was executed on as a part of this run.

assistant_id*string

The ID of the assistant used for execution of this run.

status*enum

The status of the run, which can be either queued, in_progress, requires_action, cancelling, cancelled, failed, completed, incomplete, or expired.

queued | in_progress | requires_action | cancelling | cancelled | failed | completed | incomplete | expired
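Because cancellation is asynchronous (a run typically moves through cancelling before reaching cancelled), client code often needs to distinguish terminal from non-terminal statuses when polling. A sketch, with the grouping into terminal vs. non-terminal being an illustrative convention over the enum above:

```python
# Statuses from the run status enum above.
TERMINAL_STATUSES = {"cancelled", "failed", "completed", "incomplete", "expired"}
NON_TERMINAL_STATUSES = {"queued", "in_progress", "requires_action", "cancelling"}

def is_terminal(status: str) -> bool:
    # True once the run can no longer change state.
    return status in TERMINAL_STATUSES
```

A polling loop would retrieve the run, check `is_terminal(run["status"])`, and sleep between attempts until it returns True.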
required_action*nullable object

Details on the action required to continue the run. Will be null if no action is required.

last_error*nullable object

The last error associated with this run. Will be null if there are no errors.

expires_at*nullable integer

The Unix timestamp (in seconds) for when the run will expire.

started_at*nullable integer

The Unix timestamp (in seconds) for when the run was started.

cancelled_at*nullable integer

The Unix timestamp (in seconds) for when the run was cancelled.

failed_at*nullable integer

The Unix timestamp (in seconds) for when the run failed.

completed_at*nullable integer

The Unix timestamp (in seconds) for when the run was completed.

incomplete_details*nullable object

Details on why the run is incomplete. Will be null if the run is not incomplete.

model*string

The model that the assistant used for this run.

instructions*string

The instructions that the assistant used for this run.

tools*array of one of

The list of tools that the assistant used for this run.

metadata*nullable object

Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.

usage*nullable RunCompletionUsage (object)

Usage statistics related to the run. This value will be null while the run is in a non-terminal state (e.g. queued or in_progress).

temperature nullable number

The sampling temperature used for this run. If not set, defaults to 1.

top_p nullable number

The nucleus sampling value used for this run. If not set, defaults to 1.

max_prompt_tokens*nullable integer

The maximum number of prompt tokens specified to have been used over the course of the run.

max_completion_tokens*nullable integer

The maximum number of completion tokens specified to have been used over the course of the run.

truncation_strategy*Thread Truncation Controls

Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

tool_choice*AssistantsApiToolChoiceOption (one of)

Controls which (if any) tool is called by the model. none means the model will not call any tools and instead generates a message. auto is the default value and means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools before responding to the user. Specifying a particular tool like {"type": "file_search"} or {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool.
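The accepted shapes of tool_choice described above can be written out as follows; `my_function` is the placeholder name used in the description:

```python
# String options:
tool_choice = "auto"      # default: model picks between a message and tool calls
tool_choice = "none"      # model will not call any tools
tool_choice = "required"  # model must call one or more tools

# Forcing a specific tool takes an object instead of a string:
tool_choice = {"type": "file_search"}
tool_choice = {"type": "function", "function": {"name": "my_function"}}
```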

parallel_tool_calls*ParallelToolCalls (boolean)

Whether to enable parallel function calling during tool use.

response_format*AssistantsApiResponseFormatOption (one of)

Specifies the format that the model must output. Compatible with GPT-4o, GPT-4 Turbo, and all GPT-3.5 Turbo models since gpt-3.5-turbo-1106.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
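As a sketch of the JSON-mode caveat above, a run-creation payload would pair the response_format with an explicit JSON instruction; the instruction wording here is only an example:

```python
run_params = {
    "response_format": {"type": "json_object"},
    # JSON mode alone does not tell the model to emit JSON; the
    # instructions (or a user message) must ask for it explicitly,
    # otherwise the model may stream whitespace until the token limit.
    "instructions": "Respond only with a valid JSON object.",
}
```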

Response
{
  "id": "text",
  "object": "thread.run",
  "thread_id": "text",
  "assistant_id": "text",
  "status": "queued",
  "required_action": {
    "type": "submit_tool_outputs",
    "submit_tool_outputs": {
      "tool_calls": [
        {
          "id": "text",
          "type": "function",
          "function": {
            "name": "text",
            "arguments": "text"
          }
        }
      ]
    }
  },
  "last_error": {
    "code": "server_error",
    "message": "text"
  },
  "incomplete_details": {
    "reason": "max_completion_tokens"
  },
  "model": "text",
  "instructions": "text",
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "usage": {},
  "temperature": 0,
  "top_p": 0,
  "truncation_strategy": {
    "type": "auto"
  },
  "tool_choice": "none",
  "parallel_tool_calls": true,
  "response_format": "none"
}
