Guardrails run at the Gateway layer: provider-agnostic, between your app and any upstream LLM.

Supported Endpoints

| Endpoint | Input | Output |
| --- | --- | --- |
| `/v1/chat/completions` | ✓ | ✓ |
| `/v1/completions` | ✓ | ✓ |
| `/v1/embeddings` | ✓ | N/A |
| `/v1/messages` (Anthropic) | ✓ | ✓ |
| `/v1/responses` | ✓ | ✓ |
| `/v1/prompts/{promptId}/completions` | ✓ | ✓ |
Embeddings have no model-generated output to evaluate, so only `input_guardrails` apply. See Guardrails for Embeddings.
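An input-only guardrail for embeddings can be sketched as a Portkey config passed via the `x-portkey-config` header. The guardrail ID below is hypothetical, and the exact config shape should be checked against the config your Portkey dashboard generates; this is an illustrative assumption, not a captured config.

```python
import json

# Hypothetical guardrail ID; replace with one from your Portkey dashboard.
GUARDRAIL_ID = "pg-my-guardrail-id"

# For /v1/embeddings only input_guardrails apply: there is no
# model-generated text for output checks to evaluate, so an
# "output_guardrails" key would have nothing to run against.
config = {
    "input_guardrails": [GUARDRAIL_ID],
}

# The config travels to the gateway as the x-portkey-config header.
headers = {
    "x-portkey-api-key": "PORTKEY_API_KEY",  # placeholder credential
    "x-portkey-config": json.dumps(config),
}
print(headers["x-portkey-config"])
```

The same header-based config applies unchanged to the other supported endpoints; only embeddings drop the output side.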

Endpoints That Don't Support Guardrails

Guardrails are only evaluated on LLM inference endpoints where Portkey can extract a text input and/or text output. The following Inference API endpoints do not run guardrails:
  • Assistants API (all endpoints): /v1/assistants/*, /v1/threads/*
  • Audio (all endpoints): /v1/audio/*
  • Images (all endpoints): /v1/images/*
  • Files (all endpoints): /v1/files/*
  • Batch (all endpoints): /v1/batches/*
  • Fine-tuning (all endpoints): /v1/fine_tuning/*
  • Moderations: /v1/moderations
  • Models: /v1/models
  • Prompt rendering (no model call): /v1/prompts/{promptId}/render
  • Response retrieval (no model call): GET /v1/responses/* (guardrails apply only to POST /v1/responses)

Sync vs. Async

| Mode | `async` | Behavior | Latency |
| --- | --- | --- | --- |
| Async (default) | `true` | Runs in parallel with the LLM call. Results are logged only; no effect on the response. | None |
| Sync | `false` | Runs before forwarding the request (input) or before returning the response (output). Can deny or modify. | Adds guardrail check latency |
Use async to observe and log without blocking. Use sync when the guardrail result must gate the request.
With `async=false`, Portkey returns 246 (failed, but allowed) or 446 (failed, denied) instead of the standard provider status codes. See Guardrail behavior on the gateway.
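Client code in sync mode needs to branch on these non-standard codes. A minimal sketch of that mapping, with the helper name `interpret_guardrail_status` being my own:

```python
def interpret_guardrail_status(status_code: int) -> str:
    """Map a gateway response status (sync mode, async=false) to an action."""
    if status_code == 246:
        # Guardrail check failed but deny was not set: the response is
        # still returned, and the failure is recorded in the logs.
        return "failed-but-allowed"
    if status_code == 446:
        # Guardrail check failed with deny=true: the request or response
        # was blocked and no model output is returned.
        return "failed-denied"
    if 200 <= status_code < 300:
        # All checks passed; normal provider response.
        return "passed"
    # Anything else is an ordinary provider/gateway error, unrelated
    # to guardrails.
    return "provider-error"

print(interpret_guardrail_status(246))
```

A 246 response still carries a usable completion, so treating it as a hard failure would discard valid output; only 446 means nothing came back.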

Streaming

| Guardrail Type | Streaming |
| --- | --- |
| `input_guardrails` | Supported; evaluated before the stream starts |
| `output_guardrails` | Not supported; requires the full response |
With input guardrails on a streaming request: if the check passes, the stream proceeds normally. If it fails with `deny=true`, the stream is blocked with 446 before any tokens are sent. To receive `hook_results` inside streaming chunks, set `x-portkey-strict-open-ai-compliance: false`.
`hook_results` appear in the first chunk for `/chat/completions`, or as a separate `hook_results` event for `/messages`.
For Anthropic `/messages` streaming, `hook_results` aren't accessible via the Anthropic SDK. Use cURL or raw HTTP requests.
See Streaming Hook Results for SDK examples.
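Pulling the results out of a `/chat/completions` stream can be sketched as below. The nested shape of `hook_results` shown in the sample chunk is an assumption for illustration, not a captured payload; inspect a real first chunk from your gateway before relying on specific keys.

```python
def extract_hook_results(chunk: dict):
    """Return the hook_results object from a streaming chunk, or None.

    With x-portkey-strict-open-ai-compliance: false, Portkey places
    hook_results in the first /chat/completions chunk; later chunks
    carry only model tokens, so this returns None for them.
    """
    return chunk.get("hook_results")

# Illustrative first-chunk shape (assumed, not a recorded response):
first_chunk = {
    "id": "chatcmpl-123",
    "choices": [{"delta": {"content": "Hello"}}],
    "hook_results": {"before_request_hooks": [{"verdict": True}]},
}
print(extract_hook_results(first_chunk))
```

Subsequent chunks simply yield `None`, so the check is cheap enough to run on every chunk rather than special-casing the first one.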

What Gets Evaluated

Text

Input guardrails evaluate the last message in the request. Output guardrails evaluate the generated text in the response.

Images

Guardrails are text-only. Image inputs (base64 or URLs in multimodal messages) are not evaluated. Only the text portions of a message are passed to checks.
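The text-only rule can be made concrete with a small extractor. The function name is my own, and it assumes OpenAI-style multimodal content arrays (`{"type": "text", ...}` / `{"type": "image_url", ...}` parts):

```python
def guardrail_text_parts(message: dict) -> list:
    """Collect only the text segments of a (possibly multimodal) message.

    Mirrors the rule above: image parts (base64 data URIs or URLs)
    are skipped, and only text content reaches guardrail checks.
    """
    content = message.get("content")
    if isinstance(content, str):
        # Plain string content is evaluated as-is.
        return [content]
    return [
        part["text"]
        for part in content or []
        if part.get("type") == "text"
    ]

msg = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
    ],
}
print(guardrail_text_parts(msg))
```

A consequence worth noting: a policy violation embedded inside an image is invisible to guardrails, since the image part never reaches the check.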

Tool Calls

| Scenario | Evaluated by |
| --- | --- |
| Model deciding to call a tool | Output guardrails |
| Tool result sent back to the model | Input guardrails (on the follow-up request) |
| Function definitions in the system prompt | Input guardrails |
Tool call arguments (JSON) are treated as text: checks like Regex Match, Contains, and LLM-based detectors (e.g., Prompt Injection) all process this content.
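Because the arguments arrive as a raw JSON string, a pattern-based check sees exactly that string. A sketch of what a Regex Match-style check could flag, with the pattern being a hypothetical example rather than a built-in rule:

```python
import json
import re

# A model's tool call arguments are serialized JSON; guardrail text
# checks operate on this raw string, not on a parsed structure.
tool_call_args = json.dumps(
    {"query": "SELECT * FROM users; DROP TABLE users;"}
)

# Hypothetical pattern a Regex Match guardrail might be configured
# with to flag destructive SQL inside tool arguments.
pattern = re.compile(r"(?i)\bdrop\s+table\b")
flagged = pattern.search(tool_call_args) is not None
print(flagged)
```

The same applies to the tool result on the follow-up request: it is just more text, so the identical check can run on it as an input guardrail.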

Quick Reference

| Capability | Supported |
| --- | --- |
| `/chat/completions` | ✓ |
| `/completions` | ✓ |
| `/embeddings` | ✓ (input only) |
| `/messages` (Anthropic) | ✓ |
| `/responses` | ✓ |
| `/prompts/{promptId}/completions` | ✓ |
| All providers & models | ✓ |
| Input guardrails | ✓ |
| Output guardrails | ✓ |
| Async execution | ✓ (default) |
| Sync execution | ✓ |
| Streaming (input) | ✓ |
| Streaming (output) | ✗ |
| Image inputs | ✗ |
| Text in multimodal messages | ✓ |
| Tool call text | ✓ |

Last modified on March 7, 2026