OpenAI - Portkey Docs

Provider Slug

openai

Latest Pricing | API Status | Supported Endpoints

OpenAI’s API offers powerful language, embedding, and multimodal models (gpt-4o, o1, whisper, dall-e, etc.). Portkey makes your OpenAI requests production-ready with observability, fallbacks, guardrails, and more. Portkey also lets you use OpenAI API’s other capabilities like

Integrate

Paste your OpenAI API key into Portkey to create your Virtual Key.

You can save your OpenAI personal or service account API keys to Portkey. You can also save your OpenAI Admin API keys, which lets you reach OpenAI Admin routes through the Portkey API.

Optional

Add your OpenAI organization and project ID details: (Docs)
Directly use OpenAI API key without the Virtual Key: (Docs)
Create a short-lived virtual key OR one with usage/rate limits: (Docs)

Note: OpenAI supports budget & rate limits at the project level. Portkey additionally lets you set granular budget & rate limits per key.

Sample Request

Portkey is a drop-in replacement for OpenAI. You can make requests using the official OpenAI or Portkey SDKs.

Popular libraries & agent frameworks like LangChain, CrewAI, AutoGen, etc. are also supported. All Azure OpenAI models & endpoints are supported as well.

Install the Portkey SDK with npm

npm install portkey-ai

import Portkey from 'portkey-ai';

const client = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'PROVIDER_VIRTUAL_KEY'
});

async function main() {
  const response = await client.chat.completions.create({
    messages: [{ role: "user", content: "Bob the builder.." }],
    model: "gpt-4o",
  });

  console.log(response.choices[0].message.content);
}

main();

Install the Portkey SDK with pip

pip install portkey-ai

from portkey_ai import Portkey

client = Portkey(
  api_key = "PORTKEY_API_KEY",
  virtual_key = "PROVIDER_VIRTUAL_KEY"
)

response = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

print(response.choices[0].message)

curl https://api.portkey.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -H "x-portkey-virtual-key: $PORTKEY_PROVIDER_VIRTUAL_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Hello!" }
    ]
  }'
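
Where neither SDK is installed, the same cURL call can be assembled with Python's standard library. The sketch below only builds the request and does not send it. The URL and header names are taken from the cURL example above; the helper names `portkey_headers` and `build_chat_request` are illustrative and not part of any Portkey SDK.

```python
import json
import urllib.request

PORTKEY_CHAT_URL = "https://api.portkey.ai/v1/chat/completions"


def portkey_headers(api_key: str, virtual_key: str) -> dict:
    """Return the headers used in the cURL example above."""
    return {
        "Content-Type": "application/json",
        "x-portkey-api-key": api_key,
        "x-portkey-virtual-key": virtual_key,
    }


def build_chat_request(api_key, virtual_key, model, messages):
    """Build (but do not send) a POST request for the chat completions route."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        PORTKEY_CHAT_URL,
        data=body,
        headers=portkey_headers(api_key, virtual_key),
        method="POST",
    )


request = build_chat_request(
    "PORTKEY_API_KEY",
    "PROVIDER_VIRTUAL_KEY",
    "gpt-4o",
    [{"role": "user", "content": "Hello!"}],
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.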

Install the OpenAI & Portkey SDKs with pip

pip install openai portkey-ai

from openai import OpenAI
from portkey_ai import createHeaders, PORTKEY_GATEWAY_URL

client = OpenAI(
    api_key="xx",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        api_key="PORTKEY_API_KEY",
        virtual_key="OPENAI_VIRTUAL_KEY"
    )
)

completion = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)

Install the OpenAI & Portkey SDKs with npm

npm install openai portkey-ai

import OpenAI from 'openai';
import { PORTKEY_GATEWAY_URL, createHeaders } from 'portkey-ai'

const openai = new OpenAI({
  apiKey: 'xx',
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "OPENAI_VIRTUAL_KEY"
  })
});

async function main() {
  const completion = await openai.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'gpt-4o',
  });

  console.log(completion.choices);
}

main();

…

Viewing the Log

Portkey will log your request and give you useful data such as timestamp, request type, LLM used, tokens generated, and cost. For multimodal models, Portkey will also show the image sent with vision/image models, as well as the image generated.

Local Setup

If you do not want to use Portkey’s hosted API, you can also run Portkey locally:

Portkey runs on our popular open source Gateway. You can spin it up locally to make requests without sending them to the Portkey API.

npx @portkey-ai/gateway

Your Gateway is running on http://localhost:8080/v1 🚀

Then, just change the baseURL to the local Gateway URL, and make requests:

import Portkey from 'portkey-ai';

const client = new Portkey({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'PROVIDER_VIRTUAL_KEY'
});

async function main() {
  const response = await client.chat.completions.create({
    messages: [{ role: "user", content: "Bob the builder.." }],
    model: "gpt-4o",
  });

  console.log(response.choices[0].message.content);
}

main();

On-Prem Deployment (AWS, GCP, Azure)

Portkey’s data & control planes can be fully deployed on-prem with the Enterprise license.

More details here →

Support for OpenAI Capabilities

Portkey works with all of OpenAI’s endpoints and supports all OpenAI capabilities like prompt caching, structured outputs, and more.

OpenAI Tool Calling

Enables models to interact with external tools by declaring functions that the model can invoke based on conversation context.

OpenAI Structured Outputs

Returns model responses that conform to a JSON schema you supply, for consistent, parseable application integration.

OpenAI Vision

Analyzes images alongside text, enabling visual understanding and question-answering through URL or base64 inputs.

OpenAI Embeddings

Transforms text into numerical vectors for semantic search, clustering, and recommendations.

OpenAI Prompt Caching

Automatically reuses recently processed prompt prefixes across API requests to reduce latency and costs, with no setup required.

OpenAI Image Generation

Creates and modifies images using DALL·E models, with DALL·E 3 for generation and DALL·E 2 for editing.

OpenAI STT

Converts audio to text using the Whisper model, supporting multiple languages and formats.

OpenAI TTS

Transforms text into natural speech using six voices, with streaming support and multiple audio formats.

OpenAI Realtime API

Powers low-latency, multi-modal conversations through WebRTC and WebSocket connections.

OpenAI Moderations

Screens text content for harmful or inappropriate material.

OpenAI Reasoning

Provides step-by-step problem-solving through structured logical analysis.

OpenAI Predicted Outputs

Speeds up responses when much of the output is known ahead of time, such as when regenerating a file with small edits.

OpenAI Fine-tuning

Customizes models on specific datasets for improved domain performance.

OpenAI Assistants

Offers managed, stateful AI agents with tool use and conversation memory.

OpenAI Batch Inference API

Processes large volumes of requests efficiently in batch mode.

Find examples for each below:

OpenAI Tool Calling

The tool calling feature lets models trigger external tools based on conversation context. You define the available functions, the model chooses when to use them, and your application executes them and returns the results.

Portkey supports OpenAI Tool Calling and makes it interoperable across multiple providers. With Portkey Prompts, you can templatize your prompts & tool schemas as well.

Get Weather Tool

let tools = [{
    type: "function",
    function: {
        name: "getWeather",
        description: "Get the current weather",
        parameters: {
            type: "object",
            properties: {
                location: { type: "string", description: "City and state" },
                unit: { type: "string", enum: ["celsius", "fahrenheit"] }
            },
            required: ["location"]
        }
    }
}];

let response = await portkey.chat.completions.create({
    model: "gpt-4o",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "What's the weather like in Delhi - respond in JSON" }
    ],
    tools,
    tool_choice: "auto",
});

console.log(response.choices[0].finish_reason);

Get Weather Tool

tools = [{
    "type": "function",
    "function": {
        "name": "getWeather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]

response = portkey.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the weather like in Delhi - respond in JSON"}
    ],
    tools=tools,
    tool_choice="auto"
)

print(response.choices[0].finish_reason)

Get Weather Tool

curl -X POST "https://api.portkey.ai/v1/chat/completions" \
     -H "Content-Type: application/json" \
     -H "x-portkey-api-key: $PORTKEY_API_KEY" \
     -H "x-portkey-virtual-key: $PORTKEY_PROVIDER_VIRTUAL_KEY" \
     -d '{
       "model": "gpt-4o",
       "messages": [
         {"role": "system", "content": "You are a helpful assistant."},
         {"role": "user", "content": "What'\''s the weather like in Delhi - respond in JSON"}
       ],
       "tools": [{
         "type": "function",
         "function": {
           "name": "getWeather",
           "description": "Get the current weather",
           "parameters": {
             "type": "object",
             "properties": {
               "location": {"type": "string", "description": "City and state"},
               "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
             },
             "required": ["location"]
           }
         }
       }],
       "tool_choice": "auto"
     }'
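
The examples above stop once the model has asked for a tool. The remaining step, executing the tool in your application and returning the result, can be sketched as follows. This is a minimal illustration, not Portkey's implementation: `get_weather`, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names, and `sample_message` is a hand-written stand-in shaped like an OpenAI assistant message with `tool_calls`.

```python
import json


def get_weather(location: str, unit: str = "celsius") -> dict:
    """Hypothetical local implementation; a real one would call a weather API."""
    return {"location": location, "unit": unit, "forecast": "unknown"}


# Maps the tool name declared in the schema to the local function.
TOOL_REGISTRY = {"getWeather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and return the `tool` messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = TOOL_REGISTRY[call["function"]["name"]]
        # The model returns the arguments as a JSON-encoded string.
        arguments = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**arguments)),
        })
    return results


# Stand-in for response.choices[0].message when finish_reason is "tool_calls".
sample_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "getWeather", "arguments": "{\"location\": \"Delhi\"}"},
    }],
}

tool_messages = run_tool_calls(sample_message)
```

Append the assistant message and the returned `tool` messages to the conversation, then call the chat completions endpoint again so the model can produce its final answer.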

Tracing the Request

On Portkey you can easily trace the whole tool call - from defining tool schemas to getting the final LLM output:

OpenAI Structured Outputs

Use structured outputs for more consistent and parseable responses:

Structured Outputs Guide

Discover how to use structured outputs with OpenAI models in Portkey.
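
As a rough sketch of what a structured-output request involves, the snippet below builds a `response_format` value of type `json_schema` and parses a reply against it. The schema fields (`city`, `temperature_c`), the schema name `weather_report`, and the helper `parse_report` are illustrative assumptions; the sample reply is hand-written rather than real model output.

```python
import json

# Schema the model's reply must satisfy; the field names are illustrative.
weather_schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
    "required": ["city", "temperature_c"],
    "additionalProperties": False,
}

# Value for the `response_format` parameter of a chat completion request.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "weather_report", "strict": True, "schema": weather_schema},
}


def parse_report(content: str) -> dict:
    """Parse the message content and check that the required keys are present."""
    report = json.loads(content)
    missing = [key for key in weather_schema["required"] if key not in report]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return report


# Hand-written stand-in for response.choices[0].message.content.
report = parse_report('{"city": "Delhi", "temperature_c": 31.5}')
```

Pass `response_format` alongside `model` and `messages` when creating the chat completion; the check in `parse_report` is a defensive extra for code paths that may receive replies produced without the schema.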

OpenAI Vision

OpenAI’s vision models can analyze images alongside text, enabling visual question-answering capabilities. Images can be provided via URLs or base64 encoding in user messages.

response = portkey.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"},
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response)

Tracing Vision Requests

You can see the image(s) sent on your Portkey log:

Uploading Base64 encoded images

If you have an image or set of images locally, you can pass them to the model in base64-encoded format. Check out this example from OpenAI on how to do this.
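
A minimal sketch of the base64 approach, using only the standard library: the image bytes are encoded into a `data:` URL and placed in the `image_url` part of the user message. The helper names `image_to_data_url` and `vision_message` are illustrative, and the bytes below are a placeholder rather than a real image.

```python
import base64


def image_to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def vision_message(question: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build a user message that pairs a text question with an inline image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                "image_url": {"url": image_to_data_url(image_bytes, mime_type)},
            },
        ],
    }


# In practice, read the bytes from disk: open("/path/to/image.jpg", "rb").read()
placeholder_bytes = b"\xff\xd8\xff\xe0 placeholder, not a real image"
message = vision_message("What's in this image?", placeholder_bytes)
```

The resulting `message` can be passed in the `messages` list of a chat completion request in place of the URL-based message shown earlier.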

Vision Model Limitation | Vision FAQs

OpenAI Embeddings

OpenAI’s embedding models (like text-embedding-3-small) transform text inputs into lists of floating point numbers - smaller distances between vectors indicate higher text similarity. They power use cases like semantic search, content clustering, recommendations, and anomaly detection.

Simply send text to the embeddings API endpoint to generate these vectors for your applications.

response = portkey.embeddings.create(
    input="Your text string goes here",
    model="text-embedding-3-small"
)

print(response.data[0].embedding)
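
Once you have the vectors, comparing them is plain arithmetic. The sketch below ranks documents by cosine similarity, where a higher score means the texts are more alike. The three-dimensional vectors are toy values standing in for real embeddings, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return document names ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in documents.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy 3-dimensional vectors standing in for response.data[i].embedding.
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.7, 0.7, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
ranking = rank_by_similarity([0.9, 0.1, 0.0], documents)
```

For large collections a vector database is usually a better fit than a linear scan like this one.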

Embedding FAQs

OpenAI Prompt Caching

Prompt caching automatically reuses recently processed prompt prefixes from earlier API requests, reducing latency by up to 80% and costs by 50%. This feature works by default for all OpenAI API calls, requires no setup, and has no additional fees.

Portkey accurately logs the usage statistics and costs for your cached requests.
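
To check cache usage in your own code, you can read the cached-token count from a response's usage block. The sketch below assumes the usage object carries OpenAI's `prompt_tokens_details.cached_tokens` field; `cached_fraction` is an illustrative helper and the numbers in `sample_usage` are made up.

```python
def cached_fraction(usage: dict) -> float:
    """Share of prompt tokens that were served from the prompt cache."""
    prompt_tokens = usage.get("prompt_tokens", 0)
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    return cached / prompt_tokens if prompt_tokens else 0.0


# Hand-written usage block in the shape of response.usage; the numbers are made up.
sample_usage = {
    "prompt_tokens": 2048,
    "completion_tokens": 120,
    "total_tokens": 2168,
    "prompt_tokens_details": {"cached_tokens": 1024},
}

fraction = cached_fraction(sample_usage)
```

A fraction that stays at zero across repeated requests usually means the shared part of the prompt is not at the very start of the messages.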

Prompt Caching Guide

Read more about OpenAI Prompt Caching here.

Prompt Caching Limitations | Prompt Caching FAQs

OpenAI Image Generations (DALL-E)

OpenAI’s Images API enables AI-powered image generation, manipulation, and variation creation for creative and commercial applications. Whether you’re building image generation features, editing tools, or creative applications, the API provides powerful visual AI capabilities through DALL·E models.

The API offers three core capabilities:

Generate new images from text prompts (DALL·E 3, DALL·E 2)
Edit existing images with text-guided replacements (DALL·E 2)
Create variations of existing images (DALL·E 2)

import Portkey from 'portkey-ai';

const client = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  virtualKey: 'PROVIDER_VIRTUAL_KEY'
});

async function main() {
  const image = await client.images.generate({
    model: "dall-e-3",
    prompt: "A cute baby sea otter"
  });

  console.log(image.data);
}
main();

Tracing Image Generation Requests

Portkey logs the generated image along with your whole request:

Image Generations Limitations | Image Generations FAQs

OpenAI Transcription & Translation (Whisper)

OpenAI’s Audio API converts speech to text using the Whisper model. It offers transcription in the original language and translation to English, supporting multiple file formats and languages with high accuracy.

# Transcription
with open("/path/to/file.mp3", "rb") as audio_file:
    transcription = portkey.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file
    )
print(transcription.text)

# Translation (reopen the file so it is read from the start)
with open("/path/to/file.mp3", "rb") as audio_file:
    translation = portkey.audio.translations.create(
        model="whisper-1",
        file=audio_file
    )
print(translation.text)

Speech-to-Text Limitations | Speech-to-text FAQs

OpenAI Text to Speech

OpenAI’s Text to Speech (TTS) API converts written text into natural-sounding audio using six distinct voices. It supports multiple languages, streaming capabilities, and various audio formats for different use cases.

from pathlib import Path

speech_file_path = Path(__file__).parent / "speech.mp3"
response = portkey.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="Today is a wonderful day to build something people love!"
)

with open(speech_file_path, "wb") as f:
    f.write(response.content)

Text-to-Speech Limitations | Text-to-Speech FAQs

OpenAI Realtime API

OpenAI’s Realtime API enables dynamic, low-latency conversations combining text, voice, and function calling capabilities. Built on GPT-4o models optimized for realtime interactions, it supports both WebRTC for client-side applications and WebSockets for server-side implementations.

Portkey enhances OpenAI’s Realtime API with production-ready features:

Complete request/response logging for realtime streams
Cost tracking and budget management for streaming sessions
Multi-modal conversation monitoring
Session-based analytics and debugging

The API bridges the gap between traditional request-response patterns and interactive, real-time AI experiences, with Portkey adding the reliability and observability needed for production deployments. Developers can access this functionality through two model variants:

gpt-4o-realtime for full capabilities
gpt-4o-mini-realtime for lighter applications

Realtime API Guide

More Capabilities

Streaming

Predicted Outputs

Fine-Tuning

Batch Inference

Assistants

Moderations

Reasoning

Portkey Features

Track End-User IDs

Portkey allows you to track user IDs passed with the user parameter in OpenAI requests, enabling you to monitor user-level costs, requests, and more:

response = portkey.chat.completions.create(
  model="gpt-4o",
  messages=[{"role": "user", "content": "Say this is a test"}],
  user="user_123456"
)

When you include the user parameter in your requests, Portkey logs will display the associated user ID, as shown in the image below:

In addition to the user parameter, Portkey allows you to send arbitrary custom metadata with your requests. This powerful feature enables you to associate additional context or information with each request, which can be useful for analysis, debugging, or other custom use cases.

Learn More About Metadata

Explore how to use custom metadata to enhance your request tracking and analysis.

Setup Fallbacks & Loadbalancer

Here’s a simplified version of how to use Portkey’s Gateway Configuration:

Create a Gateway Configuration

You can create a Gateway configuration using the Portkey Config Dashboard or by writing a JSON configuration in your code. In this example, requests are routed based on the user’s subscription plan (paid or free).

config = {
  "strategy": {
    "mode": "conditional",
    "conditions": [
      {
        "query": { "metadata.user_plan": { "$eq": "paid" } },
        "then": "gpt4o"
      },
      {
        "query": { "metadata.user_plan": { "$eq": "free" } },
        "then": "gpt-3.5"
      }
    ],
    "default": "gpt4o"
  },
  "targets": [
    {
      "name": "gpt4o",
      "virtual_key": "xx"
    },
    {
      "name": "gpt-3.5",
      "virtual_key": "yy"
    }
  ]
}

Process Requests

When a user makes a request, it will pass through Portkey’s AI Gateway. Based on the configuration, the Gateway routes the request according to the user’s metadata.
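
To make the routing decision concrete, here is a local illustration of how a conditional config could be evaluated against request metadata. It is a sketch for reasoning about configs, not the Gateway's actual code: `pick_target` is a hypothetical helper, it only understands the `$eq` operator, and the `default` value in `example_config` is assumed to name one of the declared targets.

```python
def pick_target(config: dict, metadata: dict) -> str:
    """Return the target name that the first matching condition points to."""
    strategy = config["strategy"]
    for condition in strategy["conditions"]:
        matched = True
        for key, rule in condition["query"].items():
            # Keys look like "metadata.user_plan"; drop the "metadata." prefix.
            field = key.removeprefix("metadata.")
            if metadata.get(field) != rule["$eq"]:
                matched = False
        if matched:
            return condition["then"]
    return strategy["default"]


example_config = {
    "strategy": {
        "mode": "conditional",
        "conditions": [
            {"query": {"metadata.user_plan": {"$eq": "paid"}}, "then": "gpt4o"},
            {"query": {"metadata.user_plan": {"$eq": "free"}}, "then": "gpt-3.5"},
        ],
        "default": "gpt4o",  # assumption: must match a target name
    },
    "targets": [
        {"name": "gpt4o", "virtual_key": "xx"},
        {"name": "gpt-3.5", "virtual_key": "yy"},
    ],
}

paid_target = pick_target(example_config, {"user_plan": "paid"})
free_target = pick_target(example_config, {"user_plan": "free"})
other_target = pick_target(example_config, {"user_plan": "trial"})
```

Running a config through a check like this before deploying it can catch conditions or defaults that point at a target name that does not exist.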

Set Up the Portkey Client

Pass the Gateway configuration to your Portkey client. You can either use the config object or the Config ID from Portkey’s hosted version.

from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    virtual_key="VIRTUAL_KEY",
    config=config
)

That’s it! Portkey seamlessly allows you to make your AI app more robust using built-in gateway features. Learn more about advanced gateway features:

Load Balancing

Distribute requests across multiple targets based on defined weights.

Fallbacks

Automatically switch to backup targets if the primary target fails.

Conditional Routing

Route requests to different targets based on specified conditions.

Caching

Enable caching of responses to improve performance and reduce costs.
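
As a sketch of what fallback and load-balancing configs might look like, the snippet below follows the same `strategy` and `targets` layout as the conditional example. The mode names `fallback` and `loadbalance`, the `weight` field, and the placeholder virtual keys are assumptions here, so check the Gateway config reference before relying on them. `simulate_split` is a local illustration of a weighted split, not the Gateway's own logic.

```python
import random

# Try the first target; move to the next one if it fails.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "openai-primary-xxx"},
        {"virtual_key": "azure-backup-xxx"},
    ],
}

# Split traffic across targets in proportion to their weights.
loadbalance_config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-key-a-xxx", "weight": 0.75},
        {"virtual_key": "openai-key-b-xxx", "weight": 0.25},
    ],
}


def simulate_split(config, requests=10_000, seed=0):
    """Locally illustrate a weighted split over the configured targets."""
    rng = random.Random(seed)
    keys = [target["virtual_key"] for target in config["targets"]]
    weights = [target["weight"] for target in config["targets"]]
    picks = rng.choices(keys, weights=weights, k=requests)
    return {key: picks.count(key) / requests for key in keys}


split = simulate_split(loadbalance_config)
```

Either config can be passed as the `config` argument when creating the Portkey client, in the same way as the conditional routing config.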

Setup Guardrails

Portkey’s AI gateway enables you to enforce input/output checks on requests by applying custom hooks before and after processing. Protect your user’s/company’s data by using PII guardrails and many more available on Portkey Guardrails:

{
	"virtual_key":"openai-xxx",
	"before_request_hooks": [{
		"id": "input-guardrail-id-xx"
	}],
	"after_request_hooks": [{
		"id": "output-guardrail-id-xx"
	}]
}

Learn More About Guardrails

Explore Portkey’s guardrail features to enhance the security and reliability of your AI applications.

Cache Requests

Send Custom Metadata

Setup Rate Limits

Create & Deploy Prompt Templates

Popular Libraries

Portkey’s native integrations also make your OpenAI usage through popular libraries production-ready and reliable.

OpenAI with Langchain

OpenAI with LangGraph

OpenAI with LibreChat

OpenAI with CrewAI

OpenAI with Llamaindex

OpenAI with Vercel

More Libraries

Other popular projects

Other agent frameworks

Cookbooks

Setup a fallback from OpenAI to Azure OpenAI

A/B test your prompts

Appendix

OpenAI Projects & Organizations

Managing OpenAI Orgs on Portkey

Organization management is particularly useful if you belong to multiple organizations or are accessing projects through a legacy OpenAI user API key. Specifying the organization and project IDs also helps you maintain better control over your access rules, usage, and costs.

In Portkey, you can add your OpenAI Org & Project details by Using Virtual Keys, Using Configs, or While Making a Request.

Using Virtual Keys

Using Configs

You can also specify the organization and project details in your request config, either at the root level or within a specific target.

{
	"provider": "openai",
	"api_key": "OPENAI_API_KEY",
	"openai_organization": "org-xxxxxx",
	"openai_project": "proj_xxxxxxxx"
}

While Making a Request

Pass OpenAI organization and project details directly when making a request:

from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

client = OpenAI(
    api_key="OPENAI_API_KEY",
    organization="org-xxxxxxxxxx",
    project="proj_xxxxxxxxx",
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="openai",
        api_key="PORTKEY_API_KEY"
    )
)

chat_complete = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say this is a test"}],
)

print(chat_complete.choices[0].message.content)

Supported Parameters

List of supported & unsupported parameters from OpenAI

Method / Endpoint	Supported Parameters
`completions`	model, prompt, max_tokens, temperature, top_p, n, stream, logprobs, echo, stop, presence_penalty, frequency_penalty, best_of, logit_bias, user, seed, suffix
`embeddings`	model, input, encoding_format, dimensions, user
`chat.completions`	model, messages, functions, function_call, max_tokens, temperature, top_p, n, stream, stop, presence_penalty, frequency_penalty, logit_bias, user, seed, tools, tool_choice, response_format, logprobs, top_logprobs, stream_options, service_tier, parallel_tool_calls, max_completion_tokens
`image.generations`	prompt, model, n, quality, response_format, size, style, user
`create.speech`	model, input, voice, response_format, speed
`create.transcription`	All parameters supported
`create.translation`	All parameters supported
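
One way to use this table in code is to flag parameters that are not listed for an endpoint before sending a request. The sketch below copies two rows from the table; `SUPPORTED_PARAMS` and `unsupported_params` are illustrative names, not part of the Portkey SDK.

```python
# Two rows copied from the table above; extend with the others as needed.
SUPPORTED_PARAMS = {
    "embeddings": {"model", "input", "encoding_format", "dimensions", "user"},
    "create.speech": {"model", "input", "voice", "response_format", "speed"},
}


def unsupported_params(endpoint: str, params: dict) -> list:
    """Return, in sorted order, the parameter names the table does not list."""
    return sorted(set(params) - SUPPORTED_PARAMS[endpoint])


extra = unsupported_params(
    "embeddings",
    {"model": "text-embedding-3-small", "input": "hello", "temperature": 0.2},
)
```

Here `extra` contains `temperature`, which the table does not list for `embeddings`.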

Supported Models

List of OpenAI models supported by Portkey

Limitations

Portkey does not support the following OpenAI features:

Streaming for audio endpoints

Limitations for Vision Requests

Medical images: Vision models are not suitable for interpreting specialized medical images like CT scans and shouldn’t be used for medical advice.
Non-English: The models may not perform optimally when handling images with text of non-Latin alphabets, such as Japanese or Korean.
Small text: Enlarge text within the image to improve readability, but avoid cropping important details.
Rotation: The models may misinterpret rotated / upside-down text or images.
Visual elements: The models may struggle to understand graphs or text where colors or styles like solid, dashed, or dotted lines vary.
Spatial reasoning: The models struggle with tasks requiring precise spatial localization, such as identifying chess positions.
Accuracy: The models may generate incorrect descriptions or captions in certain scenarios.
Image shape: The models struggle with panoramic and fisheye images.
Metadata and resizing: The models do not process original file names or metadata, and images are resized before analysis, affecting their original dimensions.
Counting: The models may give approximate counts for objects in images.
CAPTCHAs: For safety reasons, CAPTCHA submissions are blocked by OpenAI.

Image Generations Limitations

DALL·E 3 Restrictions:
- Only supports image generation (no editing or variations)
- Limited to one image per request
- Fixed size options: 1024x1024, 1024x1792, or 1792x1024 pixels
- Automatic prompt enhancement cannot be disabled
Image Requirements:
- Must be PNG format
- Maximum file size: 4MB
- Must be square dimensions
- For edits/variations: input images must meet same requirements
Content Restrictions:
- All prompts and images are filtered based on OpenAI’s content policy
- Violating content will return an error
- Edited areas must be described in full context, not just the edited portion
Technical Limitations:
- Image URLs expire after 1 hour
- Image editing (inpainting) and variations only available in DALL·E 2
- Response format limited to URL or Base64 data
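
The DALL·E 3 restrictions listed above can be checked on the client before a request is sent. This is a minimal sketch covering only the image count and size rules from this section; `check_image_request` is a hypothetical helper.

```python
# Sizes listed for DALL·E 3 in the restrictions above.
DALLE3_SIZES = {"1024x1024", "1024x1792", "1792x1024"}


def check_image_request(model: str, n: int = 1, size: str = "1024x1024") -> list:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if model == "dall-e-3":
        if n != 1:
            problems.append("dall-e-3 generates one image per request")
        if size not in DALLE3_SIZES:
            problems.append(f"unsupported dall-e-3 size: {size}")
    return problems


valid = check_image_request("dall-e-3", n=1, size="1024x1792")
invalid = check_image_request("dall-e-3", n=2, size="512x512")
```

Failing fast on the client avoids spending a request on a call that would be rejected.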

Speech-to-text Limitations

File Restrictions:
- Maximum file size: 25 MB
- Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm
- No streaming support
Language Limitations:
- Translation output available only in English
- Variable accuracy for non-listed languages
- Limited control over generated audio compared to other language models
Technical Constraints:
- Only the final 224 tokens of the prompt are considered
- Restricted processing for longer audio files
- No real-time transcription support
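
The file restrictions above are easy to verify before uploading. The sketch below checks the extension and size against the limits listed in this section; `check_audio_upload` is a hypothetical helper, and 25 MB is interpreted here as 25 * 1024 * 1024 bytes, which is an assumption.

```python
from pathlib import PurePath

# Assumption: the 25 MB limit is interpreted as 25 * 1024 * 1024 bytes.
MAX_AUDIO_BYTES = 25 * 1024 * 1024
SUPPORTED_AUDIO_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def check_audio_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file can be uploaded."""
    problems = []
    extension = PurePath(filename).suffix.lstrip(".").lower()
    if extension not in SUPPORTED_AUDIO_FORMATS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append("file exceeds the 25 MB limit")
    return problems


ok = check_audio_upload("meeting.MP3", 1_000_000)
rejected = check_audio_upload("meeting.flac", 30 * 1024 * 1024)
```

Files over the limit need to be split or compressed before they are sent to the transcription or translation endpoints.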

Text-to-Speech Limitations

Voice Restrictions:
- Limited to 6 pre-built voices (alloy, echo, fable, onyx, nova, shimmer)
- Voices optimized primarily for English
- No custom voice creation support
- No direct control over emotional range or tone
Audio Quality Trade-offs:
- tts-1: Lower latency but potentially more static
- tts-1-hd: Higher quality but increased latency
- Quality differences may vary by listening device
Usage Requirements:
- Must disclose AI-generated nature to end users
- Cannot create custom voice clones
- Performance varies for non-English languages

FAQs

General

How to get the OpenAI API key?

Is it free to use the OpenAI API key?

I am getting rate limited on OpenAI API

Vision FAQs

Can I fine-tune OpenAI models on vision requests?

Can I use gpt-4o or other chat models to generate images?

What type of files can I upload for vision requests?

For vision requests, is there a limit to the size of the image I can upload?

How do rate limits work for vision requests?

Can models understand image metadata?

Embedding FAQs

How can I tell how many tokens a string has before I embed it?

How can I retrieve K nearest embedding vectors quickly?

Do V3 embedding models know about recent events?

Prompt Caching FAQs

How is data privacy maintained for caches?

Does Prompt Caching affect output token generation or the final response of the API?

Is there a way to manually clear the cache?

Will I be expected to pay extra for writing to Prompt Caching?

Do cached prompts contribute to TPM rate limits?

Is discounting for Prompt Caching available on Scale Tier and the Batch API?

Does Prompt Caching work on Zero Data Retention requests?

Image Generations FAQs

What's the difference between DALL·E 2 and DALL·E 3?

How long do the generated image URLs last?

What are the size requirements for uploading images?

Can I disable DALL·E 3's automatic prompt enhancement?

How many images can I generate per request?

What image formats are supported?

How does image editing (inpainting) work?

Speech-to-text FAQs

What audio file formats are supported?

Can I translate audio to languages other than English?

How do I handle audio files longer than 25 MB?

Does the API support all languages equally well?

Can I get timestamps in the transcription?

How can I improve transcription accuracy for specific terms?

What's the difference between transcription and translation?

Text-to-Speech FAQs

What are the differences between TTS-1 and TTS-1-HD models?

Which audio formats are supported?

Can I create or clone custom voices?

How well does it support non-English languages?

Can I control the emotional tone or style of the speech?

Is real-time streaming supported?

Do I need to disclose that the audio is AI-generated?
