Speech-to-Text

Transcription & Translation Usage

Portkey supports both the Transcription and Translation methods for speech-to-text (STT) models and follows the OpenAI signature: you send the audio file (in flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm format) as part of the API request. Here's an example using the OpenAI NodeJS SDK:

import fs from "fs";
import OpenAI from "openai";
import { PORTKEY_GATEWAY_URL, createHeaders } from 'portkey-ai'

const openai = new OpenAI({
  apiKey: "dummy", // We are using Virtual Key from Portkey
  baseURL: PORTKEY_GATEWAY_URL,
  defaultHeaders: createHeaders({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "OPENAI_VIRTUAL_KEY"
  })
});

// Transcription: convert the audio into text in its original language

async function transcribe() {
  const transcription = await openai.audio.transcriptions.create({
    file: fs.createReadStream("/path/to/file.mp3"),
    model: "whisper-1",
  });

  console.log(transcription.text);
}
// Log failures instead of leaving an unhandled promise rejection
transcribe().catch(console.error);

// Translation: convert the audio into English text

async function translate() {
  const translation = await openai.audio.translations.create({
    file: fs.createReadStream("/path/to/file.mp3"),
    model: "whisper-1",
  });

  console.log(translation.text);
}
translate().catch(console.error);

On completion, the request is logged in the logs UI, where you can see the transcribed or translated text along with the cost and latency incurred.
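For readers who are not using the OpenAI SDK, the request shape described above (an audio file plus a model name sent as multipart form data) can be sketched by hand. This is a minimal sketch, not Portkey's reference client: the helper names `isSupportedAudioFile` and `buildAudioRequest` are hypothetical, and the gateway URL and `x-portkey-*` header names are assumptions that should be checked against your own Portkey setup. It relies on the `FormData` and `Blob` globals available in Node 18 and later, and it only builds the request without sending it.

```javascript
// Audio formats listed in this section as accepted by the STT endpoints.
const SUPPORTED_FORMATS = ["flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"];

// Hypothetical helper: true when the file extension is a supported format.
function isSupportedAudioFile(fileName) {
  const ext = fileName.includes(".") ? fileName.split(".").pop().toLowerCase() : "";
  return SUPPORTED_FORMATS.includes(ext);
}

// Hypothetical helper: builds (but does not send) an OpenAI-style audio request.
// ASSUMPTION: the gateway URL and header names below are illustrative.
function buildAudioRequest({ task, fileName, audioBytes, portkeyApiKey, virtualKey }) {
  if (task !== "transcriptions" && task !== "translations") {
    throw new Error(`Unknown task: ${task}`);
  }
  if (!isSupportedAudioFile(fileName)) {
    throw new Error(`Unsupported audio format: ${fileName}`);
  }

  // The OpenAI signature expects the file and the model as multipart form fields.
  const form = new FormData();
  form.append("file", new Blob([audioBytes]), fileName);
  form.append("model", "whisper-1");

  return {
    url: `https://api.portkey.ai/v1/audio/${task}`,
    options: {
      method: "POST",
      headers: {
        "x-portkey-api-key": portkeyApiKey,
        "x-portkey-virtual-key": virtualKey,
      },
      body: form,
    },
  };
}
```

To send the request, pass the result to `fetch(url, options)` and read the `text` field of the JSON response. Do not set `Content-Type` yourself: `fetch` adds the multipart boundary automatically when the body is a `FormData` object.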

Supported Providers and Models

The following providers are supported for speech-to-text, with more providers being added soon. Please raise a request or a PR to add a model or provider to the AI gateway.

Provider	Models	Functions
OpenAI	whisper-1	Transcription, Translation
