Text-to-Speech
Portkey’s AI gateway currently supports text-to-speech models on OpenAI and Azure OpenAI.
Usage
Portkey follows the OpenAI signature: you send the input text and the voice option as part of the API request. All output formats (mp3, opus, aac, flac, and pcm) are supported. Portkey also supports real-time audio streaming for TTS models.
Here’s an example:
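A minimal sketch of such a request in Python, using only the standard library. The endpoint URL, the `x-portkey-api-key` and `x-portkey-virtual-key` header names, the `alloy` voice, and the placeholder keys are assumptions for illustration; check them against your own Portkey setup. The helper names `build_speech_request` and `save_speech` are hypothetical.

```python
import json
import urllib.request

# Assumed gateway endpoint; verify against your Portkey configuration.
PORTKEY_SPEECH_URL = "https://api.portkey.ai/v1/audio/speech"

# Output formats listed on this page.
SUPPORTED_FORMATS = {"mp3", "opus", "aac", "flac", "pcm"}


def build_speech_request(text, voice="alloy", response_format="mp3",
                         model="tts-1", api_key="PORTKEY_API_KEY",
                         virtual_key="PROVIDER_VIRTUAL_KEY"):
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {response_format}")
    # Body follows the OpenAI signature: model, input text, voice, format.
    body = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    headers = {
        "Content-Type": "application/json",
        # Assumed header names for the Portkey key and the provider key.
        "x-portkey-api-key": api_key,
        "x-portkey-virtual-key": virtual_key,
    }
    return urllib.request.Request(
        PORTKEY_SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def save_speech(request, path, chunk_size=4096):
    """Send the request and write the audio to disk chunk by chunk."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
```

With real keys in place, `save_speech(build_speech_request("The quick brown fox jumped over the lazy dog."), "speech.mp3")` would write the audio to `speech.mp3`. Reading the response in chunks means the file starts filling as audio arrives rather than after the whole clip has been generated.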
Once the request completes, it is logged in the logs UI along with its cost and latency.
Supported Providers and Models
The following providers are supported for text-to-speech, with more being added soon. To have a model or provider added to the AI gateway, please raise a request or open a PR.
Provider | Models |
---|---|
OpenAI | tts-1, tts-1-hd |
Azure OpenAI | tts-1, tts-1-hd |
Deepgram (Coming Soon) | |
ElevenLabs (Coming Soon) | |