> ## Documentation Index
> Fetch the complete documentation index at: https://docs.portkey.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Multimodal Capabilities

<Info>
  This feature is available on all Portkey [plans](https://portkey.ai/pricing).
</Info>

The Gateway is your unified interface for **multimodal models**, along with chat, text, and embedding models.

Using the Gateway, you can call `vision`, `audio (text-to-speech & speech-to-text)`, `image generation` and other multimodal models from multiple providers (like `OpenAI`, `Anthropic`, `Stability AI`, etc.) — all using the familiar OpenAI signature.

<Frame>
  <img src="https://mintcdn.com/portkey-docs/Buc1Vm2P31GSPm3S/images/product/ai-gateway/multi.png?fit=max&auto=format&n=Buc1Vm2P31GSPm3S&q=85&s=34e4c2490d29a44209c737bf5d754f46" width="200" height="200" data-path="images/product/ai-gateway/multi.png" />
</Frame>

#### Explore the AI Gateway's Multimodal capabilities below:

<Card title="Vision" href="/product/ai-gateway/multimodal-capabilities/vision" />

<Card title="Image Generation" href="/product/ai-gateway/multimodal-capabilities/image-generation" />

<Card title="Function Calling" href="/product/ai-gateway/multimodal-capabilities/function-calling" />

<Card title="Speech-to-Text" href="/product/ai-gateway/multimodal-capabilities/speech-to-text" />

<Card title="Text-to-Speech" href="/product/ai-gateway/multimodal-capabilities/text-to-speech" />
