Vertex AI offers wide-ranging support for embedding text, images, and videos. Portkey provides a standardized interface for embedding multiple modalities.

Gemini Embedding Models

The gemini-embedding-2-preview model supports embedding across multiple modalities — text, image, video, and audio — through a single unified endpoint.
Input Type    Supported Formats
Text          Plain string or structured object
Image         GCS URI, HTTPS URL, base64, data URI
Video         GCS URI, HTTPS URL, base64
Audio         GCS URI, HTTPS URL, base64
Additional supported parameters: task_type, dimensions

Embedding Text

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    provider="@PROVIDER",
    vertex_region="us-central1",
)

embeddings = client.embeddings.create(
    model="gemini-embedding-2-preview",
    input="What is the meaning of life?",
)
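Once the request returns, the vectors usually need to be pulled out of the response before they can be stored or compared. The sketch below assumes the response follows the OpenAI-compatible shape (a `data` list whose items each carry an `embedding` list of floats). That shape, the helper names `extract_vectors` and `cosine_similarity`, and the sample values are assumptions for illustration and are not confirmed by this page. The helpers work on a plain dict, so an SDK response object would first need converting to one.

```python
import math
from typing import Any


def extract_vectors(response: dict[str, Any]) -> list[list[float]]:
    """Return the embedding vectors from an OpenAI-style response dict (assumed shape)."""
    return [item["embedding"] for item in response["data"]]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical response, shortened to 3 dimensions for readability.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [1.0, 0.0, 0.0]},
        {"object": "embedding", "index": 1, "embedding": [0.0, 1.0, 0.0]},
    ],
}

vectors = extract_vectors(sample_response)
print(cosine_similarity(vectors[0], vectors[0]))  # identical vectors
print(cosine_similarity(vectors[0], vectors[1]))  # orthogonal vectors
```

Cosine similarity is a common choice for comparing embeddings because it ignores vector length and compares direction only.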

Embedding Images

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    provider="@PROVIDER",
    vertex_region="us-central1",
)

embeddings = client.embeddings.create(
    model="gemini-embedding-2-preview",
    input=[
        {
            "image": {
                "url": "gs://your-bucket/image.png",
                "mime_type": "image/png"
            }
        }
    ],
)
You can also pass images as base64:
{
    "input": [
        {
            "image": {
                "base64": "iVBORw0KGgoAAAANSUhEUgAA...",
                "mime_type": "image/png"
            }
        }
    ]
}
Or as a data URI:
{
    "input": [
        {
            "image": {
                "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
            }
        }
    ]
}
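Both inline forms can be produced from raw image bytes with the standard library. The following is a minimal sketch; the helper names `image_item_from_bytes` and `image_item_as_data_uri` are hypothetical, and the 8-byte PNG signature stands in for real file contents.

```python
import base64


def image_item_from_bytes(data: bytes, mime_type: str) -> dict:
    """Build an input item carrying the image as a base64 string plus its MIME type."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"image": {"base64": encoded, "mime_type": mime_type}}


def image_item_as_data_uri(data: bytes, mime_type: str) -> dict:
    """Build an input item carrying the image as a data URI in the `url` field."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"image": {"url": f"data:{mime_type};base64,{encoded}"}}


# Placeholder bytes: the PNG file signature only, not a complete image.
# In practice, read the file with pathlib.Path("image.png").read_bytes().
png_bytes = b"\x89PNG\r\n\x1a\n"

print(image_item_from_bytes(png_bytes, "image/png"))
print(image_item_as_data_uri(png_bytes, "image/png"))
```

Either item can be placed in the `input` list of the request. The data URI form embeds the MIME type in the URL, so it needs no separate `mime_type` field.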

Embedding Videos

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    provider="@PROVIDER",
    vertex_region="us-central1",
)

embeddings = client.embeddings.create(
    model="gemini-embedding-2-preview",
    input=[
        {
            "video": {
                "url": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
                "mime_type": "video/mp4"
            }
        }
    ],
)

Embedding Audio

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    provider="@PROVIDER",
    vertex_region="us-central1",
)

embeddings = client.embeddings.create(
    model="gemini-embedding-2-preview",
    input=[
        {
            "audio": {
                "url": "gs://cloud-samples-data/generative-ai/audio/Chirp-3-Docs-Dive.mp3",
                "mime_type": "audio/mpeg"
            }
        }
    ],
)

Multimodal Embedding (Mixed Inputs)

You can combine multiple input types in a single request:
from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    provider="@PROVIDER",
    vertex_region="us-central1",
)

embeddings = client.embeddings.create(
    model="gemini-embedding-2-preview",
    input=[
        {
            "video": {
                "url": "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
                "mime_type": "video/mp4"
            }
        },
        {
            "audio": {
                "url": "gs://cloud-samples-data/generative-ai/audio/Chirp-3-Docs-Dive.mp3",
                "mime_type": "audio/mpeg"
            }
        }
    ],
)
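When the list of files is not known in advance, the `input` list for a mixed request can be assembled programmatically. The sketch below infers the modality and MIME type from each file extension. The helper names and the extension table are hypothetical and cover only the formats used on this page; URIs with query strings would need extra parsing.

```python
import posixpath

# Hypothetical extension table limited to the formats shown on this page;
# extend it for the formats you actually use.
EXTENSION_TO_MEDIA = {
    ".png": ("image", "image/png"),
    ".mp4": ("video", "video/mp4"),
    ".mp3": ("audio", "audio/mpeg"),
}


def media_item(uri: str) -> dict:
    """Map a media URI to an input item keyed by modality, based on its file extension."""
    extension = posixpath.splitext(uri)[1].lower()
    if extension not in EXTENSION_TO_MEDIA:
        raise ValueError(f"unsupported file extension: {extension!r}")
    modality, mime_type = EXTENSION_TO_MEDIA[extension]
    return {modality: {"url": uri, "mime_type": mime_type}}


def build_inputs(uris: list[str]) -> list[dict]:
    """Build the `input` list for one mixed-modality request, preserving order."""
    return [media_item(uri) for uri in uris]


inputs = build_inputs([
    "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
    "gs://cloud-samples-data/generative-ai/audio/Chirp-3-Docs-Dive.mp3",
])
print(inputs)
```

The resulting list has the same structure as the `input` argument in the request above.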

Setting Task Type and Dimensions

You can optionally specify task_type and dimensions to control the embedding behavior:
{
    "model": "gemini-embedding-2-preview",
    "input": "What is the meaning of life?",
    "task_type": "RETRIEVAL_DOCUMENT",
    "dimensions": 768
}
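A request body like the one above can be built and validated before it is sent. The sketch below is illustrative only: `build_embedding_request` is a hypothetical helper, and the set of task types is the one Vertex AI documents for its text embedding models. Whether gemini-embedding-2-preview accepts every value in that set is an assumption.

```python
from typing import Optional

# Task types documented by Vertex AI for its text embedding models.
# Assumption: support for each value on gemini-embedding-2-preview is not confirmed here.
KNOWN_TASK_TYPES = {
    "RETRIEVAL_QUERY",
    "RETRIEVAL_DOCUMENT",
    "SEMANTIC_SIMILARITY",
    "CLASSIFICATION",
    "CLUSTERING",
    "QUESTION_ANSWERING",
    "FACT_VERIFICATION",
    "CODE_RETRIEVAL_QUERY",
}


def build_embedding_request(
    text: str,
    task_type: Optional[str] = None,
    dimensions: Optional[int] = None,
    model: str = "gemini-embedding-2-preview",
) -> dict:
    """Build a request body, including only the optional fields that were set."""
    body: dict = {"model": model, "input": text}
    if task_type is not None:
        if task_type not in KNOWN_TASK_TYPES:
            raise ValueError(f"unknown task_type: {task_type!r}")
        body["task_type"] = task_type
    if dimensions is not None:
        if dimensions <= 0:
            raise ValueError("dimensions must be a positive integer")
        body["dimensions"] = dimensions
    return body


# Documents and queries use different task types but the same dimensions,
# so the resulting vectors can be compared with each other.
document_request = build_embedding_request(
    "The answer is 42.", task_type="RETRIEVAL_DOCUMENT", dimensions=768
)
query_request = build_embedding_request(
    "What is the meaning of life?", task_type="RETRIEVAL_QUERY", dimensions=768
)
print(document_request)
print(query_request)
```

In a retrieval setup, stored documents are typically embedded with `RETRIEVAL_DOCUMENT` and incoming queries with `RETRIEVAL_QUERY`.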

Legacy Embedding Models

The following sections cover the older Vertex AI embedding models, such as textembedding-gecko@003 and multimodalembedding@001.

Embedding Text

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY", # defaults to os.environ.get("PORTKEY_API_KEY")
    provider="@PROVIDER",
)

embeddings = client.embeddings.create(
    model="textembedding-gecko@003",
    input_type="classification",
    input="The food was delicious and the waiter...",
    # input=["text to embed", "more text to embed"],  # to embed multiple texts
)

Embedding Images

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY", # defaults to os.environ.get("PORTKEY_API_KEY")
    provider="@PROVIDER",
)

embeddings = client.embeddings.create(
    model="multimodalembedding@001",
    input=[
        {
            "text": "this is the caption of the image",
            "image": {
                "base64": "UklGRkacAABXRUJQVlA4IDqcAACQggKdASqpAn8B.....",
                # "url": "gs://..."  # if you want to use a GCS URI instead
            },
        }
    ],
)

Embedding Videos

from portkey_ai import Portkey

client = Portkey(
    api_key="YOUR_PORTKEY_API_KEY", # defaults to os.environ.get("PORTKEY_API_KEY")
    provider="@PROVIDER",
)

embeddings = client.embeddings.create(
    model="multimodalembedding@001",
    input=[
        {
            "text": "this is the caption of the video",
            "video": {
                "base64": "UklGRkacAABXRUJQVlA4IDqcAACQggKdASqpAn8B.....",
                "start_offset": 0,
                "end_offset": 10,
                "interval": 5,
                # "url": "gs://..."  # if you want to use a GCS URI instead
            },
        }
    ],
)
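The `start_offset`, `end_offset`, and `interval` values are not explained on this page. Assuming they map to Vertex AI's video segment configuration (offsets and interval in seconds, with one embedding produced per interval-long window), the windows covered by a request can be sketched as follows. The helper `segment_windows` is hypothetical, and truncating the last window at `end_offset` is an assumption.

```python
def segment_windows(
    start_offset: int, end_offset: int, interval: int
) -> list[tuple[int, int]]:
    """List the (start, end) windows, in seconds, that the segment settings would cover."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    if end_offset <= start_offset:
        raise ValueError("end_offset must be greater than start_offset")
    windows = []
    start = start_offset
    while start < end_offset:
        # Assumption: a trailing partial window is cut off at end_offset.
        end = min(start + interval, end_offset)
        windows.append((start, end))
        start = end
    return windows


# Values from the request above: 10 seconds of video in 5-second windows.
print(segment_windows(0, 10, 5))  # [(0, 5), (5, 10)]
```

Under these assumptions, the request above would yield two video embeddings, one per 5-second window.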
Last modified on April 17, 2026