Portkey provides a robust and secure gateway for integrating various Large Language Models (LLMs), including the Google Gemini APIs, into your applications. With Portkey, you get features like a fast AI gateway, observability, prompt management, and more, while your LLM API keys stay securely managed through a virtual key system.
To use Gemini with Portkey, get your API key from Google AI Studio, then add it to Portkey to create a virtual key.
NodeJS SDK
```js
import Portkey from 'portkey-ai'

const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY", // defaults to process.env["PORTKEY_API_KEY"]
  provider: "@PROVIDER" // Your Google Virtual Key
})
```
Use the Portkey instance to send requests to Google Gemini. You can also override the virtual key directly in the API call if needed.
NodeJS SDK
```js
const chatCompletion = await portkey.chat.completions.create({
  messages: [
    { role: 'system', content: 'You are not a helpful assistant' },
    { role: 'user', content: 'Say this is a test' }
  ],
  model: 'gemini-1.5-pro',
});

console.log(chatCompletion.choices);
```
Portkey supports the system_instructions parameter for Google Gemini 1.5, letting you control the behavior and output of your Gemini-powered applications with ease. Simply include your Gemini system prompt as a {"role": "system"} message within the messages array of your request body, as in the example above. The Portkey Gateway automatically transforms your message to ensure seamless compatibility with the Google Gemini API.
Gemini models are inherently multimodal, capable of processing and understanding content from a wide array of file types. Portkey streamlines the integration of these powerful features by providing a unified, OpenAI-compatible API.
The Portkey Advantage: A Unified Format for All Media

To simplify development, Portkey uses a consistent format for all multimodal requests. Whether you're sending an image, audio, video, or document, you use an object with type: 'image_url' within the user message's content array. Portkey's AI Gateway interprets your request based on the URL or data URI you provide, and translates it into the precise format required by the Google Gemini API. This means you only need to learn one structure for all your media processing needs.
Method 1: Sending an Image via Google Files URL

Use the Google Files API to upload your image and get a URL. This is the recommended approach for larger files or when you need persistent storage.
To upload files and get Google Files URLs, use the Files API. The URL format will be similar to: https://generativelanguage.googleapis.com/v1beta/files/[FILE_ID]
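A request with an uploaded image then follows the same unified shape used throughout this section. Here is a minimal sketch, where the file ID in the URL is a placeholder for the ID returned by your Files API upload:

```js
// Sketch: 'your-image-file-id' is a placeholder for a real Files API upload ID
const chatCompletion = await portkey.chat.completions.create({
  model: 'gemini-1.5-pro',
  messages: [{
    role: 'user',
    content: [
      { type: 'image_url', image_url: { url: 'https://generativelanguage.googleapis.com/v1beta/files/your-image-file-id' } },
      { type: 'text', text: 'Describe what is shown in this image.' }
    ]
  }],
});

console.log(chatCompletion.choices[0].message.content);
```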
Method 2: Sending a Local Image as Base64 Data

Use this method for local image files. The file is encoded into a Base64 string and sent as a data URI. This is ideal for smaller files when you don't want to use the Files API. The data URI format is: data:<MIME_TYPE>;base64,<YOUR_BASE64_DATA>
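For example, reading a local image and sending it as a data URI might look like this (a minimal sketch; 'photo.png' is a placeholder filename):

```js
import fs from 'fs';

// Sketch: read a local file and encode it as a Base64 data URI
const imageBytes = fs.readFileSync('photo.png');
const base64Image = imageBytes.toString('base64');
const imageUri = `data:image/png;base64,${base64Image}`;

const chatCompletion = await portkey.chat.completions.create({
  model: 'gemini-1.5-pro',
  messages: [{
    role: 'user',
    content: [
      { type: 'image_url', image_url: { url: imageUri } },
      { type: 'text', text: 'What objects appear in this image?' }
    ]
  }],
});

console.log(chatCompletion.choices[0].message.content);
```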
Method 1: Sending a Document via Google Files URL

Upload your PDF using the Files API to get a Google Files URL.
```js
const chatCompletion = await portkey.chat.completions.create({
  model: 'gemini-1.5-pro',
  messages: [{
    role: 'user',
    content: [
      { type: 'image_url', image_url: { url: 'https://generativelanguage.googleapis.com/v1beta/files/your-pdf-file-id' } },
      { type: 'text', text: 'Summarize the key findings of this research paper.' }
    ]
  }],
});

console.log(chatCompletion.choices[0].message.content);
```
Method 2: Sending a Local Document as Base64 Data

This is suitable for smaller, local PDF files.
```js
import fs from 'fs';

const pdfBytes = fs.readFileSync('whitepaper.pdf');
const base64Pdf = pdfBytes.toString('base64');
const pdfUri = `data:application/pdf;base64,${base64Pdf}`;

const chatCompletion = await portkey.chat.completions.create({
  model: 'gemini-1.5-pro',
  messages: [{
    role: 'user',
    content: [
      { type: 'image_url', image_url: { url: pdfUri } },
      { type: 'text', text: 'What is the main conclusion of this document?' }
    ]
  }],
});

console.log(chatCompletion.choices[0].message.content);
```
While you can send other document types like .txt or .html, they will be treated as plain text. Gemini’s native document vision capabilities are optimized for the application/pdf MIME type.
Important: For all file uploads (except YouTube videos), it’s recommended to use the Google Files API to upload your files first, then use the returned file URL in your requests. This approach provides better performance and reliability for larger files.
Gemini can use a built-in code interpreter tool to solve complex computational problems, perform calculations, and generate code. To enable this, simply include the code_execution tool in your request. The model will automatically decide when to invoke it.
```js
const response = await portkey.chat.completions.create({
  model: "gemini-1.5-pro",
  messages: [{
    role: "user",
    content: "Calculate the 20th Fibonacci number. Then find the nearest palindrome to it."
  }],
  tools: [{ "type": "code_execution" }]
});

console.log(response.choices[0].message.content);
```
Gemini supports grounding with Google Search, a feature that grounds your LLM responses in real-time search results.
Grounding is invoked by passing the google_search tool (for newer models like gemini-2.0-flash-001) or the google_search_retrieval tool (for older models like gemini-1.5-flash) in the tools array.
"tools": [ { "type": "function", "function": { "name": "google_search" // or google_search_retrieval for older models } }]
If you mix regular tools with grounding tools, the Gemini API may throw an error saying only one tool can be used at a time.
Models like gemini-2.5-flash-preview-04-17 support extended thinking. This is similar to OpenAI's reasoning models, but you also get the model's reasoning as it processes the request. Note that you will have to set strict_open_ai_compliance=False in the headers to use this feature.

The assistant's thinking response is returned in the response_chunk.choices[0].delta.content_blocks array, not in the response.choices[0].message.content string.
```python
from portkey_ai import Portkey

# Initialize the Portkey client
portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    provider="@PROVIDER",
    strict_open_ai_compliance=False
)

# Create the request with extended thinking enabled
response = portkey.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    max_tokens=3000,
    thinking={"type": "enabled", "budget_tokens": 2030},
    stream=True,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                }
            ]
        }
    ]
)

# With stream=True, the response is an iterator of chunks; the model's thinking
# arrives in the content_blocks array of each chunk's delta.
for chunk in response:
    if chunk.choices[0].delta:
        content_blocks = chunk.choices[0].delta.get("content_blocks")
        if content_blocks is not None:
            for content_block in content_blocks:
                print(content_block)
```
To disable thinking for Gemini models like gemini-2.5-flash-preview-04-17, you must explicitly set budget_tokens to 0.
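For example, mirroring the Python request above (a minimal sketch that reuses the same client):

```python
# Sketch: same request as above, but budget_tokens=0 turns thinking off
response = portkey.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    max_tokens=3000,
    thinking={"type": "enabled", "budget_tokens": 0},  # 0 disables thinking
    messages=[{"role": "user", "content": "Say this is a test"}]
)
print(response)
```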