Portkey provides a robust and secure gateway to facilitate the integration of various Large Language Models (LLMs) into your applications, including models hosted on AWS Bedrock.
With Portkey, you can take advantage of features like fast AI gateway access, observability, prompt management, and more, all while ensuring the secure management of your LLM API keys through a virtual key system.

    // NodeJS SDK
    import Portkey from 'portkey-ai'

    const portkey = new Portkey({
        apiKey: "PORTKEY_API_KEY", // defaults to process.env["PORTKEY_API_KEY"]
        virtualKey: "VIRTUAL_KEY" // Your Bedrock Virtual Key
    })

    # Python SDK
    from portkey_ai import Portkey

    portkey = Portkey(
        api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
        virtual_key="VIRTUAL_KEY"   # Replace with your virtual key for Bedrock
    )

If you’re using AWS Security Token Service, you can pass your aws_session_token along with the Virtual key:

    // NodeJS SDK
    import Portkey from 'portkey-ai'

    const portkey = new Portkey({
        apiKey: "PORTKEY_API_KEY", // defaults to process.env["PORTKEY_API_KEY"]
        virtualKey: "VIRTUAL_KEY", // Your Bedrock Virtual Key
        aws_session_token: "" // Your AWS session token
    })

    # Python SDK
    from portkey_ai import Portkey

    portkey = Portkey(
        api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
        virtual_key="VIRTUAL_KEY",  # Replace with your virtual key for Bedrock
        aws_session_token=""        # Your AWS session token
    )

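Hard-coding a short-lived session token is rarely practical, so the following is a minimal sketch of collecting the client arguments from environment variables instead. `AWS_SESSION_TOKEN` is the standard AWS variable name; `BEDROCK_VIRTUAL_KEY` and the helper `portkey_client_kwargs` are hypothetical names chosen for this illustration.

```python
import os


def portkey_client_kwargs(env=None):
    """Collect keyword arguments for Portkey(...) from environment variables."""
    env = os.environ if env is None else env
    kwargs = {
        "api_key": env["PORTKEY_API_KEY"],
        # BEDROCK_VIRTUAL_KEY is a hypothetical variable name for this sketch.
        "virtual_key": env["BEDROCK_VIRTUAL_KEY"],
    }
    # Only include the session token when temporary credentials are in use.
    token = env.get("AWS_SESSION_TOKEN")
    if token:
        kwargs["aws_session_token"] = token
    return kwargs
```

The result can then be unpacked into the constructor, for example `Portkey(**portkey_client_kwargs())`.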
Use the Portkey instance to send requests to models hosted on AWS Bedrock. You can also override the virtual key directly in the API call if needed.

    // NodeJS SDK
    const chatCompletion = await portkey.chat.completions.create({
        messages: [{ role: 'user', content: 'Say this is a test' }],
        model: 'anthropic.claude-v2:1',
        max_tokens: 250 // Required field for Anthropic
    });

    console.log(chatCompletion.choices);

    # Python SDK
    completion = portkey.chat.completions.create(
        messages=[{"role": "user", "content": "Say this is a test"}],
        model="anthropic.claude-v2:1",
        max_tokens=250  # Required field for Anthropic
    )

    print(completion.choices)

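The per-call override mentioned above could look roughly like the sketch below. It assumes the Python client exposes a `with_options(...)` method that accepts `virtual_key`; the helper name `create_with_virtual_key` is hypothetical, and the override mechanism should be confirmed against the SDK version in use.

```python
def create_with_virtual_key(client, virtual_key, messages, model, max_tokens=250):
    """Send one chat completion using a different virtual key for this call only.

    Assumes `client` is a Portkey instance whose with_options(...) returns a
    client scoped to the overridden settings.
    """
    scoped = client.with_options(virtual_key=virtual_key)
    return scoped.chat.completions.create(
        messages=messages,
        model=model,
        max_tokens=max_tokens,  # Required field for Anthropic models
    )
```

The original `portkey` instance keeps its own virtual key, so later calls made through it are unaffected.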
The assistant's thinking response is returned in the response_chunk.choices[0].delta.content_blocks array, not in the response.choices[0].message.content string.
Models like us.anthropic.claude-3-7-sonnet-20250219-v1:0 support extended thinking.
This is similar to OpenAI thinking, but you also get the model’s reasoning as it processes the request.

    from portkey_ai import Portkey

    # Initialize the Portkey client
    portkey = Portkey(
        api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
        virtual_key="VIRTUAL_KEY",  # Add your provider's virtual key
        strict_openai_compliance=False
    )

    # Create the request
    response = portkey.chat.completions.create(
        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        max_tokens=3000,
        thinking={
            "type": "enabled",
            "budget_tokens": 2030
        },
        stream=False,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                    }
                ]
            }
        ]
    )
    print(response)

    # In case of streaming responses, parse the
    # response_chunk.choices[0].delta.content_blocks array:
    # response = portkey.chat.completions.create(
    #     ...same config as above but with stream=True
    # )
    # for chunk in response:
    #     if chunk.choices[0].delta:
    #         content_blocks = chunk.choices[0].delta.get("content_blocks")
    #         if content_blocks is not None:
    #             for content_block in content_blocks:
    #                 print(content_block)

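When streaming, it is often more useful to accumulate the reasoning and the answer separately than to print each block. The sketch below works on chunks represented as plain dictionaries and assumes each entry in `content_blocks` carries a nested `delta` holding either a `thinking` or a `text` fragment. That chunk shape and the helper name `collect_stream` are assumptions for illustration, so inspect a real chunk before relying on them.

```python
def collect_stream(chunks):
    """Accumulate thinking and answer text from streamed chunks.

    Assumed chunk shape (not confirmed by the text above):
    {"choices": [{"delta": {"content_blocks": [{"delta": {"thinking": "..."}}]}}]}
    """
    thinking_parts, text_parts = [], []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        for block in delta.get("content_blocks") or []:
            inner = block.get("delta") or {}
            if "thinking" in inner:
                thinking_parts.append(inner["thinking"])
            if "text" in inner:
                text_parts.append(inner["text"])
    return "".join(thinking_parts), "".join(text_parts)
```

Chunks without choices or without `content_blocks` are skipped, so keep-alive or final usage chunks do not break the loop. SDK chunk objects would need converting to dictionaries first.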

    from portkey_ai import Portkey

    # Initialize the Portkey client
    portkey = Portkey(
        api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
        virtual_key="VIRTUAL_KEY",  # Add your provider's virtual key
        strict_openai_compliance=False
    )

    # Create the request, passing the earlier thinking block back to the model
    response = portkey.chat.completions.create(
        model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
        max_tokens=3000,
        thinking={
            "type": "enabled",
            "budget_tokens": 2030
        },
        stream=False,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                    }
                ]
            },
            {
                "role": "assistant",
                "content": [
                    {
                        "type": "thinking",
                        "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
                        "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
                    }
                ]
            },
            {
                "role": "user",
                "content": "thanks that's good to know, how about to chennai?"
            }
        ]
    )
    print(response)

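In the multi-turn request above, the assistant turn carries a `thinking` block together with its `signature`. A small sketch of assembling such a history programmatically follows. The helper names are hypothetical, and the thinking text and signature are assumed to be copied unmodified from an earlier response.

```python
def assistant_thinking_turn(thinking, signature, text=None):
    """Build an assistant message that replays an earlier thinking block.

    `thinking` and `signature` should be passed through exactly as received.
    """
    content = [{"type": "thinking", "thinking": thinking, "signature": signature}]
    if text is not None:
        content.append({"type": "text", "text": text})
    return {"role": "assistant", "content": content}


def extend_history(history, assistant_turn, user_text):
    """Return a new message list with the assistant turn and a follow-up question."""
    return [*history, assistant_turn, {"role": "user", "content": user_text}]
```

The returned list can be passed as `messages` in the next `portkey.chat.completions.create(...)` call.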
Inference profiles are a resource in Amazon Bedrock that define a model and one or more Regions to which the inference profile can route model invocation requests.
To use inference profiles, your IAM role needs to have the following permissions:
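The permission list itself does not appear above. Purely as an illustration, the sketch below builds an IAM policy document that grants the `bedrock:GetInferenceProfile` action on inference profile resources. Treating this action and these resource patterns as the ones required is an assumption, and the role is assumed to already hold its usual model invocation permissions, so verify the exact list against the Portkey and AWS documentation.

```python
import json


def inference_profile_policy(actions=("bedrock:GetInferenceProfile",)):
    """Build an IAM policy document for inference profile access.

    The default action and the resource patterns are assumptions for this sketch.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": [
                    "arn:aws:bedrock:*:*:inference-profile/*",
                    "arn:aws:bedrock:*:*:application-inference-profile/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON form that can be attached to the IAM role.
    print(json.dumps(inference_profile_policy(), indent=2))
```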
Portkey uses the AWS Converse API internally for making chat completions requests.
If you need to pass additional model-specific input fields or parameters, such as anthropic_beta, top_k, or frequency_penalty, you can pass them with this key:
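The key name is not shown above. In the Bedrock Converse API itself, model-specific parameters travel in a field named `additionalModelRequestFields`. The sketch below assumes Portkey accepts a key of the same name in the request body, which should be checked against Portkey's documentation; the helper `with_model_specific_fields` is a hypothetical name.

```python
def with_model_specific_fields(request, extra_fields, key="additionalModelRequestFields"):
    """Return a copy of a chat completion request with model-specific fields attached.

    The default `key` mirrors the Converse API field name; whether Portkey
    expects exactly this key is an assumption of this sketch.
    """
    merged = dict(request)
    merged[key] = {**request.get(key, {}), **extra_fields}
    return merged
```

For example, `with_model_specific_fields(request, {"top_k": 40})` leaves the original request untouched and returns a new dictionary that can be unpacked into `portkey.chat.completions.create(**...)`, assuming the SDK forwards unknown keyword arguments.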
You can manage all prompts for AWS Bedrock in the Prompt Library. All current Anthropic models are supported, and you can easily start testing different prompts.
Once you’re ready with your prompt, you can use the portkey.prompts.completions.create interface to use the prompt in your application.
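A minimal sketch of calling a saved prompt is shown below. It assumes the interface takes a `prompt_id` and a `variables` mapping for the template placeholders; the helper name, the prompt ID, and the variable names are placeholders for illustration.

```python
def run_saved_prompt(client, prompt_id, variables):
    """Render and run a prompt stored in the Prompt Library.

    Assumes `client` is a Portkey instance and that the prompt template
    defines the placeholders named in `variables`.
    """
    return client.prompts.completions.create(
        prompt_id=prompt_id,
        variables=variables,
    )
```

A call might look like `run_saved_prompt(portkey, "YOUR_PROMPT_ID", {"topic": "flights"})`, where both arguments stand in for values from your own Prompt Library.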