# Converting STDIO to Remote MCP Servers Source: https://docs.portkey.ai/docs/guides/converting-stdio-to-streamable-http Step-by-step guide to converting local STDIO MCP servers to production-ready Streamable HTTP servers This guide covers converting STDIO MCP servers to Streamable HTTP, the current standard for remote MCP deployments (protocol version 2025-03-26). All code examples follow correct initialization patterns to avoid common errors. ## Why Convert to Remote? Host your server on any cloud platform and make it globally accessible Handle multiple concurrent client connections simultaneously Easier integration with web apps, mobile apps, and distributed systems Deploy behind load balancers and scale as needed *** ## Understanding MCP Transports ### STDIO Transport **Best for:** Local development, single client ```plaintext theme={"system"} Client spawns server as subprocess → stdin/stdout communication ``` **Pros:** Zero network overhead, simple setup\ **Cons:** Same machine only, no multi-client support ### Streamable HTTP (Recommended) **Best for:** Production, cloud hosting, multiple clients ```plaintext theme={"system"} Server runs independently → Clients connect via HTTP ``` **Pros:** Single endpoint, bidirectional, optional sessions\ **Cons:** Requires web server configuration Streamable HTTP is the current standard (protocol version 2025-03-26). Use this for all new projects! ### SSE Transport (Legacy) **Status:** Superseded by Streamable HTTP SSE is no longer the standard. Only use for backward compatibility with older clients. *** ## Prerequisites ```bash Python theme={"system"} # Check Python version (need 3.10+) python --version # Install dependencies pip install mcp fastapi uvicorn # Optional: FastMCP for rapid development pip install fastmcp ``` ```bash TypeScript theme={"system"} # Check Node version (need 18+) node --version # Install dependencies npm install @modelcontextprotocol/sdk express npm install --save-dev @types/express ``` *** ## 1️⃣ Your Original STDIO Server Let's start with a typical STDIO server that runs locally: ```python Python theme={"system"} # stdio_server.py import asyncio from mcp.server import Server from mcp.server.stdio import stdio_server from mcp.types import Tool, TextContent server = Server("weather-server", version="1.0.0") @server.list_tools() async def list_tools() -> list[Tool]: return [ Tool( name="get_weather", description="Get weather for a location", inputSchema={ "type": "object", "properties": { "location": {"type": "string"} }, "required": ["location"] } ) ] @server.call_tool() async def call_tool(name: str, arguments: dict) -> list[TextContent]: if name == "get_weather": location = arguments.get("location", "Unknown") return [TextContent( type="text", text=f"Weather in {location}: Sunny, 72°F" )] raise ValueError(f"Unknown tool: {name}") async def main(): # STDIO transport - runs as subprocess async with stdio_server() as (read_stream, write_stream): await server.run( read_stream, write_stream, server.create_initialization_options() ) if __name__ == "__main__": asyncio.run(main()) ``` ```typescript TypeScript theme={"system"} // stdio_server.ts import { Server } from "@modelcontextprotocol/sdk/server/index.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { CallToolRequestSchema, ListToolsRequestSchema, } from "@modelcontextprotocol/sdk/types.js"; const server = new Server( { name: "weather-server", version: "1.0.0" }, { capabilities: { tools: {} } } ); 
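// Register handlers for the two MCP requests this server supports:
// tools/list (tool discovery) and tools/call (tool execution)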
server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: [ { name: "get_weather", description: "Get weather for a location", inputSchema: { type: "object", properties: { location: { type: "string" }, }, required: ["location"], }, }, ], })); server.setRequestHandler(CallToolRequestSchema, async (request) => { if (request.params.name === "get_weather") { const location = request.params.arguments?.location || "Unknown"; return { content: [ { type: "text", text: `Weather in ${location}: Sunny, 72°F` }, ], }; } throw new Error(`Unknown tool: ${request.params.name}`); }); async function main() { const transport = new StdioServerTransport(); await server.connect(transport); } main(); ``` *** ## 2️⃣ Convert to Streamable HTTP ```python FastMCP expandable theme={"system"} # http_server.py from fastmcp import FastMCP # Create MCP server at startup // [!code highlight] mcp = FastMCP("weather-server") // [!code highlight] # Define your tool (same logic as before!) @mcp.tool() def get_weather(location: str) -> str: """Get weather for a location.""" return f"Weather in {location}: Sunny, 72°F" if __name__ == "__main__": # FastMCP handles transport initialization // [!code highlight] mcp.run( // [!code highlight] transport="http", // [!code highlight] host="0.0.0.0", // [!code highlight] port=8000, // [!code highlight] path="/mcp" // [!code highlight] ) // [!code highlight] ``` ```python FastAPI expandable theme={"system"} # http_server_fastapi.py import contextlib from fastapi import FastAPI from fastmcp import FastMCP # Create MCP server at startup // [!code highlight] mcp = FastMCP("weather-server", stateless_http=True) // [!code highlight] @mcp.tool() def get_weather(location: str) -> str: """Get weather for a location.""" return f"Weather in {location}: Sunny, 72°F" # Lifespan manager initializes MCP // [!code highlight] @contextlib.asynccontextmanager // [!code highlight] async def lifespan(app: FastAPI): // [!code highlight] async with contextlib.AsyncExitStack() as stack: // [!code highlight] await stack.enter_async_context(mcp.session_manager.run()) // [!code highlight] yield // [!code highlight] # Create FastAPI app with lifespan // [!code highlight] app = FastAPI(lifespan=lifespan) // [!code highlight] # Mount MCP server at /weather endpoint app.mount("/weather", mcp.streamable_http_app()) if __name__ == "__main__": import uvicorn uvicorn.run(app, host="0.0.0.0", port=8000) ``` ```typescript TypeScript expandable theme={"system"} // http_server.ts import express from "express"; import { Server } from "@modelcontextprotocol/sdk/server/index.js"; import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js"; import { CallToolRequestSchema, ListToolsRequestSchema, } from "@modelcontextprotocol/sdk/types.js"; const app = express(); app.use(express.json()); // Create MCP server at startup // [!code highlight] const server = new Server( // [!code highlight] { name: "weather-server", version: "1.0.0" }, // [!code highlight] { capabilities: { tools: {} } } // [!code highlight] ); // [!code highlight] // Register handlers server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: [ { name: "get_weather", description: "Get weather for a location", inputSchema: { type: "object", properties: { location: { type: "string" } }, required: ["location"], }, }, ], })); server.setRequestHandler(CallToolRequestSchema, async (request) => { if (request.params.name === "get_weather") { const location = request.params.arguments?.location || "Unknown"; 
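    // Return the result as MCP text content (same tool logic as the STDIO version)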
return { content: [ { type: "text", text: `Weather in ${location}: Sunny, 72°F` }, ], }; } throw new Error(`Unknown tool: ${request.params.name}`); }); // Create transport at startup // [!code highlight] const transport = new StreamableHTTPServerTransport({ // [!code highlight] path: "/mcp", // [!code highlight] }); // [!code highlight] // Initialize server with transport // [!code highlight] async function initializeServer() { // [!code highlight] await server.connect(transport); // [!code highlight] console.log("✅ MCP server initialized"); // [!code highlight] } // [!code highlight] // Register transport handler app.use("/mcp", (req, res) => transport.handleRequest(req, res)); // Start server const PORT = 8000; app.listen(PORT, async () => { await initializeServer(); console.log(`🚀 Server running on http://0.0.0.0:${PORT}/mcp`); }); ``` **FastMCP vs FastAPI:** FastMCP provides a simpler API for quick setups. Use FastAPI when integrating MCP into existing FastAPI applications or when you need more control over the web server configuration. *** ## 3️⃣ Add auth Most STDIO servers use environment variables for authentication. Convert these to HTTP-based auth patterns for remote servers. ### Example: OAuth Credentials Pattern **STDIO Version** (environment variables): ```json Claude Desktop Config theme={"system"} { "mcpServers": { "google-calendar": { "command": "npx", "args": ["@cocal/google-calendar-mcp"], "env": { "GOOGLE_OAUTH_CREDENTIALS": "/path/to/gcp-oauth.keys.json" } } } } ``` **Remote Version** (request headers): ```python Python theme={"system"} from fastapi import Header, HTTPException, Depends import base64 import json def get_credentials(authorization: str = Header(None)) -> dict: """Extract credentials from Authorization header.""" if not authorization or not authorization.startswith("Bearer "): raise HTTPException(status_code=401, detail="Invalid auth") token = authorization.replace("Bearer ", "") try: return json.loads(base64.b64decode(token)) except Exception: raise HTTPException(status_code=401, detail="Invalid token") @app.post("/mcp") async def handle_mcp( request: Request, credentials: dict = Depends(get_credentials) ): # Use credentials from request pass ``` ```typescript TypeScript theme={"system"} function extractCredentials(authHeader: string | undefined): any { if (!authHeader || !authHeader.startsWith("Bearer ")) { throw new Error("Invalid authorization"); } const token = authHeader.replace("Bearer ", ""); try { return JSON.parse( Buffer.from(token, "base64").toString("utf-8") ); } catch (error) { throw new Error("Invalid token"); } } const authenticateMiddleware = (req: any, res: any, next: any) => { try { req.credentials = extractCredentials(req.headers.authorization); next(); } catch (error) { res.status(401).json({ error: error.message }); } }; app.use("/mcp", authenticateMiddleware); ``` ### Simpler Pattern: API Keys For basic authentication, use API keys: ```python Python theme={"system"} from fastapi import Header, HTTPException, Depends import os async def verify_api_key(authorization: str = Header(None)): """Verify API key from header.""" if not authorization: raise HTTPException(status_code=401, detail="Missing API key") api_key = authorization.replace("Bearer ", "") if api_key != os.getenv("API_KEY"): raise HTTPException(status_code=401, detail="Invalid API key") return api_key @app.post("/mcp") async def handle_mcp( request: Request, api_key: str = Depends(verify_api_key) ): # Request is authenticated pass ``` ```typescript TypeScript theme={"system"} 
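// Express middleware: check the Bearer token against the API_KEY environment variable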
const authenticateApiKey = (req: any, res: any, next: any) => { const authHeader = req.headers["authorization"]; if (!authHeader) { return res.status(401).json({ error: "Missing API key" }); } const apiKey = authHeader.replace("Bearer ", ""); if (apiKey !== process.env.API_KEY) { return res.status(401).json({ error: "Invalid API key" }); } next(); }; app.use("/mcp", authenticateApiKey); ``` *** ## 4️⃣ Run Your MCP Server Start your converted server: ```bash FastMCP theme={"system"} python http_server.py # Server runs at http://localhost:8000/mcp ``` ```bash FastAPI theme={"system"} python http_server_fastapi.py # Server runs at http://localhost:8000/weather/mcp ``` ```bash TypeScript theme={"system"} ts-node http_server.ts # Server runs at http://localhost:8000/mcp ``` *** ## 5️⃣ Testing with Hoot 🦉 Like Postman, but specifically designed for testing MCP servers. Perfect for development! ### Quick Start ```bash Install & Run theme={"system"} # Run directly (no installation needed!) npx -y @portkey-ai/hoot # Or install globally npm install -g @portkey-ai/hoot hoot ``` Hoot opens at `http://localhost:8009` ### Using Hoot ```bash theme={"system"} python http_server.py # Server runs at http://localhost:8000/mcp ``` Navigate to `http://localhost:8009` * Paste URL: `http://localhost:8000/mcp` * Hoot auto-detects the transport type! * View all available tools * Select `get_weather` * Add parameters: `{"location": "San Francisco"}` * Click "Execute" * See the response! ### Hoot Features Automatically detects HTTP vs SSE View and test all server tools Handles OAuth 2.1 authentication 8 themes with light & dark modes *** ## Optional: Session Management Session management is optional in the MCP spec. FastMCP handles it automatically if you need stateful interactions. ```python FastMCP theme={"system"} # FastMCP handles sessions automatically # Stateful mode (maintains session state) mcp = FastMCP("weather-server", stateless_http=False) # Stateless mode (no session state) mcp = FastMCP("weather-server", stateless_http=True) ``` ```python FastAPI theme={"system"} # Manual session management with FastAPI from fastmcp import FastMCP mcp = FastMCP("weather-server", stateless_http=True) @contextlib.asynccontextmanager async def lifespan(app: FastAPI): async with mcp.session_manager.run(): yield app = FastAPI(lifespan=lifespan) ``` ```typescript TypeScript theme={"system"} import { randomUUID } from "crypto"; const transports = new Map(); async function getOrCreateTransport( sessionId: string | undefined, isInitialize: boolean ): Promise { if (sessionId && transports.has(sessionId)) { return transports.get(sessionId)!; } if (!sessionId && isInitialize) { const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: () => randomUUID(), }); transport.onSessionInitialized = (newSessionId) => { transports.set(newSessionId, transport); }; await server.connect(transport); return transport; } throw new Error("Invalid session"); } app.post("/mcp", async (req, res) => { const sessionId = req.headers["mcp-session-id"] as string | undefined; const isInitialize = req.body?.method === "initialize"; try { const transport = await getOrCreateTransport(sessionId, isInitialize); await transport.handleRequest(req, res); } catch (error) { res.status(400).json({ error: error.message }); } }); ``` *** ## Optional: CORS Configuration Only add CORS if you need to support browser-based clients. For server-to-server communication, CORS isn't necessary. 
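If you do serve browser-based clients, the examples below allow a single trusted origin; the FastAPI and Express variants also expose the `Mcp-Session-Id` header so the browser can read the session ID: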
```python FastMCP theme={"system"} # CORS with FastMCP mcp.run( transport="http", host="0.0.0.0", port=8000, cors_allow_origins=["https://yourdomain.com"] ) ``` ```python FastAPI theme={"system"} from fastapi.middleware.cors import CORSMiddleware app.add_middleware( CORSMiddleware, allow_origins=["https://yourdomain.com"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"], expose_headers=["Mcp-Session-Id"], ) ``` ```typescript TypeScript theme={"system"} import cors from "cors"; app.use(cors({ origin: ["https://yourdomain.com"], credentials: true, exposedHeaders: ["Mcp-Session-Id"], })); ``` *** ## Deployment Containerize for any platform Deploy in seconds Serverless on GCP ### Docker ```dockerfile Python theme={"system"} FROM python:3.11-slim WORKDIR /app COPY requirements.txt . RUN pip install -r requirements.txt COPY . . EXPOSE 8000 CMD ["python", "http_server.py"] ``` ```dockerfile TypeScript theme={"system"} FROM node:18-alpine WORKDIR /app COPY package*.json ./ RUN npm ci --production COPY . . RUN npm run build EXPOSE 8000 CMD ["node", "dist/http_server.js"] ``` ### Quick Deploy ```bash Fly.io theme={"system"} # Install Fly CLI curl -L https://fly.io/install.sh | sh # Deploy fly launch fly deploy ``` ```bash Cloud Run theme={"system"} gcloud run deploy mcp-server \ --source . \ --platform managed \ --region us-central1 \ --allow-unauthenticated ``` *** ## Troubleshooting **Check:** * Server is running on the correct port * Firewall allows connections * URL is correct (including `/mcp` path) **Test with curl:** ```bash theme={"system"} curl -X POST http://localhost:8000/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' ``` **Solution:** Ensure tool handlers are registered before the server starts ```python Correct Order theme={"system"} # Register handlers FIRST @mcp.tool() def my_tool(): pass # THEN run server mcp.run(transport="http") ``` **Solution:** Client must store and send session ID correctly ```python Session Handling theme={"system"} # Extract from initialization response session_id = response.headers.get("Mcp-Session-Id") # Include in all subsequent requests headers = {"Mcp-Session-Id": session_id} ``` *** ## Summary **You've successfully converted your STDIO server to a remote Streamable HTTP server!** ### Key Principles Replace STDIO with Streamable HTTP for remote access Convert environment variables to HTTP headers Server and transport created once at startup Use Hoot to verify all tools work correctly ### What We Covered 1. ✅ Original STDIO server structure 2. ✅ Converting to Streamable HTTP 3. ✅ Auth conversion from env vars to headers 4. ✅ Running your converted server 5. ✅ Testing with Hoot ### Resources Official protocol documentation Examples and source code Examples and source code Test your MCP servers High-level Python framework **Building something cool?** Share it with the MCP community and let us know how this guide helped! # Bring Your Own Guardrails Source: https://docs.portkey.ai/docs/integrations/guardrails/bring-your-own-guardrails Integrate your custom guardrails with Portkey using webhooks Portkey's webhook guardrails allow you to integrate your existing guardrail infrastructure with our AI Gateway. 
This is perfect for teams that have already built custom guardrail pipelines (like PII redaction, sensitive content filtering, or data validation) and want to: * Enforce guardrails directly within the AI request flow * Make existing guardrail systems production-ready * Modify AI requests and responses in real-time ## How It Works 1. You add a Webhook as a Guardrail Check in Portkey 2. When a request passes through Portkey's Gateway: * Portkey sends relevant data to your webhook endpoint * Your webhook evaluates the request/response and returns a verdict * Based on your webhook's response, Portkey either allows the request to proceed, modifies it if required, or applies your configured guardrail actions ## Setting Up a Webhook Guardrail ### Configure Your Webhook in Portkey App In the Guardrail configuration UI, you'll need to provide: | Field | Description | Type | | :-------------- | :--------------------------------------- | :------------ | | **Webhook URL** | Your webhook's endpoint URL | `string` | | **Headers** | Headers to include with webhook requests | `JSON` | | **Timeout** | Maximum wait time for webhook response | `number` (ms) | #### Webhook URL This should be a publicly accessible URL where your webhook is hosted. **Enterprise Feature**: Portkey Enterprise customers can configure secure access to webhooks within private networks. #### Headers Specify headers as a JSON object: ```json theme={"system"} { "Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json" } ``` #### Timeout The maximum time Portkey will wait for your webhook to respond before proceeding with a default `verdict: true`. * Default: `3000ms` (3 seconds) * If your webhook processing is time-intensive, consider increasing this value ### Webhook Request Structure Your webhook should accept `POST` requests with the following structure: #### Request Headers | Header | Description | | :------------- | :------------------------------------------- | | `Content-Type` | Always set to `application/json` | | Custom Headers | Any headers you configured in the Portkey UI | #### Request Body Portkey sends comprehensive information about the AI request to your webhook: Information about the user's request to the LLM OpenAI compliant request body json. Last message/prompt content from the overall request body. Whether the request uses streaming Information about the LLM's response (empty for beforeRequestHook) OpenAI compliant response body json. Last message/prompt content from the overall response body. HTTP status code from LLM provider Portkey provider slug. Example: `openai`, `azure-openai`, etc. Type of request: `chatComplete`, `complete`, or `embed` Custom metadata passed with the request. Can come from: 1) the `x-portkey-metadata` header, 2) default API key settings, or 3) workspace defaults. 
When the hook is triggered: `beforeRequestHook` or `afterRequestHook` #### Event Types Your webhook can be triggered at two points: * **beforeRequestHook**: Before the request is sent to the LLM provider * **afterRequestHook**: After receiving a response from the LLM provider ```JSON beforeRequestHook Example [expandable] theme={"system"} { "request": { "json": { "stream": false, "messages": [ { "role": "system", "content": "You are a helpful assistant" }, { "role": "user", "content": "Say Hi" } ], "max_tokens": 20, "n": 1, "model": "gpt-4o-mini" }, "text": "Say Hi", "isStreamingRequest": false, "isTransformed": false }, "response": { "json": {}, "text": "", "statusCode": null, "isTransformed": false }, "provider": "openai", "requestType": "chatComplete", "metadata": { "_user": "visarg123" }, "eventType": "beforeRequestHook" } ``` ```JSON afterRequestHook Example [expandable] theme={"system"} { "request": { "json": { "stream": false, "messages": [ { "role": "system", "content": "You are a helpful assistant" }, { "role": "user", "content": "Say Hi" } ], "max_tokens": 20, "n": 1, "model": "gpt-4o-mini" }, "text": "Say Hi", "isStreamingRequest": false, "isTransformed": false }, "response": { "json": { "id": "chatcmpl-B9SAAj7zd4mq12omkeEImYvYnjbOr", "object": "chat.completion", "created": 1741592910, "model": "gpt-4o-mini-2024-07-18", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hi! How can I assist you today?", "refusal": null }, "logprobs": null, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 18, "completion_tokens": 10, "total_tokens": 28, "prompt_tokens_details": { "cached_tokens": 0, "audio_tokens": 0 }, "completion_tokens_details": { "reasoning_tokens": 0, "audio_tokens": 0, "accepted_prediction_tokens": 0, "rejected_prediction_tokens": 0 } }, "service_tier": "default", "system_fingerprint": "fp_06737a9306" }, "text": "Hi! How can I assist you today?", "statusCode": 200, "isTransformed": false }, "provider": "openai", "requestType": "chatComplete", "metadata": { "_user": "visarg123" }, "eventType": "afterRequestHook" } ``` ### Webhook Response Structure Your webhook must return a response that follows this structure: #### Response Body Whether the request/response passes your guardrail check: * `true`: No violations detected * `false`: Violations detected Optional field to modify the request or response Modified request data (only for beforeRequestHook) If this field is found in the Webhook response, Portkey will fully override the existing request body with the returned data. Modified response data (only for afterRequestHook) If this field is found in the Webhook response, Portkey will fully override the existing response body with the returned data. ## Webhook Capabilities Your webhook can perform three main actions: ### Simple Validation Return a verdict without modifying the request/response: ```json theme={"system"} { "verdict": true // or false if the request violates your guardrails } ``` ### Request Transformation Modify the user's request before it reaches the LLM provider: ```json theme={"system"} { "verdict": true, "transformedData": { "request": { "json": { "messages": [ { "role": "system", "content": "You are a helpful assistant. Do not provide harmful content." 
}, { "role": "user", "content": "Original user message" } ], "max_tokens": 100, "model": "gpt-4o" } } } } ``` ```json theme={"system"} { "verdict": true, "transformedData": { "request": { "json": { "messages": [ { "role": "system", "content": "You are a helpful assistant" }, { "role": "user", "content": "My name is [REDACTED] and my email is [REDACTED]" } ], "max_tokens": 100, "model": "gpt-4o" } } } } ``` ### Response Transformation Modify the LLM's response before it reaches the user: ```json theme={"system"} { "verdict": true, "transformedData": { "response": { "json": { "id": "chatcmpl-123", "object": "chat.completion", "created": 1741592832, "model": "gpt-4o-mini", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "I've filtered this response to comply with our content policies." }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 23, "completion_tokens": 12, "total_tokens": 35 } }, "text": "I've filtered this response to comply with our content policies." } } } ``` ```json theme={"system"} { "verdict": true, "transformedData": { "response": { "json": { "id": "chatcmpl-123", "object": "chat.completion", "created": 1741592832, "model": "gpt-4o-mini", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Original response with additional disclaimer: This response is provided for informational purposes only." }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 23, "completion_tokens": 20, "total_tokens": 43 } }, "text": "Original response with additional disclaimer: This response is provided for informational purposes only." } } } ``` ## Passing Metadata to Your Webhook You can include additional context with each request using Portkey's metadata feature: ```json theme={"system"} // In your API request to Portkey "x-portkey-metadata": {"user": "john", "context": "customer_support"} ``` This metadata will be forwarded to your webhook in the `metadata` field. [Learn more about metadata](/product/observability/metadata). ## Important Implementation Notes 1. **Complete Transformations**: When using `transformedData`, include all fields in your transformed object, not just the changed portions. 2. **Independent Verdict and Transformation**: The `verdict` and any transformations are independent. You can return `verdict: false` while still returning transformations. 3. **Default Behavior**: If your webhook fails to respond within the timeout period, Portkey will default to `verdict: true`. 4. **Event Type Awareness**: When implementing transformations, ensure your webhook checks the `eventType` field to determine whether it's being called before or after the LLM request. ## Example Implementation Check out our Guardrail Webhook implementation on GitHub: ## Get Help Building custom webhooks? Join the [Portkey Discord community](https://portkey.ai/community) for support and to share your implementation experiences! # Portkey Features Source: https://docs.portkey.ai/docs/introduction/feature-overview Explore the powerful features of Portkey ## AI Gateway Connect to 250+ AI models using a single consistent API. Set up load balancers, automated fallbacks, caching, conditional routing, and more, seamlessly. 
* Integrate with multiple AI models through a single API
* Implement simple and semantic caching for improved performance
* Set up automated fallbacks for enhanced reliability
* Handle various data types with multimodal AI capabilities
* Implement automatic retries for improved resilience
* Configure per-strategy circuit protection and failure handling
* Distribute workload efficiently across multiple models
* Manage access to LLMs
* Set and manage request timeouts
* Implement canary testing for safe deployments
* Route requests based on specific conditions
* Set and manage budget limits

## Observability & Logs

Gain real-time insights, track key metrics, and streamline debugging with our OpenTelemetry-compliant system.

* Access and analyze detailed logs
* Implement distributed tracing for request flows
* Gain insights through comprehensive analytics
* Apply filters for targeted analysis
* Manage and utilize metadata effectively
* Collect and analyze user feedback

## Prompt Library

Collaborate with team members to create, templatize, and version prompt templates easily. Experiment across 250+ LLMs with a strong publish/release flow to deploy the prompts.

* Create and manage reusable prompt templates
* Utilize modular prompt components
* Advanced prompting with JSON mode

## Guardrails

Enforce real-time LLM behavior with 50+ state-of-the-art AI guardrails, so you can run guardrails synchronously on your requests and route them with precision.

* Implement rule-based safety checks
* Leverage AI for advanced content filtering
* Integrate third-party safety solutions
* Customize guardrails to your needs

## Agents

Natively integrate Portkey's gateway, guardrails, and observability suite with leading agent frameworks and take them to production.

## More Resources

* Compare different Portkey subscription plans
* Join our community of developers
* Explore our comprehensive API documentation
* Learn about our enterprise solutions
* Contribute to our open-source projects

# Make Your First Request

Source: https://docs.portkey.ai/docs/introduction/make-your-first-request

Integrate Portkey and analyze your first LLM call in 2 minutes!

## 1. Get your Portkey API Key

[Create](https://app.portkey.ai/signup) or [log in](https://app.portkey.ai/login) to your Portkey account, then grab your account's API key from the "Settings" page. Based on your access level, you might see the relevant permissions on the API key modal - tick the ones you'd like, name your API key, and save it.

## 2. Integrate Portkey

Portkey offers a variety of integration options, including SDKs, REST APIs, and native connections with platforms like OpenAI, Langchain, and LlamaIndex, among others.

### Through the OpenAI SDK

If you're using the **OpenAI SDK**, import the Portkey SDK and configure it within your OpenAI client object.

### Portkey SDK

You can also use the **Portkey SDK / REST APIs** directly to make chat completion calls. This is a more versatile way to make LLM calls across any provider. Once the integration is ready, you can see your requests reflected on your Portkey dashboard.
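For reference, here is a minimal sketch of a chat completion call using the Portkey Python SDK. The API key, provider slug, and model below are placeholders, so substitute the values from your own Portkey account:

```python Python theme={"system"}
# pip install portkey-ai
from portkey_ai import Portkey

# Placeholders: use your Portkey API key and the slug of a provider key
# you've configured in the Portkey app.
portkey = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",
    virtual_key="YOUR_PROVIDER_SLUG",
)

completion = portkey.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say Hi"}],
)

print(completion.choices[0].message.content)
```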