Production Guides
Supercharging Open-source LLMs: Your Gateway to 250+ Models
The Rise of Open-source LLMs in Production
Conversations
Portkey's LLMs in Prod series hit Bangalore, bringing together AI practitioners to tackle real-world challenges in productionizing AI apps. From AI gateways to agentic workflows to DSPy at scale, here's what's shaping the future of AI in production.
Updates
AI Governance features, Telemetry for Agents, Cookbooks, Claude 3.5, and more.
Updates
Set budget & rate limits, streamline team permissions, GPT-4o, Gemini Flash, Instructor & Promptfoo integrations, and more.
Production Guides
2024 is the year where Gen AI innovation and productisation happen hand-in-hand. We are seeing companies and enterprises move their Gen AI prototypes to production at a breathtaking pace. At the same time, an exciting new shift is also taking place in how you can interact with LLMs: completely new…
Walmart
Last month, the LLMs in Prod community had the pleasure of hosting Rohit Chatter, Chief Software Architect at Walmart Tech Global, for a fireside chat on Gen AI and semantic caching in retail. This conversation spanned a wide range of topics, from Rohit's personal journey in the…
Production Guides
In two days (i.e. Jan 4), OpenAI will retire 33 models, including GPT-3 (text-davinci-003) and various others. This is OpenAI's biggest model deprecation so far. Here's what you need to know: GPT-3 Model Retirement. The text-davinci-003 model (commonly known as GPT-3) will be unavailable from…
RAG
💡 This is Portkey's first collaboration with the Hasura team. Hasura helps you build robust RAG data pipelines by unifying multiple private data sources (relational DB, vector DB, etc.) and letting you query the data securely with production-grade controls. LLMs have been around for some time now and have…
Benchmarks
Over the past few months, we've been keenly observing latencies for both GPT-3.5 and GPT-4. The emerging patterns have been intriguing. The standout observation? GPT-4 is catching up in speed, closing the latency gap with GPT-3.5. Our findings reveal a consistent decline in GPT-4…
Production Guides
Implementing semantic cache from scratch for production use cases.
Conversations
Rohit from Portkey is joined by Weaviate Research Scientist Connor for a deep dive into the differences between MLOps and LLMOps, building RAG systems, and what lies ahead for building production-grade LLM-based apps. This and much more in this podcast!
Conversations
Portkey CEO Rohit Agarwal shares practical tips from his own experience on crafting production-grade & reliable LLM systems. Read more LLM reliability tips here.