🌖 Announcing $3M Seed Round to Bring LLMs to Production Portkey is building a full-stack LLMOps platform that empowers AI builders to productionize their Gen AI apps reliably and securely.
Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding - Summary This paper introduces the Skeleton-of-Thought (SoT) method to decrease the generation latency of large language models (LLMs). SoT guides LLMs to first generate the skeleton of the answer and then conducts parallel API calls or batched decoding to complete the contents of each skeleton point. The m
My Journey with AI-Driven Development: From Curiosity to Necessity As I reflect on the past year of my coding journey, I am struck by how deeply Artificial Intelligence (AI) has woven itself into my development practices. From the likes of GitHub Copilot to the genius behind ChatGPT, my approach to writing code has undergone a transformative shift. Allow me to
⭐️ Analyze your LLM calls - 2.0 Portkey's analytics 2.0 gives our users complete visibility into their LLM calls across requests, users, errors, cache and feedback.
⭐ Building Reliable LLM Apps: 5 Things To Know In this blog post, we explore a roadmap for building reliable large language model applications. Let’s get started!
⭐ Semantic Cache for Large Language Models Learn how semantic caching for large language models reduces cost, improves latency, and stabilizes high-volume AI applications by reusing responses based on intent, not just text.
Dive into what is LLMOps Rohit from Portkey is joined by Weaviate's Research Scientist Connor for a deep dive into the differences between MLOps and LLMOps, building RAG systems, and what lies ahead for building production-grade LLM-based apps. This and much more in this podcast! Rohit Agarwal on Portkey -