We hosted a watch party for OpenAI's DevDay on our Discord channel and had a lot of fun discussing everything new and improved that was launched. If you're just catching up, read about all the updates here on the OpenAI website. Since we're all about LLM Apps in Production, let's
💡 This is Portkey's first collaboration with the Hasura Team. Hasura helps you build robust RAG data pipelines by unifying multiple private data sources (relational DB, vector DB, etc.) and letting you query the data securely with production-grade controls. LLMs have been around for some time now and have shown that
Over the past few months, we've been keenly observing latencies for both GPT-3.5 and GPT-4, and the emerging patterns have been intriguing. The standout observation? GPT-4 is catching up in speed, closing the latency gap with GPT-3.5. Our findings reveal a consistent decline in GPT-4 latency. While your
This paper presents a method for compressing prompts in large language models (LLMs) to accelerate model inference and reduce cost. The method involves a budget controller, a token-level iterative compression algorithm, and an instruction-tuning-based method for distribution alignment. Experimental
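To make the budget-controller idea concrete, here is a toy sketch: keep dropping the lowest-information tokens until the prompt fits a token budget. The stopword heuristic and the `compress` function are illustrative assumptions, not the paper's algorithm, which scores tokens with a small language model and aligns distributions via instruction tuning.

```python
# Toy sketch of budget-controlled prompt compression. Assumption:
# stopwords carry the least information, so they are dropped first;
# the real method uses a small LM to score token importance.
STOPWORDS = {"the", "a", "an", "of", "to", "is", "and", "in", "that", "for"}

def compress(prompt: str, budget: int) -> str:
    tokens = prompt.split()
    # First pass: drop low-information tokens (stopwords here).
    kept = [t for t in tokens if t.lower() not in STOPWORDS]
    # If still over budget, hard-truncate to the budget.
    if len(kept) > budget:
        kept = kept[:budget]
    return " ".join(kept)

short = compress("Summarize the history of the internet in a paragraph for me", 6)
```

The compressed prompt stays under the budget while keeping the content-bearing words, which is the trade-off the budget controller manages.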
It's been some time since Llama 2's celebrated launch, and we've seen the dust settle a bit and real use cases come to life. In this blog post, we answer frequently asked questions about Llama 2's capabilities and when you should be using it. Let's dive in! What is Llama
As developers and founders, you might find yourself asking how your startup differentiates from ChatGPT. More importantly, how do you convince a customer to try your product over a generic chat client?
Portkey is building a full-stack LLMOps platform that empowers AI builders to productionize their Gen AI apps reliably and securely.
This paper introduces the Skeleton-of-Thought (SoT) method to decrease the generation latency of large language models (LLMs). SoT guides LLMs to first generate the skeleton of the answer and then conducts parallel API calls or batched decoding to complete the contents of each skeleton point. The m
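The two-stage flow SoT describes can be sketched in a few lines of Python. The `llm` function below is a stand-in stub for any chat-completion call (an assumption so the sketch runs without an API key); the point is the structure: one sequential call produces the skeleton, then each point is expanded by parallel calls.

```python
from concurrent.futures import ThreadPoolExecutor

def llm(prompt: str) -> str:
    # Stand-in for a real chat-completion call; returns canned text
    # so the sketch is runnable without an API key.
    if "skeleton" in prompt:
        return "1. Define terms\n2. Compare options\n3. Conclude"
    return f"Expanded: {prompt.splitlines()[-1]}"

def skeleton_of_thought(question: str) -> str:
    # Stage 1: one sequential call produces a terse outline (the skeleton).
    skeleton = llm(f"Give a short numbered skeleton for: {question}")
    points = [p for p in skeleton.splitlines() if p.strip()]

    # Stage 2: each skeleton point is expanded independently, so the
    # calls can run in parallel instead of one long sequential decode.
    with ThreadPoolExecutor(max_workers=len(points)) as pool:
        expansions = list(pool.map(
            lambda p: llm(f"Expand this point of the answer:\n{p}"), points
        ))
    return "\n\n".join(expansions)

answer = skeleton_of_thought("Is SQL or NoSQL better for analytics?")
```

Because the per-point expansions no longer wait on one another, end-to-end latency is bounded by the longest single expansion rather than the sum of all of them.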
As I reflect on the past year of my coding journey, I am struck by how deeply Artificial Intelligence (AI) has woven itself into my development practices. From GitHub Copilot to the genius behind ChatGPT, my approach to writing code has undergone a transformative shift. Allow me to
Portkey's Analytics 2.0 gives our users complete visibility into their LLM calls across requests, users, errors, cache, and feedback.