⭐️ Analyze your LLM calls - 2.0 Portkey's Analytics 2.0 gives our users complete visibility into their LLM calls across requests, users, errors, cache, and feedback.
⭐️ Building Reliable LLM Apps: 5 Things To Know In this blog post, we explore a roadmap for building reliable large language model applications. Let’s get started!
⭐️ Decoding OpenAI Evals Learn how to use OpenAI's Evals framework to evaluate models and prompts, and to optimise your LLM systems for the best outputs.
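To make the idea concrete, here is a minimal, self-contained sketch of what an eval does at its core: run a model over fixed test cases and score the answers. The `complete` callable is a hypothetical stand-in for your LLM call; the real Evals framework adds registries, datasets, and reporting on top of this basic loop.

```python
def run_exact_match_eval(complete, cases):
    """Score a model on (prompt, expected_answer) pairs by exact match.

    `complete` is a hypothetical function: prompt string -> answer string.
    Returns accuracy in [0, 1].
    """
    correct = 0
    for prompt, expected in cases:
        answer = complete(prompt).strip()
        correct += int(answer == expected)
    return correct / len(cases)

# Usage sketch:
# accuracy = run_exact_match_eval(my_model, [("2+2=", "4"), ("3*3=", "9")])
```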
⭐️ Ranking LLMs with Elo Ratings Choosing an LLM from the 50+ models available today is hard. We explore Elo ratings as a method to objectively rank models and pick the best performers for our use case.
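As a rough illustration of the mechanics (not necessarily the exact constants the post uses), an Elo update from a single pairwise comparison between two models looks like this; K = 32 and a 1000-point starting rating are conventional choices:

```python
K = 32  # update step size; a common default borrowed from chess Elo

def elo_update(r_a: float, r_b: float, score_a: float) -> tuple[float, float]:
    """Update two ratings after one head-to-head comparison.

    score_a is 1.0 if model A's output won, 0.0 if it lost, 0.5 for a tie.
    """
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    r_a_new = r_a + K * (score_a - expected_a)
    r_b_new = r_b + K * ((1 - score_a) - (1 - expected_a))
    return r_a_new, r_b_new

# Start every model at 1000 and fold in pairwise judgments as they arrive:
ratings = {"model-a": 1000.0, "model-b": 1000.0}
ratings["model-a"], ratings["model-b"] = elo_update(
    ratings["model-a"], ratings["model-b"], score_a=1.0
)
```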
Self-Consistency Improves Chain of Thought Reasoning in Language Models - Summary The paper proposes a new decoding strategy called self-consistency to improve the performance of chain-of-thought prompting in language models for complex reasoning tasks. Self-consistency first samples a diverse set of reasoning paths and then selects the most consistent answer by marginalizing out the sampled reasoning paths.
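A minimal sketch of the idea, with majority voting standing in for the marginalization step; `sample_cot` is a hypothetical function that returns one sampled (reasoning, answer) pair per call, using temperature > 0 so the paths differ:

```python
from collections import Counter

def self_consistent_answer(sample_cot, question, n_paths=10):
    """Sample n_paths chain-of-thought completions and return the most
    frequent final answer, a majority-vote approximation of marginalizing
    over reasoning paths."""
    answers = [sample_cot(question)[1] for _ in range(n_paths)]
    return Counter(answers).most_common(1)[0][0]
```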
The Power of Scale for Parameter-Efficient Prompt Tuning - Summary The paper explores prompt tuning, a mechanism for learning soft prompts to condition frozen language models for specific downstream tasks. The approach outperforms GPT-3's few-shot learning and becomes more competitive with scale. Prompt tuning confers benefits in robustness to domain transfer and enables efficient prompt ensembling.
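For intuition, here is a minimal PyTorch sketch of the mechanism, assuming the soft prompt is simply prepended to the input embeddings of a frozen model (class names and hyperparameters are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Trainable prompt embeddings prepended to a frozen model's inputs."""
    def __init__(self, n_tokens: int, d_model: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, d_model)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Only the prompt parameters are trained; the language model stays frozen:
# for p in model.parameters(): p.requires_grad_(False)
# optimizer = torch.optim.Adam(soft_prompt.parameters(), lr=1e-3)
```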
GPT Understands, Too - Summary The paper proposes a novel method called P-tuning, which employs trainable continuous prompt embeddings to improve the performance of GPTs on natural language understanding (NLU) tasks. The method is shown to be better than or comparable to similar-sized BERTs on NLU tasks and substantially improves the previous best on the LAMA knowledge-probing benchmark.
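Where plain prompt tuning learns the prompt vectors directly, P-tuning passes trainable "pseudo token" embeddings through a small encoder. A rough PyTorch sketch under that reading (the paper uses an LSTM-plus-MLP encoder; the sizes here are illustrative):

```python
import torch
import torch.nn as nn

class PromptEncoder(nn.Module):
    """Maps trainable pseudo-token embeddings through an LSTM + MLP to
    produce continuous prompt embeddings; in the paper these are then
    placed at template positions among the real input embeddings."""
    def __init__(self, n_tokens: int, d_model: int, d_hidden: int = 256):
        super().__init__()
        self.pseudo = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)
        self.lstm = nn.LSTM(d_model, d_hidden, bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(2 * d_hidden, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model)
        )

    def forward(self) -> torch.Tensor:
        hidden, _ = self.lstm(self.pseudo.unsqueeze(0))  # (1, n_tokens, 2*d_hidden)
        return self.mlp(hidden).squeeze(0)               # (n_tokens, d_model)
```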