⭐️ Semantic Cache for Large Language Models
Learn how semantic caching for large language models reduces cost, improves latency, and stabilizes high-volume AI applications by reusing responses based on intent rather than exact text.
⭐️ Decoding OpenAI Evals
Learn how to use OpenAI's Evals framework to evaluate models and prompts, and optimise LLM systems for the best outputs.
⭐️ Ranking LLMs with Elo Ratings
Choosing an LLM from the 50+ models available today is hard. We explore Elo ratings as a method to objectively rank models and pick the best performers for our use case.