Tracking LLM token usage across providers, teams, and workloads Learn how organizations track, attribute, and control LLM token usage across teams, workloads, and providers, and why visibility is key to governance and efficiency.
AI cost observability: A practical guide to understanding and managing LLM spend A clear, actionable guide to AI cost observability—what it is, where costs leak, the metrics that matter, and how teams can manage LLM spend with visibility, governance, and FinOps discipline.
Simplifying LLM batch inference LLM batch inference promises lower costs and fewer rate limits, but providers make it complex. See how Portkey simplifies batching with a unified API, direct outputs, and transparent pricing.
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance - Summary The paper examines the cost of querying large language models (LLMs) and proposes FrugalGPT, a framework that combines LLM APIs to answer natural language queries within a budget constraint. The framework uses prompt adaptation, LLM approximation, and LLM cascade to reduce inference cost while matching or improving performance.
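The LLM cascade idea above can be sketched in a few lines: try cheaper models first and escalate only when a scoring function judges the answer unreliable. This is a minimal illustration, not the paper's implementation; the model names, costs, and the toy scorer below are all hypothetical stand-ins.

```python
# Minimal sketch of an LLM cascade in the spirit of FrugalGPT.
# Model names and per-query costs are illustrative, not real pricing.

# (model_name, cost_per_query), ordered cheapest first
CASCADE = [("small-model", 0.001), ("mid-model", 0.01), ("large-model", 0.1)]

def cascade_query(prompt, call_model, score, threshold=0.8):
    """Return (answer, model_used, total_cost).

    Queries models cheapest-first and stops as soon as the scorer's
    confidence in the answer meets the threshold; otherwise falls
    back to the last (largest) model's answer.
    """
    total_cost = 0.0
    for name, cost in CASCADE:
        answer = call_model(name, prompt)
        total_cost += cost
        if score(prompt, answer) >= threshold:
            return answer, name, total_cost
    return answer, name, total_cost

# Toy stand-ins so the sketch runs without any provider SDK:
def fake_call(name, prompt):
    return f"{name} answer"

def fake_score(prompt, answer):
    # Pretend only the mid-size model produces a confident answer.
    return 0.9 if answer.startswith("mid") else 0.5

answer, used, spent = cascade_query("What is 2+2?", fake_call, fake_score)
# Escalates past "small-model" and stops at "mid-model",
# spending 0.001 + 0.01 instead of querying the largest model.
```

In practice the scorer is itself a small learned model (the paper trains one), and the cascade order and thresholds are tuned per workload against the budget.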