Simplifying LLM batch inference
LLM batch inference promises lower costs and fewer rate limits, but providers make it complex. See how Portkey simplifies batching with a unified API, direct outputs, and transparent pricing.
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance - Summary
The paper examines the cost of querying large language models (LLMs) and proposes FrugalGPT, a framework that orchestrates LLM APIs to answer natural language queries within a budget constraint. The framework combines three strategies to reduce inference cost: prompt adaptation, LLM approximation, and LLM cascade.
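To make the cascade strategy concrete, here is a minimal sketch of the idea: query models in order of increasing cost and accept the first answer a scorer judges reliable enough, escalating otherwise. The `Model` class, the `call_model` and `score_answer` callables, and the fixed threshold are illustrative assumptions for this sketch; in the paper, the scoring function and acceptance thresholds are learned from data rather than hand-set.

```python
# Sketch of an LLM cascade: try models cheapest-first and stop at the
# first answer the scorer accepts. All names and costs are hypothetical.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Model:
    name: str
    cost_per_query: float  # illustrative per-query cost, in dollars

def cascade(
    query: str,
    models: list[Model],                        # ordered cheapest -> most expensive
    call_model: Callable[[str, str], str],      # (model_name, query) -> answer
    score_answer: Callable[[str, str], float],  # (query, answer) -> reliability in [0, 1]
    threshold: float = 0.9,                     # acceptance cutoff; tuned per budget
) -> tuple[str, float]:
    """Return the first acceptable answer and the total cost spent."""
    spent = 0.0
    answer = ""
    for model in models:
        answer = call_model(model.name, query)
        spent += model.cost_per_query
        # Accept the cheap model's answer if the scorer deems it reliable;
        # otherwise escalate to the next, more expensive model.
        if score_answer(query, answer) >= threshold:
            return answer, spent
    # No answer passed the cutoff: fall back to the strongest model's output.
    return answer, spent
```

Because most queries are answered by the cheaper models and only hard ones reach the expensive ones, the average cost per query drops while the scorer keeps quality within the budget constraint.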