1 Trillion Tokens and the Death of the Chatbot
I recently sat down with Siddharth Ahluwalia on the Neon Show and said a few things I probably shouldn't have.
Most SaaS companies as we know them are on their way to becoming glorified databases.
I meant it. Let me explain why.
btw, here's the podcast: https://www.youtube.com/watch?v=lSgxAKaeREw
We're Routing 1 Trillion Tokens of Production Data Every Day. Here's What It's Telling Us.
A year ago, Portkey was processing about 1 billion tokens a day. Today, we're at 1 trillion tokens flowing through our systems every day. Traffic is growing 30-40% every single week.
These aren't demo tokens. These aren't someone's weekend hackathon. This is production traffic: real enterprises running real workloads against real customers.
When you sit at the gateway layer and watch this kind of volume move, you start seeing patterns that nobody else can see. Not the model labs, not the cloud providers, not the VCs making predictions on Twitter. The infrastructure layer has the best seat in the house.
Here are five shifts we're watching play out right now, and none of them are what most people expect.
1. The Chatbot Is Dead. Long Live the Harness.
The era of the chatbot, the simple Q&A box where a user asks something and a model spits out an answer, is over. It was a useful starting point. It's now a dead end.
What's replacing it is what we call the "AI Harness." The difference is fundamental. A chatbot talks. A harness executes.
The simplest version of a harness is: make an LLM call, make a tool call, loop again. That's it. But inside that loop is where all the value lives. The LLM becomes the brain inside a nervous system of tools: updating databases, triggering workflows, sending emails, making decisions.
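That loop is simple enough to sketch in a few lines. The `call_llm` stub and `TOOLS` table below are stand-ins, not any specific vendor's API; a real harness would call a model endpoint and real business systems where the placeholders sit.

```python
# Minimal sketch of a harness loop: LLM call, tool call, loop again.
# call_llm is a placeholder that mimics a model first requesting a tool,
# then answering once it has seen a tool result.

def call_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_order", "args": {"order_id": "A-123"}}
    return {"answer": "Order A-123 ships tomorrow."}

TOOLS = {
    "lookup_order": lambda order_id: f"Order {order_id}: status=shipping",
}

def harness(user_message, max_steps=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        if "answer" in reply:
            # The model is done executing; hand the result back to the user.
            return reply["answer"]
        # Execute the requested tool and feed the result back into the loop.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    return "Step limit reached."

print(harness("Where is my order?"))  # Order A-123 ships tomorrow.
```

Everything interesting happens inside that `for` loop: the tool results are where the database updates, workflow triggers, and emails live.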
If you're still building chatbots in 2026, you're building a typewriter in the age of the word processor. The industry has moved on. The question is whether you've moved with it.
2. AI Spend Will Flip the Cloud
Here's a prediction that sounds crazy until you do the math: AI operational spend is going to surpass traditional cloud spend.
AWS is a $100+ billion revenue business. It is the foundation of the modern internet. But the hierarchy of value is being reordered. Cloud vendors are increasingly becoming "compute providers": commodity infrastructure. The model labs, such as OpenAI, Anthropic, and Google, are becoming the "value providers." They're the ones capturing the intelligence premium.
There's a real possibility that these labs emerge as $200 billion revenue companies in their own right.
Think about it this way. If AI agents end up doing the majority of business work, then the operational management of those agents (observability, routing, security, cost controls) naturally becomes a bigger market than traditional infrastructure management. AI Ops becomes the new DevOps. Except the surface area is orders of magnitude larger.
3. The "AI Employee" Isn't a Buzzword Anymore
I shared a story on the podcast that I think captures where we are better than any chart could.
A clothing company in Jaipur worked with a tool called SagePilot to deploy an AI-powered customer support system. They went from a team of 16 people down to 3. That's not a 10% efficiency gain. That's 80% automation of knowledge work, in production, today.
But here's the part that stayed with me. The AI persona, which they called "Ria," was so good, so context-aware, so human in her interactions, that customers started walking into the physical store asking to speak with her. They thought she was a real person. They wanted to meet her.
We need to sit with that for a moment.
The reason this works is that AI memory now surpasses human capability in specific domains. An AI system can store preferences, conversation history, and nuances across thousands of interactions. No human support agent can do that. When you combine an LLM with a structured memory store and specific business rules, you get a System of Execution that handles the heavy lifting and leaves the final 20% to human supervisors.
This is not the future. This is last quarter.
4. The Penny-Pinching Era Is Over
This one surprises people the most.
At Portkey, we have what we internally call the "Model Ticker": it tracks token volume across models like a stock market ticker. You'd expect the cheapest models to dominate. They don't.
The highest growth in token volume right now is coming from the most expensive models. Opus 4.6. Opus 4.5. The premium tier.
The reason is brutally simple: the cost of an intelligent mistake is far higher than the cost of a premium token.
Enterprises have done the math. If you can augment an employee with a high-end AI assistant for $500 a month, roughly 10% of their salary, and it results in a 2x or 3x productivity gain, that's not an expense. That's the best investment on your balance sheet.
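The arithmetic behind that claim is easy to run. The numbers below are the rough figures from the paragraph, and valuing the extra output at the employee's salary is a deliberate simplification:

```python
# Back-of-envelope ROI on a $500/month AI assistant.
assistant_cost = 500                 # $/month for a high-end assistant
salary = assistant_cost / 0.10       # "roughly 10% of salary" -> $5,000/month

for gain in (2, 3):                  # 2x or 3x productivity
    extra_output = salary * (gain - 1)    # added output, valued at salary
    roi = extra_output / assistant_cost   # return per dollar of assistant spend
    print(f"{gain}x productivity: ${extra_output:,.0f}/month extra, {roi:.0f}x ROI")
```

At a 2x gain that's a 10x return on the assistant spend; at 3x it's 20x. Even if the productivity estimate is off by half, the trade still clears easily.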
The market has spoken. Intelligence is worth the premium. The race to the bottom on cost-per-token was always a sideshow. The real race is to the top on reliability and reasoning quality.
5. The Gateway Is the New Standard Architecture
Every enterprise we talk to is dealing with two emotions: fear and anxiety.
Fear: "What's happening inside the black box? Why did the model hallucinate? Which team is spending what?"
Anxiety: "We're locked into one vendor. What if a better model comes out tomorrow? What if our provider has an outage?"
The AI Gateway has emerged as the answer to both. It's a centralized control plane that sits between your applications and your models, and it gives you three things you can't get any other way:
Observability. You can see exactly where value is being generated and where it's being wasted. Which teams are getting ROI. Which prompts are failing. Where the money is going.
Security. Firewalls, governance, and compliance at the gateway level. You can shut down bot attacks or data leaks in real time, from a single pane.
AI Neutrality. This is the big one. A gateway lets you switch between Bedrock, Vertex, OpenAI, and Anthropic with a single toggle. No re-architecture. No vendor lock-in. When a better model shows up on the Model Ticker, you just re-route traffic.
This neutrality isn't a nice-to-have anymore. It's the enterprise moat.
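What that "single toggle" looks like varies by gateway, and the config schema below is a hypothetical sketch, not Portkey's actual API. The provider names are real; everything else is illustrative. The point is that neutrality reduces to editing a routing table that callers never see:

```python
# Hypothetical gateway routing table -- illustrative, not a real config schema.
ROUTES = {
    "default":  {"provider": "anthropic", "model": "claude-opus"},
    "fallback": {"provider": "bedrock",   "model": "claude-opus"},
}

def pick_route(provider_healthy):
    # Neutrality in one function: when the primary has an outage, or a
    # better model shows up on the ticker, you edit ROUTES -- application
    # code calling the gateway never changes.
    return ROUTES["default"] if provider_healthy else ROUTES["fallback"]

print(pick_route(True)["provider"])   # anthropic
print(pick_route(False)["provider"])  # bedrock
```

Swapping Vertex or OpenAI in is a one-line change to the table, which is the whole argument for putting the gateway between your apps and your models.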
Where This All Leads
Here's the uncomfortable truth that ties all five shifts together.
Traditional SaaS companies, the ones we've built and funded and celebrated for twenty years, are at risk of becoming mere Systems of Record. Sophisticated databases. Important, yes. But not where the value lives anymore.
The value is migrating to Systems of Execution. The agentic layers that actually do the work. The harnesses, the AI employees, the autonomous workflows that run on premium models through governed gateways.
Every enterprise leader needs to ask themselves one question: Is your AI strategy built for simple Q&A, or is it built for a world where AI drives 80% of your production workloads?
Because the transition from chatbot to autonomous agent isn't a technical upgrade.
It's the new baseline for survival.
I distilled my thoughts from the conversation with Siddharth here. To watch the full podcast, head to: https://www.youtube.com/watch?v=lSgxAKaeREw