Everything We Know About Claude Code Limits

Last updated - March 2026

TL;DR

  • For Claude.ai Users:
    The core limit is a 5-hour rolling session that begins with your first prompt. Weekly quotas (introduced August 28, 2025) now apply to heavy users on Claude Pro and Claude Max plans. All Claude.ai plans share a common usage bucket across the Claude app and Claude Code; Max plans multiply the allowance accordingly. Max subscribers can also purchase additional usage at standard API rates once they hit limits.
  • For API Users:
    API usage is separate and billed pay-as-you-go per token. The latest models, Opus 4.6 and Sonnet 4.6, are priced at $5/$25 and $3/$15 per million tokens respectively. You get the same rates whether you use Anthropic, AWS Bedrock, Google Vertex AI, or Microsoft Foundry.
    You can choose whichever region, VPC, or IAM model best fits your stack.
  • For Portkey Users:
    With Portkey, you can route Claude Code through any supported provider (Anthropic, Bedrock, Vertex, Foundry, and others) from a single endpoint and draw on capacity from several providers at once. Portkey also lets you centralize cost controls, add security guardrails, configure automatic failover, and capture audit logs automatically.

Understanding the 5‑Hour session model

The moment you send your first prompt in Claude Code, a 5-hour rolling window begins. All messages and token spend during that period draw from your plan's pool; a new window starts only when you send the next message after the 5 hours lapse. Pro users typically get 10-40 prompts per window, while Max 20× users can push 200-800 prompts depending on code size and model choice. (Source: Anthropic Help Center)

💡 Tips

  1. Time your first prompt so the window spans your peak coding block. For example, if your morning window opened at 08:00, fire a dummy ping at 13:01 to start a fresh window for the afternoon.
  2. Batch edits: one long diff request burns fewer tokens than several “please refine” follow‑ups.
  3. Use /compact in Claude Code to manage context costs and /model to switch models based on task complexity.
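
The window arithmetic behind tip 1 can be sketched in a few lines of Python. This is a rough illustration, not Anthropic's implementation: it assumes a window lasts exactly 5 hours from the prompt that opens it and that the next window opens with the first prompt sent after expiry, as described above. The function names `window_end` and `window_starts` are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # session length stated in this article


def window_end(first_prompt: datetime) -> datetime:
    """Return the moment the 5-hour window opened by `first_prompt` lapses."""
    return first_prompt + WINDOW


def window_starts(prompts: list[datetime]) -> list[datetime]:
    """Return the prompts that open a new window.

    A prompt opens a new window when no window is active, i.e. when it
    arrives at or after the end of the previous window.
    """
    starts: list[datetime] = []
    for ts in sorted(prompts):
        if not starts or ts >= window_end(starts[-1]):
            starts.append(ts)
    return starts


if __name__ == "__main__":
    day = datetime(2026, 3, 2)
    prompts = [
        day.replace(hour=8, minute=0),    # opens window 1
        day.replace(hour=11, minute=30),  # falls inside window 1
        day.replace(hour=13, minute=1),   # opens window 2
        day.replace(hour=17, minute=45),  # falls inside window 2
    ]
    for start in window_starts(prompts):
        print(f"{start:%H:%M} -> {window_end(start):%H:%M}")
    # prints:
    # 08:00 -> 13:00
    # 13:01 -> 18:01
```

With an 08:00 start, the 13:01 ping is the earliest prompt that lands outside the first window, which is why it opens a fresh one for the afternoon.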

Weekly Limits (Live Since August 28, 2025)

In addition to the 5-hour rolling windows, Anthropic now enforces weekly rate limits that reset every seven days. These include an overall usage cap plus a separate weekly cap for Opus models. Anthropic says the limits affect fewer than 5% of subscribers based on usage patterns.

Here's what to expect per week:

| Plan | Weekly Sonnet hours | Weekly Opus hours |
|---|---|---|
| Pro ($20/mo) | 40-80 | -- |
| Max 5x ($100/mo) | 140-280 | 15-35 |
| Max 20x ($200/mo) | 240-480 | 24-40 |

Usage varies based on codebase size, model complexity, and session behavior. Max subscribers can purchase additional usage beyond the weekly limit at standard API rates.

Plan pricing & capacities

| Plan | Monthly fee | Relative capacity | Typical Claude Code prompts / 5 h | Model access |
|---|---|---|---|---|
| Free | $0 | ~1/10 Pro | 2-5 (Sonnet 4.6 only) | Sonnet 4.6 |
| Pro | $20 | baseline | 10-40 | Sonnet 4.6 |
| Max 5x | $100 | 5x Pro | 50-200 | Sonnet 4.6 + Opus 4.6 |
| Max 20x | $200 | 20x Pro | 200-800 | Sonnet 4.6 + Opus 4.6 |
| Team (Standard) | $25/user | -- | -- | Sonnet 4.6 |
| Team (Premium) | $150/user | -- | Includes Claude Code | Sonnet 4.6 + Opus 4.6 |

Pro plans are also available at $17/month when billed annually. Team plans require a minimum of 5 users.

Token-Based API Pricing (Console, Bedrock, Vertex, Foundry)

| Model | Anthropic API | AWS Bedrock | Google Vertex AI | Microsoft Foundry |
|---|---|---|---|---|
| Opus 4.6 | $5 in / $25 out per M tokens | same list price | same list price | same list price |
| Sonnet 4.6 | $3 in / $15 out | same | same | same |

Previous generation (still available)

| Model | Anthropic API | AWS Bedrock | Google Vertex AI | Microsoft Foundry |
|---|---|---|---|---|
| Opus 4.5 | $5 in / $25 out | same | same | same |
| Sonnet 4.5 | $3 in / $15 out | same | same | same |
| Haiku 4.5 | $1 in / $5 out | same | same | same |

Legacy models

| Model | Pricing |
|---|---|
| Opus 4 / 4.1 | $15 in / $75 out per M tokens |
| Sonnet 4 / 3.7 | $3 in / $15 out |

Key pricing notes:

  • Long context premium: When using the 1M token context window (beta) and exceeding 200K input tokens, rates increase to $10/$37.50 (Opus 4.6) and $6/$22.50 (Sonnet 4.6) per M tokens.
  • Fast mode (Opus 4.6): 6x standard rates for significantly faster output. Currently in research preview.
  • Prompt caching: Cache reads cost just 10% of base input price. Cache writes are 1.25x (5-min TTL) or 2x (1-hour TTL).
  • Batch API: 50% discount on all models for non-urgent workloads.
  • US-only inference: 1.1x multiplier on all token costs for Opus 4.6 and newer models.
  • Bedrock, Vertex, and Foundry bill in your account's currency and add standard cloud charges (data egress, networking, etc.). Anthropic confirms pricing parity across channels. Bedrock and Vertex offer global and regional endpoints, with regional endpoints including a 10% premium.
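
To see how these rates and multipliers combine on a real bill, here is a minimal Python estimator. It is a sketch under stated assumptions: the rates and multipliers are copied from the tables and notes above, the `Usage` class and `estimate_cost` function are hypothetical names, and the batch discount is assumed to apply to the whole request, including cached tokens. Long-context, fast-mode, and regional surcharges are left out to keep it short.

```python
from dataclasses import dataclass

# USD per million tokens (input, output), copied from the tables in this article.
RATES = {
    "opus-4.6": (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
    "haiku-4.5": (1.00, 5.00),
}

CACHE_READ_MULT = 0.10      # cache reads cost 10% of the base input price
CACHE_WRITE_5M_MULT = 1.25  # cache writes with the 5-minute TTL
BATCH_MULT = 0.50           # Batch API: 50% discount


@dataclass
class Usage:
    input_tokens: int = 0        # uncached input tokens
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0  # written with the 5-minute TTL


def estimate_cost(model: str, usage: Usage, batch: bool = False) -> float:
    """Estimate the USD cost of the given token usage on `model`."""
    rate_in, rate_out = RATES[model]
    per_token_in = rate_in / 1_000_000
    per_token_out = rate_out / 1_000_000
    cost = (
        usage.input_tokens * per_token_in
        + usage.output_tokens * per_token_out
        + usage.cache_read_tokens * per_token_in * CACHE_READ_MULT
        + usage.cache_write_tokens * per_token_in * CACHE_WRITE_5M_MULT
    )
    # Assumption: the batch discount applies to the whole bill.
    return cost * BATCH_MULT if batch else cost


if __name__ == "__main__":
    uncached = Usage(input_tokens=200_000, output_tokens=50_000)
    cached = Usage(input_tokens=20_000, output_tokens=50_000,
                   cache_read_tokens=180_000)
    print(f"${estimate_cost('sonnet-4.6', uncached):.2f}")  # prints $1.35
    print(f"${estimate_cost('sonnet-4.6', cached):.2f}")    # prints $0.86
```

In this example, serving 180K of the 200K input tokens from cache cuts the Sonnet 4.6 bill from about $1.35 to about $0.86, which is why long Claude Code sessions benefit so much from prompt caching.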

Which endpoint should you use?

| Need | Best option | Why |
|---|---|---|
| Lowest latency to US-East | AWS Bedrock (us-east-1) | Dedicated Anthropic capacity inside AWS infra |
| Private networking / VPC-SC | Vertex AI Private Service Connect | Keeps traffic off the public internet |
| Azure-native billing and governance | Microsoft Foundry | MACC-eligible, Entra ID auth, Azure Monitor integration |
| Global multi-region failover | Portkey Gateway | Routes across Anthropic, Bedrock, Vertex, Foundry, and 250+ LLMs from a single endpoint |
| Advanced cost controls | Portkey budgets + prompt caching | Enforce spend limits, auto-downgrade models, per-team attribution |

Portkey: One Switch for Anthropic, Bedrock, or Vertex

Direct Claude Code is great for solo hacking, but enterprises need:

  1. Unified gateway routing – point Claude Code to Portkey once; switch between Anthropic, Bedrock, Vertex, Foundry, or any other provider by changing a config value.
  2. Live spend & usage dashboards – 40‑plus metrics, cost attribution by team, and budget alerts.
  3. RBAC + guardrails – per‑role rate limits, banned‑pattern filters, and partner guardrails from Palo Alto, Prompt Security, Pangea, etc.
  4. Audit‑ready logs – every prompt, response, and error logged with metadata for compliance.

Bottom line: Portkey turns Claude Code into an enterprise‑grade service without forcing you to rebuild scripts or choose a single cloud. Drop your API key(s) into Portkey, set a config, and your devs keep typing claude …—now with observability, governance, and cloud‑agnostic resilience baked in.
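
As a rough illustration of what "changing a config value" can look like, the Python sketch below builds a fallback routing config and prints it as JSON. Treat it as an assumption-laden example: the field names (`strategy`, `mode`, `targets`, `virtual_key`) are assumed to match Portkey's config schema and should be checked against the current Portkey docs, and the helper `build_fallback_config` and the key names `anthropic-prod`, `bedrock-prod`, and `vertex-prod` are hypothetical placeholders.

```python
import json


def build_fallback_config(provider_keys: list[str]) -> dict:
    """Build a gateway config that tries each provider in the order given.

    Field names are assumed to match Portkey's config schema; verify them
    against the current documentation before using this in production.
    """
    if not provider_keys:
        raise ValueError("at least one provider key is required")
    return {
        "strategy": {"mode": "fallback"},
        "targets": [{"virtual_key": key} for key in provider_keys],
    }


if __name__ == "__main__":
    # Hypothetical key names: Anthropic first, then Bedrock, then Vertex.
    config = build_fallback_config(["anthropic-prod", "bedrock-prod", "vertex-prod"])
    print(json.dumps(config, indent=2))
```

Reordering the list passed to `build_fallback_config` changes which provider is tried first, without touching anything on the developer's machine.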

If you're looking to implement Claude Code for your organisation with governance at scale, book a demo with us for a detailed walkthrough, or try it yourself!

💡 Keep Watching
Anthropic may refine quotas again now that weekly limits are live, and Bedrock/Vertex regions keep expanding. Bookmark this post; we'll update it as soon as the numbers move.