Vercel's May 2026 AI gateway data: Anthropic holds 65% of spend as DeepSeek surges to 17% of token volume
Vercel published its AI gateway production index for May 2026, drawing on routing traffic across its infrastructure. The data documents a widening divergence in the AI provider market: DeepSeek's share of token volume jumped from under 1% to 17% in a single month, while Anthropic's share of total spend grew from 61% to 65%, with Anthropic holding 70–80% of spend across every high-stakes use case category.
What's new
The index covers Vercel's AI gateway — a managed proxy layer routing requests across model providers for thousands of production applications. Key May 2026 figures:
- Total tokens: +20% month-over-month
- Total spend: +43% month-over-month (average cost per token up approximately 20% vs. April)
- DeepSeek: jumped from under 1% to 17% of token volume; share of spend stayed near 1%
- Anthropic: 65% of spend; "holding 70–80% of spend across every high-stakes use case"
- Coding agents specifically: DeepSeek = 49% of token volume / 4% of costs; Anthropic = 28% of tokens / 70% of costs
- At 1M+ request scale: most mature production applications route across 11 or more models
DeepSeek V4 Flash is priced at $0.14 per million input tokens and $0.28 per million output tokens, which the report describes as "roughly 20–50x lower than comparable Anthropic models." That price gap explains how DeepSeek can take 17% of token volume while its share of spend stays near 1%.
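The figures above can be cross-checked with a little arithmetic. The sketch below uses only numbers quoted from the index; the variable and function names are illustrative, and the "effective price" is simply spend share divided by token share, which is an assumption about how to compare the two providers rather than a metric the report defines.

```python
# Figures quoted from the May 2026 index; names are illustrative only.
token_growth = 0.20   # total tokens, month-over-month
spend_growth = 0.43   # total spend, month-over-month

# Spend = tokens x average price, so the change in average price
# is the ratio of the two growth factors.
cost_per_token_change = (1 + spend_growth) / (1 + token_growth) - 1


def relative_price(spend_share: float, token_share: float) -> float:
    """Average price per token relative to the category-wide average."""
    return spend_share / token_share


# Coding-agent category: share of costs vs. share of token volume.
deepseek = relative_price(spend_share=0.04, token_share=0.49)
anthropic = relative_price(spend_share=0.70, token_share=0.28)
price_ratio = anthropic / deepseek

print(f"average cost per token: {cost_per_token_change:+.1%}")
print(f"Anthropic vs. DeepSeek effective price (coding agents): {price_ratio:.1f}x")
```

Under these assumptions the implied rise in average cost per token is about 19%, in line with the "approximately 20%" figure, and the coding-agent price ratio comes out near 30x, inside the report's "roughly 20–50x" range.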
Context
The split between token volume and dollar spend reflects a real bifurcation in how production AI applications are assembled at scale. Cost-sensitive workloads — high-volume, lower-stakes tasks like code completion, summarization, and classification — route toward the cheapest capable model. High-trust workloads, where output quality directly affects user or customer outcomes, continue to favor Anthropic's Claude models despite the price differential.
Vercel's routing layer is well positioned to produce this kind of data: it aggregates real production traffic across providers and use cases rather than relying on surveys or vendor self-reporting, although it reflects only applications that route through Vercel. The 11+ model routing pattern at 1M+ requests per month is a concrete indicator that model diversity is a production standard for mature AI applications, not a theoretical best practice.
The jump in DeepSeek volume tracks the launch of DeepSeek V4 Flash and V4 Pro, which brought significantly lower pricing and improved performance metrics to the DeepSeek API.
Why it matters
For the frontier labs, Vercel's data is a double-edged signal. Anthropic's spend dominance (65% overall, 70–80% in high-stakes categories) reflects genuine developer trust in Claude for outputs where quality is the primary constraint. That is a durable competitive position as long as the trust holds.
But DeepSeek's token volume surge from near zero to 17% in a single month demonstrates how quickly cost-optimized alternatives can capture share in volume-sensitive workloads. For developers managing costs at scale, routing high-volume, low-stakes inference to DeepSeek while reserving Anthropic for precision tasks is not a workaround; it is becoming standard practice.
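A minimal sketch of that routing pattern is shown below. It is not Vercel's gateway API: the model identifiers, the task labels, and the `customer_facing` flag are all hypothetical names chosen for illustration, and a production router would likely weigh more signals than these two.

```python
from dataclasses import dataclass

# Hypothetical model identifiers, used only for illustration.
LOW_COST_MODEL = "deepseek-v4-flash"
HIGH_TRUST_MODEL = "claude"

# Task labels borrowed from the article's examples of cost-sensitive work.
LOW_STAKES_TASKS = {"code_completion", "summarization", "classification"}


@dataclass(frozen=True)
class Request:
    task: str
    customer_facing: bool = False  # assumed proxy for "high-stakes"


def choose_model(req: Request) -> str:
    """Route low-stakes, internal work to the cheap model; everything else to the high-trust one."""
    if req.task in LOW_STAKES_TASKS and not req.customer_facing:
        return LOW_COST_MODEL
    return HIGH_TRUST_MODEL


if __name__ == "__main__":
    print(choose_model(Request("classification")))
    print(choose_model(Request("summarization", customer_facing=True)))
    print(choose_model(Request("contract_review")))
```

The rule defaults to the high-trust model for any task it does not recognize, so an unlabeled workload costs more rather than risking lower-quality output.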
The spend divergence also has implications for how AI revenue concentrates across providers. Token volume alone is no longer a reliable proxy for revenue share. Anthropic grew its share of spend in the same month that DeepSeek took 17% of token volume, which is a stronger commercial position than token counts alone suggest; DeepSeek's volume advantage does not translate to proportionate revenue at current pricing levels.
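To see why token share and revenue share come apart, consider a deliberately simplified two-provider market. The sketch below assumes that every token not served by the low-cost provider is billed at a single higher price, and it picks a 30x price ratio from within the report's quoted 20–50x range; both are assumptions for illustration, not figures from the index.

```python
def spend_share(token_share: float, price_ratio: float) -> float:
    """Spend share of the low-cost provider in a two-provider market.

    token_share: the low-cost provider's share of tokens (0..1).
    price_ratio: how many times more the other provider charges per token.
    """
    cheap_spend = token_share * 1.0                  # low-cost price normalized to 1
    other_spend = (1.0 - token_share) * price_ratio  # remaining tokens at the higher price
    return cheap_spend / (cheap_spend + other_spend)


# 17% of tokens at an assumed 30x price gap.
share = spend_share(token_share=0.17, price_ratio=30.0)
print(f"low-cost provider's share of spend: {share:.1%}")
```

With these inputs the low-cost provider ends up with well under 1% of spend despite serving 17% of tokens, which is the same order of magnitude as the "near 1%" figure in the index.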
Vercel's monthly index is one of the cleaner public data series on AI gateway market dynamics. Each monthly update shows how cost-versus-quality segmentation is shifting as the model market evolves.
Corroborating sources
- Vercel
https://vercel.com/blog/ai-gateway-production-index-june-2026
“Anthropic's share of spend grew from 61% to 65% in May, holding 70–80% of spend across every high-stakes use case.”