In 2026 you can read two parallel stories on X: “token improvement plan” jokes, and real CFO threads about inference bills that show up in invoices, reimbursements, and API keys that never reconcile to a single dashboard. The defensible public signal is not social chatter alone; it is the transaction- and token-level trends that spend-infrastructure companies publish. Below: Ramp’s 2026 primary sources, why coding agents compound cost, a grounded read on salary-sized bill memes, and ExplainX-style governance and engineering habits.
What Ramp publishes (primary)
1. Thirteenfold growth in average monthly token spend (Jan 2025 → 2026). In The $1 trillion AI spend blind spot (Apr 9, 2026), Ramp states that “Since January 2025, average monthly AI token spend across Ramp customers has increased 13x” and stresses “Not 13%. Thirteen times.” The post argues finance needs dollars and attribution (team, model, use case), not just provider telemetry.
2. Lumpy, heavy tails for top spenders. The same post says “the biggest AI spenders see costs jump 50% or more roughly one in four months.” The tokenmaxxing economy (Apr 15, 2026) repeats the one-in-four-months spike figure and adds an AI Index milestone: more than 50% of businesses on Ramp now pay for AI.
3. Shadow and card spend still matter. The blind-spot article describes SaaS sprawl, reimbursements, and late invoices—the same governance gap behind CFO jokes about the “AI budget” on X.
Caveat: figures are from Ramp’s base; your vendors, plans, and API vs. chat mix will differ. Treat the 13× and 50% spike rates as order-of-magnitude planning signals, not a promise on your next invoice.
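One rough way to turn the 13× headline into a planning number is to back out the constant monthly growth rate it would imply. The sketch below assumes a 15-month window (January 2025 to the April 2026 post) and smooth compounding. Both are simplifying assumptions of this example, not something Ramp states, and the function names are hypothetical.

```python
def implied_monthly_growth(multiple: float, months: int) -> float:
    """Constant month-over-month rate that compounds to `multiple` over `months`."""
    if multiple <= 0 or months <= 0:
        raise ValueError("multiple and months must be positive")
    return multiple ** (1 / months) - 1


def spike_ceiling(baseline_usd: float, spike: float = 0.5) -> float:
    """Monthly budget ceiling that absorbs one jump of size `spike` over baseline."""
    return baseline_usd * (1 + spike)


# Assumption: 15 months between January 2025 and the April 2026 post.
rate = implied_monthly_growth(13, 15)
print(f"implied growth: {rate:.1%} per month")
print(f"ceiling for a $10,000 baseline: ${spike_ceiling(10_000):,.0f}")
```

Under those assumptions the implied rate lands just under 19% per month, which is why a budget anchored to last quarter's average tends to be stale by the time it is approved.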
Why agentic coding burns more than “chat for slides”
- Output usually costs more per million tokens than input; agent loops (retries, tool calls, sub-agents) multiply billable completions—see Caveman, token economics, and agent pipelines.
- Repo-scale context in Claude Code-style workflows means large reads on each turn unless you cache and structure context; see what are LLM tokens?.
- Pre-merge cloud reviews such as /ultrareview are priced as extra usage after free trials—another line item beyond the $20 seat.
- Leadership may read high adoption as productivity up; finance needs the same usage tied to shipped outcomes, not vibes alone.
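The first two bullets can be made concrete with a small cost model. This is a minimal sketch, assuming the full context is re-sent on every turn and that cached input tokens are billed at a fraction of the normal input rate. The prices ($3 and $15 per million tokens) and the 10% cached rate are placeholders for illustration, not any vendor's actual price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_mtok: float   # USD per million input tokens (placeholder value)
    output_per_mtok: float  # USD per million output tokens (placeholder value)


def agent_run_cost(
    price: Price,
    context_tokens: int,
    output_tokens: int,
    turns: int,
    cached_fraction: float = 0.0,
    cached_price_ratio: float = 1.0,
) -> float:
    """Cost of a loop that re-sends `context_tokens` and emits `output_tokens` each turn."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    fresh = context_tokens * (1 - cached_fraction)
    cached = context_tokens * cached_fraction
    input_cost = (fresh + cached * cached_price_ratio) * price.input_per_mtok / 1e6
    output_cost = output_tokens * price.output_per_mtok / 1e6
    return turns * (input_cost + output_cost)


price = Price(input_per_mtok=3.0, output_per_mtok=15.0)  # hypothetical rates
uncached = agent_run_cost(price, context_tokens=50_000, output_tokens=1_000, turns=40)
cached = agent_run_cost(
    price, 50_000, 1_000, 40, cached_fraction=0.9, cached_price_ratio=0.1
)
print(f"uncached: ${uncached:.2f}, with caching: ${cached:.2f}")
```

With these placeholder numbers, the 40-turn run costs about $6.60 uncached and about $1.74 with 90% of the context cached. Output is pricier per token, but repeated repo-scale reads dominate the bill, so caching and context structure are the first levers to pull.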
Podcast “$300 per day per agent” vs macro data
Podcast and investor anecdotes (for example, on the order of $300/day in API spend for a relentlessly driven agent, or roughly $100K/year in back-of-envelope math) illustrate extreme API-heavy patterns; they are not a BLS statistic or a universal per-engineer floor. The serious claim is narrower: at some usage densities, inference and tool costs enter the same budget conversation as headcount, which is one reason Ramp sells reconciliation and attribution to finance teams.
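The envelope math behind that ballpark is simple enough to write down. The $300/day figure is the anecdote quoted above; the two day counts are assumptions of this sketch, chosen to bracket the result.

```python
DAILY_API_SPEND_USD = 300  # anecdotal figure from the text, not a measured average


def annualized(daily_usd: float, active_days: int) -> float:
    """Yearly spend if the agent runs at `daily_usd` on `active_days` days."""
    return daily_usd * active_days


every_day = annualized(DAILY_API_SPEND_USD, 365)  # agent runs every calendar day
workdays = annualized(DAILY_API_SPEND_USD, 250)   # assumption: about 250 working days
print(f"${workdays:,} to ${every_day:,} per year")
```

That gives a range of $75,000 to $109,500, consistent with the roughly $100K/year shorthand and sensitive mostly to how many days the agent actually runs.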
If the CFO asks whether you are token-poor or process-poor, start with a ledger, not a new model name.
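A ledger in this sense can start very small. The sketch below is an illustration only: the record fields, team names, and provider names are hypothetical, and real data would come from provider usage exports and your accounting system. It totals metered usage by any label and flags providers whose invoices do not match what was metered.

```python
from collections import defaultdict
from typing import Callable, Iterable, NamedTuple


class UsageRecord(NamedTuple):
    team: str
    model: str
    provider: str
    cost_usd: float


def totals_by(
    records: Iterable[UsageRecord], key: Callable[[UsageRecord], str]
) -> dict[str, float]:
    """Sum metered cost grouped by an arbitrary label (team, model, provider)."""
    totals: dict[str, float] = defaultdict(float)
    for record in records:
        totals[key(record)] += record.cost_usd
    return dict(totals)


def invoice_gaps(
    records: Iterable[UsageRecord],
    invoices: dict[str, float],
    tolerance_usd: float = 0.01,
) -> dict[str, float]:
    """Invoice minus metered usage per provider, keeping only real mismatches."""
    metered = totals_by(records, lambda r: r.provider)
    gaps: dict[str, float] = {}
    for provider in sorted(set(metered) | set(invoices)):
        gap = invoices.get(provider, 0.0) - metered.get(provider, 0.0)
        if abs(gap) > tolerance_usd:
            gaps[provider] = round(gap, 2)
    return gaps


# Hypothetical month of data.
records = [
    UsageRecord("platform", "model-a", "provider-x", 1200.0),
    UsageRecord("data", "model-b", "provider-x", 300.0),
    UsageRecord("platform", "model-a", "provider-y", 500.0),
]
invoices = {"provider-x": 1500.0, "provider-y": 650.0, "provider-z": 90.0}

print(totals_by(records, lambda r: r.team))
print(invoice_gaps(records, invoices))
```

In this made-up month, provider-y bills $150 more than the labeled keys account for, and provider-z has an invoice with no metered usage at all, which is the shape shadow spend usually takes.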
ExplainX: habits that actually bend the curve
- Instrument and label: tag API keys by team, project, and model, and reconcile invoices against metered usage every month.
- Engineer for lean context: use caching, RAG, and smaller models for scaffold work; break retry loops with tests and review (see Caveman again).
- Encode repeatable work in agent skills and MCP so you re-type less and waste fewer tokens on boilerplate (skills guide, MCP explainer).
- Govern agents as you would any supply chain—skills and security; treat unbounded autonomy as a batch job with a budget.
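The last habit, treating autonomy as a batch job with a budget, can be sketched as a loop that stops once a spend cap is reached. This is a minimal illustration, assuming each agent step can report the cost it actually incurred; `run_with_budget` and the step callables are hypothetical names, not part of any agent framework.

```python
from typing import Callable, Iterable


def run_with_budget(
    steps: Iterable[Callable[[], float]], limit_usd: float
) -> tuple[int, float]:
    """Run steps in order until spend reaches `limit_usd`.

    Each step returns the cost it incurred. Cost is only known after a step
    finishes, so the total can overshoot the limit by at most one step.
    """
    if limit_usd <= 0:
        raise ValueError("limit_usd must be positive")
    completed = 0
    spent = 0.0
    for step in steps:
        if spent >= limit_usd:
            break  # budget exhausted: stop instead of looping unbounded
        spent += step()
        completed += 1
    return completed, spent


# Hypothetical agent whose every step costs $2.00.
steps = [lambda: 2.0 for _ in range(10)]
completed, spent = run_with_budget(steps, limit_usd=5.0)
print(f"ran {completed} of 10 steps, spent ${spent:.2f}")
```

Here the loop stops after three steps and $6.00, one step past the $5.00 cap. If that overshoot matters, estimate the next step's cost before running it and stop earlier.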
Read next: Caveman · Claude Code Pro vs Max and pricing reality · Why models hallucinate
Figures and product names change. Re-check Ramp’s posts and other leading indicators before relying on these numbers. This is not tax, legal, or investment advice.