What are TPU 8t and TPU 8i?

They are Google’s eighth-generation Tensor Processing Units, split by role. TPU 8t targets large-scale training: Google cites nearly 3× the compute per pod vs the previous generation, up to 9,600 chips and 2 PB of shared HBM in one superpod, 121 exaFLOPS FP4, and Virgo networking. TPU 8i targets low-latency inference and multi-agent “swarm” workloads: Google cites 1,152 TPUs in a single pod, 3× the on-chip SRAM of the prior generation, Boardfly topology, and about 80% better performance per dollar than the previous inference generation. See Google’s infrastructure blog and Sundar Pichai’s Next post for the official wording.

What is the Gemini Enterprise Agent Platform?

Google positions it as the evolution of Vertex AI into a full developer surface for building, governing, and scaling agents. Named pieces include Model Armor, Agent Identity, Agent Gateway, partner agents in a gallery, MCP support, simulation tooling, and integration with the Gemini Enterprise app. Official overview: The new Gemini Enterprise, Google Cloud blog, April 22, 2026.

Where do the 75% AI code and 16B tokens per minute numbers come from?

Sundar Pichai’s Google blog post (Cloud Next 2026) states that 75% of new code at Google is AI-generated and engineer-approved, up from 50% the prior fall, and that Google’s first-party models process more than 16 billion tokens per minute via direct customer API use, up from 10 billion the prior quarter. A separate Google Cloud welcome post also cites the 16B+ tokens/minute figure in a customer context.

Is this investment advice about Alphabet stock?

No. This article summarizes product and infrastructure news. Equity markets move for many reasons; we do not forecast prices.

Google Cloud Next 2026: TPU 8t / TPU 8i, Gemini Enterprise Agent Platform, and the “agentic enterprise”

Google Cloud Next ’26 (Las Vegas, April 2026) packaged one narrative: custom TPUs, Gemini (and partner models in places), and Gemini Enterprise as the end-to-end “agentic enterprise” layer. The same story echoed across X and trade press; the notes below are tied to primary Google and Google Cloud posts rather than to Grok or other second-hand summaries.

Read first (official, in order):

Sundar Pichai — Google Cloud Next 2026 — TPU 8t/8i, 75% AI-generated new code (engineer-approved), 16B+ tokens/minute, Wiz, CapEx.
Our eighth generation TPUs: two chips for the agentic era — Amin Vahdat; full 8t/8i technical story.
The new Gemini Enterprise: one platform for agent development — Agent Platform, app, partners.
Google Cloud Next ‘26 — news and updates and Welcome to Google Cloud Next26 — customer + token scale.

TPU 8t (training) and TPU 8i (inference)

Per Google’s TPU 8 post, 8t and 8i are purpose-split for the agent era: training needs huge scale-up; inference needs memory bandwidth, low latency, and efficiency when many small steps chain together.

TPU 8t (training):

~3× compute per pod vs the prior generation (Google names Ironwood in the same post).
Up to 9,600 chips and 2 petabytes of shared HBM in a superpod; 121 exaFLOPS FP4; 2× interchip bandwidth; 10× faster storage to the fabric (TPUDirect); Virgo and JAX / Pathways for large jobs. Pichai’s shorter post also refers to scaling to roughly one million 8t chips in one logical cluster for frontier training.
Google targets >97% “goodput” (productive training time) via RAS (reliability, availability, serviceability) features, rerouting, and OCS (optical circuit switching).
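
The superpod figures above can be turned into rough per-chip averages, and the goodput target into hours of lost time. This is a back-of-envelope sketch, not an official per-chip spec: it assumes decimal units, a fully populated 9,600-chip superpod, and an even split of the rounded headline totals. The helper names are illustrative.

```python
# Back-of-envelope per-chip averages from the TPU 8t superpod figures.
# Assumptions (not from Google's posts): decimal units (1 PB = 1e15 bytes,
# 1 exaFLOPS = 1e18 FLOP/s), a fully populated 9,600-chip superpod, and an
# even split of the rounded headline totals across chips.

CHIPS_PER_SUPERPOD = 9_600
SHARED_HBM_BYTES = 2e15   # 2 PB of shared HBM
PEAK_FP4_FLOPS = 121e18   # 121 exaFLOPS FP4


def per_chip(total: float, chips: int = CHIPS_PER_SUPERPOD) -> float:
    """Evenly divide a pod-level total across chips."""
    return total / chips


def lost_hours(run_hours: float, goodput: float) -> float:
    """Hours of a run that are not productive training at a given goodput."""
    if not 0.0 <= goodput <= 1.0:
        raise ValueError("goodput must be a fraction between 0 and 1")
    return run_hours * (1.0 - goodput)


hbm_gb_per_chip = per_chip(SHARED_HBM_BYTES) / 1e9
fp4_pflops_per_chip = per_chip(PEAK_FP4_FLOPS) / 1e15
overhead_30d = lost_hours(30 * 24, 0.97)  # hypothetical 30-day run

print(f"HBM per chip: ~{hbm_gb_per_chip:.0f} GB")
print(f"FP4 per chip: ~{fp4_pflops_per_chip:.1f} PFLOPS")
print(f"Non-productive time, 30 days at 97% goodput: ~{overhead_30d:.1f} h")
```

Under these assumptions the averages come to about 208 GB of HBM and 12.6 PFLOPS FP4 per chip, and a 97% goodput floor still allows up to roughly 21.6 hours of non-productive time in a 30-day run.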

TPU 8i (inference):

Pichai: 1,152 TPUs in one 8i pod; 3× more on-chip SRAM than the prior generation; aimed at latency-sensitive and agent workloads.
The long TPU post adds 288 GB HBM and 384 MB on-chip SRAM per chip, doubled Axion hosts, Boardfly topology, a Collectives Acceleration Engine, and about 80% better performance per dollar vs the prior inference generation—Google’s own efficiency claim, not a cross-vendor GPU benchmark.
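
The per-chip 8i numbers above imply pod-level memory totals, and the performance-per-dollar claim implies a cost ratio that is easy to misread. The sketch below works both out. It assumes decimal units, identically populated chips, and that cost for a fixed workload scales inversely with performance per dollar; none of the derived totals are quoted by Google.

```python
# Pod-level totals implied by the per-chip TPU 8i figures.
# Assumptions (not from Google's posts): decimal units, every chip in the
# pod populated identically, and cost for a fixed workload scaling
# inversely with performance per dollar.

CHIPS_PER_POD = 1_152
HBM_GB_PER_CHIP = 288
SRAM_MB_PER_CHIP = 384
PERF_PER_DOLLAR_GAIN = 0.80  # "about 80% better" performance per dollar


def relative_cost(gain: float) -> float:
    """Cost of the same work relative to baseline, given a perf/$ gain."""
    if gain <= -1.0:
        raise ValueError("gain must be greater than -1")
    return 1.0 / (1.0 + gain)


pod_hbm_tb = CHIPS_PER_POD * HBM_GB_PER_CHIP / 1_000
pod_sram_gb = CHIPS_PER_POD * SRAM_MB_PER_CHIP / 1_000
cost_ratio = relative_cost(PERF_PER_DOLLAR_GAIN)

print(f"HBM per pod:  ~{pod_hbm_tb:.0f} TB")
print(f"SRAM per pod: ~{pod_sram_gb:.0f} GB")
print(f"Same workload costs ~{cost_ratio:.0%} of the prior generation")
```

Under these assumptions a pod holds about 332 TB of HBM and 442 GB of on-chip SRAM. An 80% gain in performance per dollar corresponds to roughly a 44% lower cost for the same work, not an 80% reduction.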

General availability: Google’s public posts say later in 2026. For commercial follow-up, request TPU information from Google.

Gemini Enterprise Agent Platform and ecosystem

The new Gemini Enterprise (April 22, 2026) frames Gemini Enterprise as end-to-end: models, product surfaces, governance, and deployment. The Agent Platform evolves Vertex into a build/tune/govern stack with MCP support, Model Armor, Agent Identity, paths into the Gemini Enterprise app, and a governed agent gallery for employees.

Named partner agents include Oracle, Salesforce, ServiceNow, Workday, Adobe, Accenture, and others. Salesforce and Google’s joint press release (Cloud Next ’26) covers Agentforce Sales in Gemini Enterprise and cross-platform Slack / Workspace work. Workspace Intelligence and Agentic Data Cloud are summarized in the Next ’26 news post.

75% new internal code, 16B+ tokens per minute, customer scale

From Pichai’s blog: 75% of new code at Google is AI-generated and engineer-approved (up from 50% last fall); a separate item highlights agentic workflows and a 6× faster complex migration vs a year prior; and first-party Cloud models process more than 16 billion tokens per minute from direct customer API use, up from 10 billion the prior quarter.
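
To put the headline rates above on more familiar scales, the sketch below converts the tokens-per-minute figure and computes the quoted changes. Because Google says "more than" 16 billion, the converted rates are lower bounds. The function name is illustrative.

```python
# Unit conversions and growth rates for the figures quoted from Pichai's post.
# The 16B figure is a floor ("more than"), so derived rates are lower bounds.

TOKENS_PER_MIN_NOW = 16e9
TOKENS_PER_MIN_PRIOR = 10e9
AI_CODE_SHARE_NOW = 0.75
AI_CODE_SHARE_PRIOR = 0.50


def relative_change(new: float, old: float) -> float:
    """Fractional change from old to new (0.6 means +60%)."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


tokens_per_sec = TOKENS_PER_MIN_NOW / 60
tokens_per_day = TOKENS_PER_MIN_NOW * 60 * 24
token_growth = relative_change(TOKENS_PER_MIN_NOW, TOKENS_PER_MIN_PRIOR)
code_share_points = (AI_CODE_SHARE_NOW - AI_CODE_SHARE_PRIOR) * 100
code_share_growth = relative_change(AI_CODE_SHARE_NOW, AI_CODE_SHARE_PRIOR)

print(f"Tokens per second: >{tokens_per_sec / 1e6:.0f} million")
print(f"Tokens per day:    >{tokens_per_day / 1e12:.1f} trillion")
print(f"Quarter-over-quarter token growth: {token_growth:.0%}")
print(f"AI code share: +{code_share_points:.0f} points ({code_share_growth:.0%} relative)")
```

That works out to more than about 267 million tokens per second and 23 trillion per day, a 60% quarter-over-quarter rise. The code share rose 25 percentage points, which is a 50% relative increase.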

The Next ’26 news post and the Cloud welcome post add adoption figures: nearly 75% of Google Cloud customers using AI products; 330 customers each with more than 1 trillion tokens in 12 months; and 35 above 10 trillion. Business press also tied the event to Alphabet share moves; this post is not financial advice.
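
The 12-month thresholds above can be restated as sustained average rates, which are easier to compare with your own usage dashboards. This is a rough sketch that assumes a 365-day year and perfectly even usage; real traffic is bursty, and the names are illustrative.

```python
# Average sustained rate implied by a 12-month token total.
# Assumptions (not from Google's posts): a 365-day year and evenly spread usage.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def avg_tokens_per_minute(tokens_in_12_months: float) -> float:
    """Average tokens per minute needed to reach a 12-month total."""
    if tokens_in_12_months < 0:
        raise ValueError("token total must be non-negative")
    return tokens_in_12_months / MINUTES_PER_YEAR


tier_1t = avg_tokens_per_minute(1e12)   # the "more than 1 trillion" tier
tier_10t = avg_tokens_per_minute(1e13)  # the "above 10 trillion" tier

print(f"1T tokens/year  -> at least ~{tier_1t / 1e6:.1f} million tokens/min")
print(f"10T tokens/year -> at least ~{tier_10t / 1e6:.1f} million tokens/min")
```

So a customer in the 1 trillion tier averages at least about 1.9 million tokens per minute, and one in the 10 trillion tier at least about 19 million.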

ExplainX: multicloud, skills, and verification

Google is packaging silicon, models, governance, and SaaS partners in one story—compelling for GCP-centric shops.
Portable patterns still matter: agent skills, MCP, and explainx.ai/skills help connectors and workflows survive model and host changes.
The 75% figure is a process metric at one company; pair any keynote stat with hallucination literacy and your own evals.
Courses teach the same primitives: trust boundaries, registries, and verification first.

SKUs, dates, and claims evolve. Re-verify on the Cloud Next and product pages before making plans or procurement decisions.

