What is OpenAI's Jalapeño chip?

Jalapeño is OpenAI's first custom-designed AI accelerator, classified as an "Intelligence Processor." Designed from a blank slate rather than adapted from general-purpose silicon, it targets LLM inference workloads: ChatGPT, Codex, the OpenAI API, and future agentic products. It was co-developed with Broadcom and manufactured through a partnership with Celestica for board and rack integration. Early lab tests run GPT-5.3-Codex-Spark at production-target frequency and power.

How fast was Jalapeño developed?

OpenAI and Broadcom completed design-to-tape-out in nine months, which OpenAI claims is the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors. OpenAI's own models were used to accelerate parts of the design and optimization process.

When will Jalapeño be deployed?

Initial deployment is targeted by the end of 2026 at gigawatt scale with Microsoft and other data center partners. Jalapeño is the first step in a planned multi-generation platform.

How does Jalapeño differ from Nvidia GPUs?

GPUs are general-purpose accelerators adapted for AI. Jalapeño is a blank-slate design built exclusively around LLM inference: the kernels, memory-movement patterns, networking topology, and serving patterns that matter for frontier models. OpenAI says early testing shows performance per watt substantially better than current state-of-the-art, though a detailed technical report will come in coming months.

Does this mean OpenAI is leaving Nvidia?

Not immediately. Jalapeño is targeted at inference, and OpenAI still uses Nvidia hardware for training. The company has described this as a long-term full-stack infrastructure strategy, not a short-term hardware swap. Multiple generations are planned.

OpenAI Jalapeño: First AI Chip for LLM Inference (2026) | explainx.ai Blog

explainx.ainewsletter3.5k

workshops ↗

OpenAI Jalapeño: First AI Chip for LLM Inference (2026) | explainx.ai Blog | explainx.ai

OpenAI and Broadcom (NASDAQ: AVGO) on June 24, 2026 unveiled Jalapeño, OpenAI's first custom AI accelerator—an Intelligence Processor designed from scratch for LLM inference, not adapted from a general-purpose GPU lineage. The chip was delivered to OpenAI CEO Sam Altman and President Greg Brockman by Broadcom President and CEO Hock Tan and President Charlie Kawwas. Engineering samples are already running GPT-5.3-Codex-Spark at production-target frequency and power in the lab.

Primary sources:

OpenAI and Broadcom unveil LLM-optimized inference chip — official announcement, chip architecture rationale, partnership scope.
OpenAI X post (@OpenAI, Jun 24 2026) — public announcement thread.

Why OpenAI Built Its Own Silicon

The core argument Greg Brockman made in the announcement is simple: inference is where intelligence reaches people. Every ChatGPT response, every Codex task, every API call runs through inference hardware. If you own that hardware, you control cost, latency, and capacity in ways you cannot if you depend entirely on third-party silicon.

OpenAI has until now run on Nvidia GPUs—powerful, battle-tested, and expensive. GPUs were designed first for graphics, then adapted for matrix math, then further adapted for transformer models. Each layer of adaptation leaves efficiency on the table. Jalapeño starts from the premise: what if we designed a chip purely around the kernels, memory-movement patterns, and networking behaviors of modern LLMs?

Richard Ho, who leads OpenAI's hardware program, described it this way: "We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models."

Dimension	Training priority	Inference priority
Raw FLOPS	Maximize	Match to memory BW
Memory BW	High	Critical (weight streaming)
On-chip SRAM	Moderate	Large (KV cache)
Network latency	Tolerable	Low (request tail latency)
Power per token	Secondary	Primary (cost structure)

snippet

ChatGPT / Codex / API (Products)
    ↓
GPT-5.x / o-series (Models)
    ↓
Kernels / Serving / Routing (Software)
    ↓
Jalapeño + Tomahawk (Silicon)  ← new
    ↓
Data center (Microsoft, partners)

OpenAI Jalapeño: First AI Chip Built from Scratch for LLM Inference, Co-Developed with Broadcom

Why OpenAI Built Its Own Silicon

Related posts

OpenAI's First Hardware Device: Gurman Reports a Screenless AI Companion Speaker

Apple Sues OpenAI, Tang Tan, and Chang Liu Over AI Hardware Trade Secrets

NVIDIA DGX Spark: The Best Setup for Running Local LLMs in 2026

Nine-Month Tape-Out: A Record in Advanced Semiconductors

1. Deep Software-Hardware Co-Development

2. Broadcom's Silicon Implementation Expertise

3. OpenAI Models Accelerating Chip Design

Architecture: What "Blank Slate for LLM Inference" Means

Reducing Data Movement

Balancing Compute, Memory, and Networking

Networking via Broadcom Tomahawk

What GPT-5.3-Codex-Spark Running in Lab Means

The Full-Stack Strategy: Products → Models → Infrastructure → Silicon

Partner Roles: Broadcom and Celestica

Broadcom

Celestica

Gigawatt Scale: What That Actually Means

Implications for the AI Infrastructure Landscape

For OpenAI Users and Developers

For Nvidia

For Other AI Labs

For Broadcom's Business

Technical Deep Dive: Why Inference Silicon Differs from Training Silicon

Training vs. Inference: Different Bottlenecks

The KV Cache Problem

Networking for Disaggregated Inference

What This Means for AI Education and Skill Building

API Pricing Will Change

Infrastructure Ownership Is a Moat

The Full-Stack Flywheel Is Real

Learn the Stack, Not Just the API

What to Watch For