Grok 4.5 Enters Private Beta at SpaceX and Tesla — Built on 1.5T V9 Model With Cursor Data

Elon Musk announced that Grok 4.5, built on the 1.5T V9 foundation model with Cursor coding data, is now in private beta at SpaceX and Tesla. Early evals reportedly show performance close to or exceeding Claude Opus. Here's what we know.

Jun 28, 2026 · 7 min read · Yash Thakker
Tags: Grok AI, xAI, Elon Musk, Frontier Models, AI Race, Coding AI

TL;DR

On June 28, 2026, Elon Musk announced that Grok 4.5 — built on xAI's 1.5T V9 foundation model with Cursor IDE coding data added in supplemental training — has entered private beta at SpaceX and Tesla. Early internal evaluations show performance "close to, perhaps exceeding" Anthropic's Claude Opus. Reinforcement learning is ongoing, and the Grok Build harness is showing daily improvements. SpaceX also plans to ship completely new models from scratch monthly for the rest of 2026.


What Musk Announced

The announcement came directly from Elon Musk on X:

"Grok 4.5, based on our 1.5T V9 foundation model, with Cursor data added in supplemental training, is now in private beta at SpaceX & Tesla. Early evals show performance close to, perhaps exceeding Opus. RL is continuing to significantly improve the model, and the Grok Build harness is showing daily advancements."

Three details here are worth unpacking separately:

  1. The 1.5T V9 foundation model: xAI's underlying base model, with "1.5T" read here as roughly 1.5 trillion parameters
  2. Cursor data in supplemental training: coding interaction data from one of the most popular AI IDEs
  3. Opus as the benchmark: Claude Opus is Anthropic's most capable reasoning model, the bar Grok 4.5 is being measured against

Why Cursor Data Matters

Cursor is an AI-native IDE used by hundreds of thousands of developers. When xAI says they added "Cursor data" in supplemental training, they almost certainly mean real developer interaction data — how engineers actually prompt AI to write code, debug issues, review diffs, and build software end-to-end.

This is a fundamentally different signal than synthetic benchmarks. Real Cursor sessions capture:

  • Agentic multi-turn workflows — a developer instructs the model, sees output, corrects it, iterates
  • Context window pressure — large codebases that stress memory and retrieval
  • Production code patterns — not toy examples, but real-world TypeScript, Python, Rust, Go
  • Error recovery — how models handle and fix compilation errors, test failures, and runtime issues

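For coding models, this kind of data is gold. It captures the instruct, correct, and iterate loop that static code corpora lack, which is a plausible reason why models trained on real-world coding interactions tend to do better on agentic tasks than those trained purely on static code.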
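As a rough illustration of what such interaction data might look like once structured, here is a minimal Python sketch. The `Turn` and `Session` types, the role names, and the `PASS`/`FAIL` prefixes are all hypothetical; nothing public describes how Cursor or xAI actually store or label these sessions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    # Hypothetical roles: "developer", "model", or "tool" (compiler/test output).
    role: str
    content: str


@dataclass
class Session:
    language: str
    turns: List[Turn] = field(default_factory=list)

    def follow_up_count(self) -> int:
        """Number of developer turns after the first one (corrections, iterations)."""
        developer_turns = sum(1 for t in self.turns if t.role == "developer")
        return max(0, developer_turns - 1)

    def recovered_from_error(self) -> bool:
        """True if a failing tool result is later followed by a passing one."""
        saw_failure = False
        for turn in self.turns:
            if turn.role != "tool":
                continue
            if turn.content.startswith("FAIL"):
                saw_failure = True
            elif turn.content.startswith("PASS") and saw_failure:
                return True
        return False


# Illustrative session only; the content is made up for the example.
example = Session(
    language="python",
    turns=[
        Turn("developer", "Add retry logic to fetch()"),
        Turn("model", "<patch v1>"),
        Turn("tool", "FAIL: test_retry"),
        Turn("developer", "Retries should back off exponentially"),
        Turn("model", "<patch v2>"),
        Turn("tool", "PASS: all tests"),
    ],
)

print(example.follow_up_count(), example.recovered_from_error())
```

The point of the sketch is that a multi-turn trace carries signals (corrections, error recovery) that a static file of source code cannot.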

Compare this to how Claude models are benchmarked on SWE-Bench and DeepSWE — real software engineering tasks that require multi-step agentic reasoning. Grok 4.5 appears to be targeting exactly this category.


The V9 Foundation Model: What We Know

The 1.5T V9 designation tells us xAI is operating at the upper end of parameter scale. For context:

Model            | Parameters (approx.)
Grok 4.5 (V9)    | 1.5T
GPT-5.6          | Not disclosed
Claude Fable 5   | Not disclosed
DeepSeek V4 Pro  | ~671B (MoE)

Raw parameter count is not the whole story. Sparse Mixture-of-Experts (MoE) architectures activate only a fraction of their weights per token, and DeepSeek V4 Pro demonstrated that MoE efficiency can match or beat dense models at a fraction of the compute. Musk's post does not say whether V9 is dense or MoE. Either way, paired with quality training data (including Cursor) and ongoing RL, a 1.5T-parameter model has substantial headroom.
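To make the scale and the dense-versus-MoE distinction concrete, the following Python sketch does the back-of-the-envelope arithmetic. It assumes "1.5T" is a parameter count; the MoE configuration (shared fraction, expert count, experts per token) is purely hypothetical and is not a description of V9 or of any real model.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_memory_tb(num_params: float, dtype: str) -> float:
    """Memory for the weights alone, in terabytes (1 TB = 1e12 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e12


def active_params_moe(
    total_params: float,
    shared_fraction: float,
    num_experts: int,
    experts_per_token: int,
) -> float:
    """Parameters used per token in a simplified MoE model.

    Assumes a fraction of the weights is shared (attention, embeddings)
    and the rest is split evenly across experts.
    """
    if not 0 < experts_per_token <= num_experts:
        raise ValueError("experts_per_token must be in 1..num_experts")
    shared = total_params * shared_fraction
    expert_pool = total_params - shared
    return shared + expert_pool * experts_per_token / num_experts


total = 1.5e12  # assumption: "1.5T" means 1.5 trillion parameters

for dtype in ("fp32", "bf16", "fp8"):
    print(dtype, weight_memory_tb(total, dtype), "TB")

# Hypothetical MoE layout: 10% shared weights, 64 experts, 4 active per token.
print(active_params_moe(total, 0.10, 64, 4) / 1e9, "B active per token")
```

A dense model uses every weight for every token, while the MoE case above touches only a small share of them, which is why total parameter count alone says little about inference cost.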


Grok Build Harness

Musk referenced "daily advancements" in the Grok Build harness — xAI's internal training and evaluation pipeline for agentic tasks. This is xAI's equivalent of the harness-based evaluation systems that frontier labs use for agent benchmarks.

A build harness typically runs the model against a suite of agentic tasks — write code, run it, check output, fix bugs — in an automated loop. Daily advancements suggest xAI is in an active RL training phase where the model is improving rapidly on this task distribution.
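The write, run, check, fix loop described above can be sketched in a few lines of Python. This is a generic illustration, not Grok Build: the `Task` type, the `solve` function, and the `stub_model` stand-in are hypothetical names, and a real harness would execute candidates inside a sandbox rather than directly on the host.

```python
import os
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Task:
    prompt: str
    test_code: str  # assertions appended to the candidate solution


def run_candidate(code: str, test_code: str, timeout: float = 10.0) -> Tuple[bool, str]:
    """Run candidate code plus its tests in a subprocess; return (passed, stderr)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "candidate.py")
        with open(path, "w") as f:
            f.write(code + "\n\n" + test_code + "\n")
        try:
            proc = subprocess.run(
                [sys.executable, path],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, "timeout"
        return proc.returncode == 0, proc.stderr


def solve(
    task: Task,
    model: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Optional[int]:
    """Return the attempt number that passed, or None if all attempts failed."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        code = model(task.prompt, feedback)  # write code
        passed, stderr = run_candidate(code, task.test_code)  # run and check
        if passed:
            return attempt
        feedback = stderr  # feed the error back so the model can fix it
    return None


def stub_model(prompt: str, feedback: str) -> str:
    """Stand-in for a real model: buggy first, corrected once it sees an error."""
    if not feedback:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"


task = Task(prompt="Write add(a, b)", test_code="assert add(2, 3) == 5")
print(solve(task, stub_model))
```

In an RL setting, the pass or fail outcome of each attempt would serve as the reward signal; here it is only returned to the caller.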


SpaceX and Tesla as Private Beta Environments

Choosing SpaceX and Tesla as the beta environments is deliberate. Both companies have massive internal software engineering needs:

  • SpaceX: Flight software, simulation, avionics, embedded systems, data pipelines for Starship and Starlink
  • Tesla: Autopilot/FSD codebases, manufacturing automation, energy management, Dojo supercomputer software

These are not standard enterprise software stacks. They involve safety-critical code, unusual hardware constraints, and domain-specific requirements. Testing Grok 4.5 in these environments gives xAI production-grade evaluation at scale, a far harder test than standard coding benchmarks.


How It Compares to Claude Opus

Musk's claim that Grok 4.5 is "close to, perhaps exceeding Opus" needs context.

Claude Opus (part of the Fable 5 family) is Anthropic's most capable reasoning model, known for:

  • Long-horizon multi-step reasoning
  • Precise tool use and code analysis
  • Strong performance on agentic benchmarks
  • The foundation for Claude Mythos' security capabilities

The early independent reaction on X aligned with Musk's claim. Developer Mehul Mohan, who tested an early build, described the vibes as "similar to Opus." This is anecdotal but consistent with the internal eval framing.

What remains unverified: public benchmark scores on SWE-Bench, HumanEval, GPQA, or any of the standard evaluation suites that allow direct comparison.
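For readers unfamiliar with how such scores are computed: HumanEval results are usually reported as pass@k, the probability that at least one of k sampled solutions passes the unit tests. A minimal Python sketch of the standard unbiased estimator from the original HumanEval paper follows. How xAI will report Grok 4.5 results, if at all, is unknown.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: total samples generated
    c: samples that passed the tests
    k: budget of samples the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Fewer than k failures exist, so any k samples include a pass.
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 10 samples, 3 correct.
print(pass_at_k(10, 3, 1))
print(pass_at_k(10, 3, 5))
```

The benchmark score is the mean of this value across all problems, which is why sample counts and k must match before two models' numbers can be compared.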


Monthly New Models from SpaceX Through 2026

Perhaps the most ambitious part of the announcement is easy to miss: SpaceX plans to release completely new models trained from scratch every month for the rest of 2026.

This is a remarkable cadence. Training a 1.5T model from scratch takes significant compute and time even for a well-resourced lab. If accurate, it implies xAI has:

  • Sufficient GPU capacity (likely the Colossus cluster) to run parallel training runs
  • A streamlined data pipeline that can turn around new training datasets monthly
  • Confidence that the Grok Build RL harness can rapidly improve each base model post-training

Monthly new model releases would put xAI on a faster iteration cycle than any other frontier lab has publicly committed to.
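A rough way to see why a monthly from-scratch cadence is demanding is the common approximation that training a dense transformer costs about 6 × parameters × tokens floating-point operations. The Python sketch below applies it. Every input is a placeholder chosen for illustration: the token count, GPU count, per-GPU throughput, and utilization are assumptions, not figures from xAI, and for an MoE model the active parameter count would replace the total.

```python
SECONDS_PER_DAY = 86_400


def training_days(
    params: float,
    tokens: float,
    num_gpus: int,
    peak_flops_per_gpu: float,
    utilization: float,
) -> float:
    """Estimate wall-clock training days using the ~6 * N * D FLOPs rule of thumb."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    total_flops = 6 * params * tokens
    sustained_flops_per_s = num_gpus * peak_flops_per_gpu * utilization
    return total_flops / sustained_flops_per_s / SECONDS_PER_DAY


# Placeholder inputs, not reported numbers.
days = training_days(
    params=1.5e12,            # assumption: "1.5T" is a dense parameter count
    tokens=30e12,             # assumption: 30 trillion training tokens
    num_gpus=100_000,         # assumption
    peak_flops_per_gpu=1e15,  # assumption: round 1 PFLOP/s per accelerator
    utilization=0.4,          # assumption: 40% of peak sustained
)
print(round(days, 1), "days")
```

With these placeholder inputs the estimate comes out at roughly 78 days, which says something only about the placeholders. The structure of the calculation is the useful part: a single run of this size can take longer than a month, so a monthly release schedule would require overlapping runs.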


What This Means for the AI Race

Grok 4.5 is the latest signal in what has become an extraordinarily compressed AI race in 2026. Earlier this year:

  • DeepSeek V4 Pro disrupted pricing expectations
  • GLM-5.2 from Zhipu reportedly matched Claude Mythos on security benchmarks
  • Claude Fable 5 launched with Anthropic's biggest capability leap yet
  • GPT-5.6 pushed OpenAI's frontier further
  • Alibaba's Qwen 3.7-Max set new records on long-horizon agent benchmarks

Grok 4.5 positions xAI as a genuine player in the top tier — not just a social media AI, but a model targeting the most demanding agentic coding tasks in production environments.

For developers, the practical implication is that Opus-class coding capability may soon be available from multiple providers, increasing competition and likely driving down costs.


What to Watch

  1. Public benchmark release: Will xAI publish Grok 4.5 scores on SWE-Bench, HumanEval, or GPQA before the public launch?
  2. Cursor integration: Given the Cursor training data angle, will xAI partner with Cursor or release Grok 4.5 as a selectable model in the IDE?
  3. Polymarket probability shift: Polymarket currently prices a 14% chance that a non-US lab leads AI by year-end. A public Grok 4.5 release that matches Opus would reinforce the odds for US labs rather than erode them.
  4. Monthly model cadence: Can SpaceX actually ship a new foundation model every month? The first few releases will test that claim.
  5. Open weights possibility: The announcement does not mention open weights, but xAI has released open Grok models before. If V9 weights drop, the developer ecosystem impact would be enormous.

Bottom Line

Grok 4.5 entering private beta at SpaceX and Tesla is a credible frontier-model announcement. The combination of a 1.5T parameter base, real-world Cursor interaction data, ongoing RL improvements, and production testing in safety-critical environments is a serious technical approach — not just a benchmark chase.

Whether it truly matches or exceeds Claude Opus won't be known until independent benchmarks surface. But the direction is clear: xAI is targeting the same agentic coding and reasoning niche that Anthropic, OpenAI, and DeepSeek are all competing in — and doing it with access to production environments no other lab can replicate.

Further reading:

  • Claude Fable 5 and Mythos 5 launch
  • DeepSeek V4 Pro benchmarks and pricing
  • AI benchmarks complete guide 2026
  • Claude Code vs Codex vs Gemini CLI vs GLM-5.2
  • Zhipu AI matches Claude Mythos on security bugs

Reported based on Elon Musk's announcement on X as of June 28, 2026. Independent benchmark verification of Grok 4.5's performance claims was not available at time of publication.
