GPT-5.6 Guide: Sol, Terra, Luna Models, Pricing, and Benchmarks
GPT-5.6 Sol, Terra, Luna officially previewed June 26, 2026. Fable 5 live again July 1. GPT-5.6 broad GA in coming weeks. API pricing, Terminal-Bench scores, and tier comparison vs Fable 5.
July 1, 2026 update: Fable 5 is live; GPT-5.6 GA is next. Commerce lifted Fable/Mythos export controls on June 30, and Anthropic restored Fable 5 globally on July 1. The GPT-5.6 limited preview continues, with broad GA expected in the coming weeks. OpenAI launched GPT-5.6 Sol, Terra, and Luna on June 26. Sol Ultra leads Terminal-Bench 2.1 at 91.9%.
TL;DR (official):
| Model | Role | API (in/out per 1M) | Terminal-Bench 2.1 |
| --- | --- | --- | --- |
| Sol Ultra | Subagent ultra mode | n/a | 91.9% |
| Sol | Flagship | $5 / $30 | 88.8% |
| Terra | GPT-5.5-class, 2× cheaper | $2.50 / $15 | 82.5% |
| Luna | Volume tier | $1 / $6 | 84.3% |
| GPT-5.5 | Prior flagship | $5 / $30 | 88.0% |
| Claude Fable 5 | Live July 1 | $10 / $50 | 83.4% |
Before June 26, GPT-5.6 surfaced only through indirect signals. Most proved directionally correct:
Codex log traces: Developers using Codex Computer Use have reported model identifiers referencing "gpt-5.6" appearing in system-level logs during extended agentic sessions. These are not publicly documented model names.
Context window reports: A subset of ChatGPT Pro OAuth users invoking Codex in extended sessions have reported context windows exceeding 1.4–1.5 million tokens—substantially above GPT-5.5's reported capabilities—in unofficial early-access configurations.
OpenAI's release cadence: OpenAI shipped GPT-5.4 in March 2026 and GPT-5.5 on April 23, 2026. The company's documented pattern of sub-60-day incremental model releases puts the next model firmly in June 2026. Prediction market traders on Polymarket and Metaculus have priced in 80–89% odds of a public release by June 30.
Training data signals: Researchers analyzing GPT-5.6 responses in early access have noted knowledge of events through approximately May 2026—consistent with a refreshed training cutoff ahead of a June public release.
Leaks correctly pointed at June timing, agentic gains, and government gating. OpenAI's official framing for Sol is stronger than "incremental" — a step function over GPT-5.5 on frontier agentic work — while Terra and Luna handle cost-optimized tiers.
Confirmed improvements (Sol vs GPT-5.5)
OpenAI's official claims concentrate on agentic, biology, and cyber — not single-turn chat polish:
1. Context Window: Up to 1.5 Million Tokens
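GPT-5.5 operates with a context window that most production applications have treated as roughly 400K effective tokens for complex tasks. GPT-5.6 is expected to push this to approximately 1.5 million tokens. The 43% increase is measured against the developer-reported ceiling for 5.5 (which implies a ceiling near 1.05M tokens), not against the ~400K effective figure; relative to 400K, 1.5M is about 3.75×.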
Why this matters: long-context handling is one of the clearest capability signals in the current frontier race. Claude Fable 5 and Gemini 3.1 Pro have both pushed long-context as a differentiator. A 1.5M token GPT model changes the calculus for use cases like full-codebase analysis, book-length document review, and multi-session agent state persistence.
At 1.5M tokens you can fit roughly:
An entire mid-size software project's worth of source code
A legal document corpus for a full case discovery process
Several full academic papers plus all their cited sources
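To check claims like these against your own corpus, a rough estimate is usually enough. The sketch below assumes about 4 characters per token, a common rule of thumb for English text and not a measured figure for any GPT-5.6 tokenizer; the function names and the `reserve_tokens` parameter are illustrative.

```python
# Rough check of whether a set of documents fits in a context window.
# ASSUMPTION: ~4 characters per token is a rule of thumb for English text,
# not a property of any specific model's tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(docs: list[str], window_tokens: int, reserve_tokens: int = 0) -> bool:
    """True if the estimated total fits, leaving reserve_tokens for the reply."""
    total = sum(estimate_tokens(d) for d in docs)
    return total + reserve_tokens <= window_tokens


if __name__ == "__main__":
    # Two synthetic documents, about 875K estimated tokens in total.
    corpus = ["x" * 2_000_000, "y" * 1_500_000]
    print(fits_in_context(corpus, window_tokens=200_000))    # False
    print(fits_in_context(corpus, window_tokens=1_500_000))  # True
```

For real planning, replace the heuristic with counts from the tokenizer of the model you actually call, since code and non-English text often tokenize less efficiently than this estimate suggests.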
2. Multi-Hour Agentic Task Completion
The most technically significant expected improvement is in multi-hour agentic task completion rates, particularly for Codex Computer Use workloads where an AI agent plans, executes, debugs, and iterates on a task autonomously over extended time horizons.
GPT-5.5 made progress here with its 82.7% Terminal-Bench 2.0 score, but early reports suggest GPT-5.6's agentic reliability improvement is meaningful enough that developers noticed it without being told the model changed. The improvement is attributed to:
A cleaner reward signal in training that reduces reward hacking in long agent loops
Tighter persona-isolation (the model less frequently "breaking character" or contradicting its system prompt mid-task)
An improved SFT pipeline that doesn't recycle contaminated rollouts—a subtle but important training quality fix that affects how reliably the model follows complex multi-step instructions
For developers building with Codex or custom agent frameworks, this kind of reliability improvement matters more than raw benchmark scores, because per-step gains compound. Assuming the steps succeed independently, a 10% relative improvement in per-step success rate on a 20-step agent pipeline multiplies end-to-end success by 1.1^20 ≈ 6.7, so the agent succeeds far more than twice as often end-to-end.
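The compounding effect is easy to verify numerically. The sketch below assumes every step succeeds independently with the same probability, which real agent pipelines only approximate; the 0.90 and 0.99 per-step rates are illustrative, not measured figures for any model.

```python
# End-to-end success of a multi-step agent pipeline.
# ASSUMPTION: steps succeed independently with the same per-step probability.


def end_to_end_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` succeed in sequence."""
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability in [0, 1]")
    return per_step ** steps


def improvement_ratio(before: float, after: float, steps: int) -> float:
    """How many times more often the pipeline succeeds after the change."""
    return end_to_end_success(after, steps) / end_to_end_success(before, steps)


if __name__ == "__main__":
    # Illustrative numbers only: 0.90 per step, improved 10% (relative) to 0.99.
    print(f"{end_to_end_success(0.90, 20):.3f}")      # 0.122
    print(f"{end_to_end_success(0.99, 20):.3f}")      # 0.818
    print(f"{improvement_ratio(0.90, 0.99, 20):.2f}x")  # 6.73x
```

Under these assumptions a pipeline that finished about 12% of runs finishes about 82% of them, which is why small per-step gains show up so visibly in long agent loops.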
3. Refreshed Training Data Through Mid-2026
GPT-5.5 launched in April 2026 with a training cutoff that left a gap for events from early 2026 onward. GPT-5.6 is expected to include training data through approximately May 2026, closing this window.
For most tasks, training cutoff doesn't matter. For tasks involving recent software ecosystems (new library releases, framework updates), recent world events, or current competitive intelligence, a model trained 6–8 weeks more recently is meaningfully more useful.
4. FrontierMath Tier 4 Reasoning
GPT-5.5 posted 35.4% on FrontierMath Tier 4, the hardest tier of the FrontierMath mathematical reasoning benchmark. GPT-5.6 is expected to improve on this, potentially pushing past 40%. That would be the most direct challenge to OpenAI's own positioning of o3-pro as the reasoning-first model: if GPT-5.6 meaningfully improves frontier math without being explicitly a "reasoning model," it blurs the product line distinction.
5. Token Efficiency for Long Tasks
For long-running agentic sessions, GPT-5.6 reportedly uses fewer tokens to accomplish the same work—a result of the cleaner SFT pipeline reducing repetition, self-correction loops, and unnecessary verbosity. For API users with high-volume agentic workloads, this efficiency gain translates directly to lower cost even if per-token pricing stays the same.
GPT-5.6 family vs GPT-5.5: The upgrade picture
| Capability | GPT-5.5 | GPT-5.6 Sol (official) | Terra / Luna |
| --- | --- | --- | --- |
| Terminal-Bench 2.1 | 88.0% | 88.8% (Ultra 91.9%) | 82.5% / 84.3% |
| API input / output | $5 / $30 per 1M | $5 / $30 | $2.50 / $15 · $1 / $6 |
| Agentic modes | Standard | Max reasoning · Ultra subagents | Tiered |
| GeneBench v1 | Baseline | Better, fewer tokens | Improved cyber stack |
| Availability | Public | Preview → GA in weeks | Same |
Routing rule: Sol for hardest agent work; Terra when GPT-5.5-class is enough at half cost; Luna for volume. Context-window leak numbers (~1.5M) were not in OpenAI's June 26 preview post — treat as unconfirmed until GA docs.
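The routing rule can be sketched as a small lookup plus a per-request cost estimate. The per-million-token prices are the ones listed above; the task categories and the names `Task`, `Tier`, `route`, and `request_cost` are hypothetical and not part of any OpenAI SDK.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    """Hypothetical task categories for illustration."""
    HARD_AGENTIC = "hard_agentic"  # hardest long-horizon agent work
    GENERAL = "general"            # GPT-5.5-class quality is enough
    BULK = "bulk"                  # high-volume, cost-sensitive work


@dataclass(frozen=True)
class Tier:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices taken from the article's tier table.
TIERS = {
    Task.HARD_AGENTIC: Tier("Sol", 5.00, 30.00),
    Task.GENERAL: Tier("Terra", 2.50, 15.00),
    Task.BULK: Tier("Luna", 1.00, 6.00),
}


def route(task: Task) -> Tier:
    """Pick the tier for a task category per the routing rule."""
    return TIERS[task]


def request_cost(tier: Tier, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the tier's list price."""
    return (input_tokens * tier.input_per_m + output_tokens * tier.output_per_m) / 1_000_000


if __name__ == "__main__":
    for task in Task:
        tier = route(task)
        print(f"{tier.name}: ${request_cost(tier, 100_000, 10_000):.2f}")
```

For a request with 100K input and 10K output tokens this prints $0.80 for Sol, $0.40 for Terra, and $0.16 for Luna, which makes the cost of over-routing easy tasks to Sol concrete.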
GPT-5.6 vs Claude Fable 5: The Frontier Battle
This is the comparison that makes GPT-5.6 interesting. Claude Fable 5 ($10/$50 per million tokens) has been Anthropic's flagship at the frontier since its launch: highest per-token price, highest capability ceiling, and the model Claude Code runs on for complex agent tasks.
Context length: Fable 5 has a 200K context window—a standard frontier spec. A GPT-5.6 at 1.5M tokens would be a 7.5× advantage on this single dimension. For use cases that push context limits, GPT-5.6 would win outright.
Agentic coding: Fable 5 leads the frontier on long-horizon autonomous coding tasks. GPT-5.6's reported improvements in multi-hour task completion rates are specifically targeting this category. Whether the gap closes entirely depends on benchmark results, but OpenAI is clearly aiming at Fable's core strength.
Pricing: Claude Fable 5 at $10/$50 per million tokens costs 2× GPT-5.5's input price and about 1.67× its output price ($5/$30). If GPT-5.6 stays near GPT-5.5's price point, a model with comparable or better capability would cost 40 to 50% less per token, which would reshape which frontier model enterprises default to.
Multimodal: Fable 5 is strong on multimodal reasoning. GPT-5.5 Vision already competes here, and GPT-5.6 is expected to maintain or improve that standing.
Single-turn quality: Fable 5 leads on the Artificial Analysis Intelligence Index and on closely contested benchmarks like SWE-bench Verified (87% range). GPT-5.6 is not expected to dramatically change this competitive position, since Anthropic's RLHF quality at the fine-tuning stage is a real advantage.
The honest prediction: GPT-5.6 probably ties Fable 5 on aggregate intelligence metrics and leads Fable 5 on context length. On the hardest agentic coding tasks at the absolute frontier, whether GPT-5.6 closes Fable 5's lead depends on benchmark results that don't exist yet.
What's notable is how close this matchup is expected to be. Six months ago, Claude Fable 5 was a clear tier above GPT-5.5 on agentic capability. GPT-5.6's reported improvements would make this a genuine coin-flip race rather than a clear hierarchy.
GPT-5.6 vs Claude Fable 5: Quick Comparison
| Dimension | GPT-5.6 (expected) | Claude Fable 5 |
| --- | --- | --- |
| Input price (per 1M) | ~$5.00–$6.00 | $10.00 |
| Output price (per 1M) | ~$30.00–$35.00 | $50.00 |
| Context window | ~1.5M tokens | 200K tokens |
| SWE-bench Verified | ~87–89% (estimate) | ~87% |
| Agentic task completion | Improved (TBD) | Strong |
| FrontierMath Tier 4 | ~40% (estimate) | ~36% (estimate) |
| Training cutoff | ~May 2026 | ~Mar 2026 |
| Multimodal | Strong | Strong |
| Self-hosting | No | No |
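At these expected specs, the pricing story is significant. At the low end of the expected range ($5/$30 against Fable 5's $10/$50), GPT-5.6 would cost 50% as much per input token and 60% as much per output token. If it delivers frontier-comparable capability at that price, the enterprise default for high-volume agentic workloads shifts. Teams spending $50,000/month on Fable 5 could potentially run the same workloads on GPT-5.6 for $25,000–$30,000, depending on how that spend splits between input and output tokens.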
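How much a team would actually save depends on the input/output mix of its spend. The sketch below rescales a monthly bill from one price pair to another, assuming the same token volumes on both models (an assumption, since token efficiency may differ); `rescaled_spend` and `input_share` are illustrative names.

```python
# Rescale a monthly bill from one (input, output) price pair to another.
# ASSUMPTION: the workload uses the same number of tokens on both models.

FABLE_5 = (10.00, 50.00)       # USD per 1M input / output tokens
GPT_5_6_LOW = (5.00, 30.00)    # low end of the expected range
GPT_5_6_HIGH = (6.00, 35.00)   # high end of the expected range


def rescaled_spend(spend: float, input_share: float,
                   old: tuple[float, float], new: tuple[float, float]) -> float:
    """New spend, where input_share is the fraction of `spend` paid for input tokens."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be in [0, 1]")
    old_in, old_out = old
    new_in, new_out = new
    return spend * (input_share * new_in / old_in
                    + (1.0 - input_share) * new_out / old_out)


if __name__ == "__main__":
    for share in (0.0, 0.5, 1.0):
        low = rescaled_spend(50_000, share, FABLE_5, GPT_5_6_LOW)
        high = rescaled_spend(50_000, share, FABLE_5, GPT_5_6_HIGH)
        print(f"input share {share:.0%}: ${low:,.0f} to ${high:,.0f}")
```

At $5/$30 the result runs from $30,000 (all spend on output) down to $25,000 (all spend on input); at the high end of the expected range, $6/$35, the same bill lands between $35,000 and $30,000.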
What This Means for Developers Right Now
If you're currently on GPT-5.5: The upgrade case for GPT-5.6 is strong if you're doing agentic work or long-context tasks. For single-turn quality, the upgrade is marginal—you can wait for benchmark confirmation before migrating.
If you're currently on Claude Fable 5: Watch the first independent benchmark results closely when GPT-5.6 launches. The context window advantage alone (1.5M vs 200K) is material for certain workloads. On coding benchmarks, if GPT-5.6 matches Fable 5 at roughly half the price, the ROI calculation for high-volume use cases changes.
If you're building something new: Hold off on committing to either model until GPT-5.6 official benchmarks are published. A model at GPT-5.5's price point with Fable 5-class capability changes the math significantly.
If you're considering local open-source models: GPT-5.6 and Claude Fable 5 competing on context window and agentic capability doesn't change the underlying economics for the 70–80% of tasks where open-weight models like Qwen3 235B or DeepSeek V3 are already good enough. The frontier race is relevant for the hardest agentic and reasoning tasks; most practical workflows are better served by matching the right open model to the task.
The Bigger OpenAI Cadence Picture
GPT-5.6 is not an event; it is a data point in a pattern. OpenAI has compressed its release cadence to roughly 60 days between incremental model updates. This means:
GPT-5.5 was about nine weeks old (64 days after its April 23 launch) when the GPT-5.6 preview went live on June 26
A GPT-5.7 would be expected in August 2026
The frontier model you adopt in January may be two model generations behind by July
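The cadence arithmetic can be reproduced from the two dates given in this article. The projection below is a naive extrapolation that assumes the next gap matches either the last one or a flat 60 days; it is not a forecast, and the function names are illustrative.

```python
from datetime import date, timedelta

GPT_5_5_RELEASE = date(2026, 4, 23)  # launch date stated in this article
GPT_5_6_PREVIEW = date(2026, 6, 26)  # preview date stated in this article


def gap_days(earlier: date, later: date) -> int:
    """Whole days between two releases."""
    return (later - earlier).days


def project_next(last_release: date, gap: int) -> date:
    """Naive projection: assume the next release follows after the same gap."""
    return last_release + timedelta(days=gap)


if __name__ == "__main__":
    gap = gap_days(GPT_5_5_RELEASE, GPT_5_6_PREVIEW)
    print(gap, round(gap / 7, 1))             # 64 9.1
    print(project_next(GPT_5_6_PREVIEW, gap))  # 2026-08-29
    print(project_next(GPT_5_6_PREVIEW, 60))   # 2026-08-25
```

Either assumption lands in late August 2026, consistent with the expectation above for a GPT-5.7.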
This cadence creates a different kind of lock-in pressure than before. Rather than committing to a model and trusting it for a year, enterprise AI teams are now managing rolling model upgrades, regression testing, and prompt compatibility across update cycles of roughly two months.
The teams managing this most effectively in 2026 are those with model abstraction layers in their AI infrastructure—routing specific task types through specific models and swapping models at the routing layer without rewriting application logic. Whether GPT-5.6 beats Fable 5 matters less if your architecture allows you to swap in the winner within a week of benchmark publication.
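A model abstraction layer of this kind can be quite small. The sketch below is one possible shape, not a reference design: backends are plain callables so that no vendor SDK is assumed, and the names `ModelRouter`, `register`, `assign`, and `run`, along with the model identifiers, are hypothetical.

```python
from typing import Callable

# A backend is any function that takes a prompt and returns a completion.
ModelFn = Callable[[str], str]


class ModelRouter:
    """Maps task types to model backends so models can be swapped in one place."""

    def __init__(self) -> None:
        self._backends: dict[str, ModelFn] = {}
        self._routes: dict[str, str] = {}

    def register(self, model_id: str, backend: ModelFn) -> None:
        """Make a backend available under a model identifier."""
        self._backends[model_id] = backend

    def assign(self, task_type: str, model_id: str) -> None:
        """Route a task type to a registered model; re-assign to swap models."""
        if model_id not in self._backends:
            raise KeyError(f"unknown model: {model_id}")
        self._routes[task_type] = model_id

    def run(self, task_type: str, prompt: str) -> str:
        """Send the prompt to whichever model currently serves this task type."""
        return self._backends[self._routes[task_type]](prompt)


if __name__ == "__main__":
    router = ModelRouter()
    # Stub backends stand in for real API clients (hypothetical identifiers).
    router.register("model-a", lambda p: f"[model-a] {p}")
    router.register("model-b", lambda p: f"[model-b] {p}")

    router.assign("agentic_coding", "model-a")
    print(router.run("agentic_coding", "fix the failing test"))

    # Swapping the winner in is a one-line routing change; callers are untouched.
    router.assign("agentic_coding", "model-b")
    print(router.run("agentic_coding", "fix the failing test"))
```

In practice each backend would wrap a vendor client and normalize its request and response formats, and the swap would be gated on the regression tests mentioned above.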
Timeline: What to watch for
June 26, 2026 — limited preview live (Codex + API, trusted partners).
Coming weeks — OpenAI plans general availability (ChatGPT, Codex, API) and an expanded benchmark suite.
July 2026 — Cerebras Sol deployment (up to 750 tps) for select customers.
Parallel track — U.S. cyber EO framework and repeatable release process; OpenAI says it does not want permanent per-customer government approval.
Until GA, most users stay on GPT-5.5. When your tier unlocks, benchmark Terra for cost and Sol for agentic pipelines, and weigh both against the Fable 5 comparison above.
Updated June 27, 2026 with OpenAI's official GPT-5.6 preview announcement. Pre-release leak sections retained for context. Verify availability on openai.com. Full launch guide: GPT-5.6 Sol, Terra, Luna.