On June 3, 2026, Mandy Lu—a Google AI researcher with a Stanford PhD in computational mathematics—posted a simple statement on X (formerly Twitter) that ignited a viral debate:
"we still have no satisfying theory for why AI works"
The post struck a nerve. With over 80 reactions and hundreds of replies, it exposed an uncomfortable truth in the AI research community:
Although transformers power ChatGPT, Codex, and most major AI breakthroughs since 2017, no one fully understands why they work so well.
We have scaling laws that predict performance. We have mechanistic interpretability that maps features. We have emergent abilities that appear unpredictably.
But we don't have a unified theory explaining why massive models trained on internet-scale data can reason, code, and create.
It's like building a rocket to the moon using thermodynamics—but without understanding atoms.
TL;DR
| Topic | Key Facts |
|---|---|
| The Problem | No satisfying theory explains why transformers trained on massive datasets perform reasoning, coding, and creative tasks so effectively. |
| Scaling Laws | Predict performance gains from compute, data, and parameters—but not the underlying reasons why scaling works. |
| Emergent Abilities | Capabilities (reasoning, in-context learning) appear in large models unpredictably; researchers debate whether emergence is real or a measurement artifact. |
| Mechanistic Interpretability | Reverse-engineers neural networks to map features and pathways; named one of MIT Technology Review's 10 Breakthrough Technologies for 2026. |
| Current State | AI works empirically (thermodynamics) but lacks fundamental theory (atomic physics). Practical results prioritized over theory. |
| Debate Context | Sparked by Mandy Lu (Google AI, Stanford PhD) on X; reflects broader tension in AI research community. |
Who is Mandy Lu?
Mandy Lu is an AI researcher at Google working on health and climate applications. She is enrolled in Stanford University's PhD program in Computational and Mathematical Engineering, admitted Autumn 2025.
Her academic background includes:
- Dual bachelor's degree in Math and Computer Science from Stanford
- Master's degree with concentration in AI from Stanford
- Research in the Stanford Vision Lab advised by Prof. Fei-Fei Li and Prof. Juan Carlos Niebles
- Work in the Computational Neuroscience Laboratory under Prof. Ehsan Adeli and Prof. Kilian Pohl
Her research projects focus on:
- Using computer vision to develop systems for Parkinson's disease assessment
- Developing statistical techniques to reduce the effect of confounding variables on ML models
Her interdisciplinary background in computational mathematics, neuroscience, and AI makes her uniquely positioned to identify gaps between empirical results and theoretical understanding.
The Statement That Sparked the Debate
On June 3, 2026, Lu posted:
"we still have no satisfying theory for why AI works"
The simplicity and directness of the statement resonated across the AI research community, tech workers, and skeptics alike.
Replies ranged from:
- Technical explanations of scaling laws and mechanistic interpretability
- Comparisons to thermodynamics before atomic theory
- Debates over whether practical results matter more than theory
- Criticisms that AI is overhyped and doesn't actually "work" as claimed
The discussion exposed a fundamental tension in AI research: we can build increasingly powerful systems, but we can't fully explain them.
What We Know: The Empirical Evidence
Before exploring what we don't understand, let's establish what we do know.
1. Transformers Work Empirically
Since the "Attention Is All You Need" paper (Vaswani et al., 2017), transformers have dominated AI:
- Large Language Models (GPT-4, Claude, Gemini)
- Vision Models (CLIP, SAM, DALL-E)
- Multimodal Models (GPT-4o, Gemini 1.5)
- Code Generation (Codex, GitHub Copilot)
- Scientific Discovery (AlphaFold 3, protein design)
The architecture works—undeniably and reproducibly.
2. Scaling Laws are Predictable
Research from OpenAI (2020) and DeepMind's Chinchilla (2022) established that model performance follows power-law relationships with:
- Model size (number of parameters)
- Dataset size (number of training tokens)
- Compute (FLOPs during training)
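By fitting these relationships on smaller training runs, you can estimate the pretraining loss of a 100B parameter model trained on 2T tokens before you train it. Loss is far easier to forecast this way than performance on specific downstream tasks.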
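To make the idea of forecasting loss concrete, here is a minimal sketch of the parametric form used in the Chinchilla paper, L(N, D) = E + A/N^α + B/D^β. The constants are approximately the fitted values reported there and are used only for illustration; the function name `predicted_loss` is an assumption of this sketch, and any real setup would need its own fit.

```python
# Parametric loss form from Hoffmann et al. (2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# Constants are approximately the fitted values reported in that paper and are
# used here for illustration only; a different setup would need its own fit.
E = 1.69                  # irreducible loss term
A, ALPHA = 406.4, 0.34    # model-size term
B, BETA = 410.7, 0.28     # dataset-size term


def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Estimate pretraining loss for n_params parameters trained on n_tokens tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


if __name__ == "__main__":
    # Walk up three scales, ending at the 100B-parameter / 2T-token example.
    for n_params, n_tokens in [(1e9, 2e10), (1e10, 2e11), (1e11, 2e12)]:
        loss = predicted_loss(n_params, n_tokens)
        print(f"N={n_params:.0e}, D={n_tokens:.0e} -> estimated loss {loss:.3f}")
```

The formula predicts a number, but nothing in it says why the exponents take these values, which is the gap the rest of this article is about.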
3. Emergent Abilities Appear
As models scale, new capabilities appear that weren't present in smaller models:
- In-context learning (learning from examples in the prompt)
- Chain-of-thought reasoning (step-by-step problem solving)
- Instruction following (understanding and executing complex requests)
- Multi-step planning (breaking down tasks)
These abilities emerge at certain scale thresholds—but we can't predict exactly when or why.
4. Mechanistic Interpretability is Advancing
Researchers can now:
- Identify features corresponding to recognizable concepts (Anthropic's "dictionary learning")
- Trace pathways a model takes from prompt to response
- Intervene on specific attention heads to control model behavior
- Map geometric representations of knowledge in high-dimensional space
MIT Technology Review named mechanistic interpretability one of its 10 Breakthrough Technologies for 2026.
What We Don't Know: The Theory Gap
Despite these advances, fundamental questions remain unanswered.
1. Why Does Self-Attention Work So Well?
The self-attention mechanism is the core of transformers. It allows the model to:
- Look at every token in a sequence simultaneously
- Compute relationships between all tokens
- Parallelize computation across GPUs
We know how it works mathematically. We can implement it. We can optimize it.
But why does this particular mechanism enable reasoning, creativity, and generalization?
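The "how" really is compact. The sketch below implements single-head scaled dot-product attention in plain Python, assuming no masking, no learned projection matrices, and no batching; the function names are illustrative, not from any library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V, one head, no mask."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token vectors
    # Self-attention: every token attends to every token, including itself.
    for row in attention(tokens, tokens, tokens):
        print([round(x, 3) for x in row])
```

Every step is a dot product, a softmax, or a weighted average. Nothing in the code hints at why stacking this operation and training it at scale yields reasoning-like behavior.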
2. Why Do Scaling Laws Hold?
Scaling laws are descriptive, not explanatory.
They tell us:
- "If you 10x the compute, performance improves by Y%"
They don't tell us:
- "Why does more compute lead to better reasoning?"
- "What changes in the model's internal structure as it scales?"
- "Why do power laws govern AI performance?"
As one researcher put it: "Scaling laws are like the ideal gas law. They predict behavior, but they're not a fundamental theory."
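The descriptive nature of scaling laws shows up in how they are obtained: by curve fitting. The sketch below fits y = a · x^(-b) with least squares in log-log space. The data is synthetic (generated from a known power law, not measured from any model), and the simplified form omits the irreducible-loss term that real fits usually include.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b) in log-log space; returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    slope = (sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
             / sum((u - mean_x) ** 2 for u in log_x))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic "compute vs. loss" points drawn from a known power law.
compute = [1e18, 1e19, 1e20, 1e21]
loss = [5.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, loss)

if __name__ == "__main__":
    print(f"fitted a={a:.3f}, b={b:.3f}")
    print(f"10x more compute multiplies loss by {10 ** -b:.3f}")
```

The fit tells you the exponent and lets you extrapolate. It offers no account of where the exponent comes from.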
3. Why Do Emergent Abilities Appear?
Emergent abilities are the most mysterious phenomenon in AI.
At some scale threshold, models suddenly:
- Learn to follow multi-step instructions
- Perform arithmetic they weren't explicitly trained on
- Reason through novel problems
Why?
Some researchers argue emergence is real—a phase transition in the model's internal representations.
Others argue it's a measurement artifact: we're using crude metrics that fail to capture gradual improvements in smaller models, making progress look sudden when it's actually continuous.
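The measurement-artifact argument can be illustrated with a toy calculation. Assume, purely for illustration, that a model gets each token of an answer right independently with probability p. Exact-match accuracy on a 10-token answer is then p^10, which stays near zero for most of the range and climbs steeply only once p is high, even though p itself improves smoothly.

```python
def exact_match(per_token_acc: float, answer_len: int) -> float:
    """Probability that all answer_len tokens are correct, assuming independent errors."""
    return per_token_acc ** answer_len


if __name__ == "__main__":
    # Per-token accuracy improving smoothly, as it might across model sizes.
    for p in [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]:
        print(f"per-token {p:.2f} -> exact match on 10 tokens {exact_match(p, 10):.3f}")
```

The independence assumption is a simplification, and this toy does not settle the debate. It only shows how an all-or-nothing metric can turn gradual improvement into an apparent jump.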
A 2026 paper proposed that LLMs are non-ergodic systems where capabilities emerge through discrete transitions guided by constraint interactions—but this is still a hypothesis, not a proven theory.
4. Why Does Pre-Training on Internet Data Work?
Models are trained on:
- Web pages (Reddit, Wikipedia, blogs)
- Code repositories (GitHub, Stack Overflow)
- Books (fiction, non-fiction, technical manuals)
Somehow, this leads to models that can:
- Diagnose diseases
- Write legal contracts
- Prove mathematical theorems
- Generate novel protein structures
Why does next-token prediction on internet text lead to general reasoning abilities?
As one X commenter put it:
"Tech VCs thought AI was sentient because they had never read books and hence the LLM was beyond anything they had ever seen."
The implication: LLMs might just be extremely good pattern matchers, not true reasoners. But if that's the case—why do they generalize so well?
The Thermodynamics Analogy
Multiple replies to Lu's post invoked a historical analogy:
"We're using AI like we used thermodynamics before we understood atomic theory."
In the 1800s, engineers built steam engines using thermodynamics. They could:
- Predict how much work an engine would produce
- Optimize efficiency
- Build machines that powered the Industrial Revolution
But they didn't understand why thermodynamics worked until the kinetic theory of gases and statistical mechanics explained heat and pressure in terms of molecular motion.
Similarly, AI researchers today can:
- Predict how models will perform (scaling laws)
- Optimize architectures (transformers, MoE, SSMs)
- Build systems that transform industries (ChatGPT, Codex)
But we lack the fundamental theory that explains why transformers work at a mechanistic, first-principles level.
The Mechanistic Interpretability Response
One camp of researchers argues we're making progress toward a theory through mechanistic interpretability.
What is Mechanistic Interpretability?
Mechanistic interpretability reverse-engineers neural networks to understand how AI actually thinks. It aims to uncover how a model computes outputs by analyzing:
- Weights (learned parameters)
- Neuron activations (what fires when)
- Information pathways (how data flows through layers)
Recent Breakthroughs
Anthropic has led this field with several major advances:
1. Feature Dictionary Learning (2024)
Anthropic announced a "microscope" that identified features corresponding to recognizable concepts:
- Features that activate for "Golden Gate Bridge"
- Features that activate for "code syntax errors"
- Features that activate for "sarcasm"
2. Circuit Tracing (2025)
Anthropic traced whole sequences of features and the path a model takes from prompt to response, showing:
- How models compose simpler features into complex concepts
- How attention heads route information
- How models perform multi-step reasoning
3. Targeted Intervention (2026)
Researchers demonstrated selective control of model behavior by:
- Suppressing toxic outputs
- Manipulating semantic content
- Enhancing factual accuracy
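As a rough picture of what intervening on a feature can mean, the sketch below treats a feature as a direction in activation space and either removes its component from an activation vector (ablation) or adds more of it (steering). This is a generic toy under that assumption, not the specific method of any lab; the vectors and function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def ablate(activation, direction):
    """Remove the component of the activation that lies along the feature direction."""
    unit = normalize(direction)
    coeff = dot(activation, unit)
    return [a - coeff * d for a, d in zip(activation, unit)]


def steer(activation, direction, strength):
    """Push the activation along the feature direction by the given strength."""
    unit = normalize(direction)
    return [a + strength * d for a, d in zip(activation, unit)]


if __name__ == "__main__":
    hidden = [0.9, -0.2, 1.5]      # hypothetical activation vector
    feature = [1.0, 0.0, 1.0]      # hypothetical feature direction
    print("ablated:", [round(x, 3) for x in ablate(hidden, feature)])
    print("steered:", [round(x, 3) for x in steer(hidden, feature, 2.0)])
```

In a real model the hard part is finding directions that correspond to meaningful concepts; the arithmetic of the intervention itself is this simple.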
Anthropic has stated its goal: "Reliably detect most AI model problems by 2027 using interpretability tools."
The Geometric Foundation
Recent research suggests knowledge is encoded as geometry in high-dimensional space.
Models represent concepts as vectors, and relationships between concepts correspond to geometric relationships (distances, angles, subspaces).
This helps explain:
- Word analogies (king - man + woman ≈ queen)
- Concept composition (combining features to form new ideas)
- Transfer learning (representations generalize across tasks)
But why does gradient descent on next-token prediction lead to these semantically meaningful geometric structures?
That's still an open question.
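The analogy arithmetic mentioned above can be shown with a toy example. The three-dimensional embeddings below are hand-built for illustration (axes loosely meaning "royalty", "male", "female"); real models learn vectors with hundreds or thousands of dimensions, and the analogy holds only approximately there.

```python
import math

# Hypothetical hand-built embeddings, not taken from any trained model.
EMBEDDINGS = {
    "king":   [1.0, 1.0, 0.0],
    "queen":  [1.0, 0.0, 1.0],
    "man":    [0.0, 1.0, 0.0],
    "woman":  [0.0, 0.0, 1.0],
    "prince": [0.8, 1.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the input words."""
    target = [x - y + z for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])]
    candidates = [w for w in EMBEDDINGS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))


if __name__ == "__main__":
    print(analogy("king", "man", "woman"))
```

Here the geometry was designed by hand, so the result is unsurprising. The open question is why training on next-token prediction produces similar structure without anyone designing it.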
The Scaling Law Debate
Another reply thread focused on scaling laws as a partial theory.
What Scaling Laws Tell Us
The OpenAI Scaling Laws paper (2020) and DeepMind's Chinchilla paper (2022) established:
- Loss decreases as a power law with:
  - Model size (N)
  - Dataset size (D)
  - Compute (C)
- Optimal allocation of compute requires balancing model size and data:
  - Training a 70B model on 1.4T tokens (Chinchilla) outperforms training a 280B model on 300B tokens (Gopher) at roughly the same compute budget
- Emergent abilities correlate with scale:
  - In-context learning improves predictably
  - Instruction following emerges at ~10B parameters
  - Chain-of-thought reasoning emerges at ~100B parameters
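A rough way to compare training budgets is the common approximation C ≈ 6 · N · D FLOPs for a dense transformer with N parameters trained on D tokens. The sketch below applies it to the 70B/1.4T configuration and to two larger-model configurations trained on 300B tokens. The approximation ignores attention cost and hardware efficiency, so treat the outputs as order-of-magnitude estimates.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute for a dense transformer: C ~ 6 * N * D."""
    return 6.0 * n_params * n_tokens


CONFIGS = {
    "70B params, 1.4T tokens": (70e9, 1.4e12),
    "175B params, 300B tokens": (175e9, 300e9),
    "280B params, 300B tokens": (280e9, 300e9),
}

if __name__ == "__main__":
    for name, (n_params, n_tokens) in CONFIGS.items():
        flops = training_flops(n_params, n_tokens)
        ratio = n_tokens / n_params
        print(f"{name}: ~{flops:.2e} FLOPs, {ratio:.1f} tokens per parameter")
```

Under this approximation, the 70B/1.4T and 280B/300B runs land within about 15% of each other (about 5.9e23 vs. 5.0e23 FLOPs), while the 175B/300B run uses roughly half that compute. The 70B run also sees 20 tokens per parameter, against roughly 1 to 2 for the larger models.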
What Scaling Laws Don't Tell Us
Scaling laws are phenomenological: they describe what happens, not why.
They don't explain:
- Why power laws govern AI performance
- Why the exponents have the values they do
- Why emergent abilities appear at specific thresholds
- What changes inside the model as it scales
A 2026 unified framework connected scaling laws to in-context learning emergence, showing that ICL performance follows power-law relationships with model depth, width, context length, and training data—but the exponents are determined by task structure.
This is progress toward theory, but still descriptive, not first-principles.
The "AI Doesn't Actually Work" Argument
Some replies pushed back on the premise, arguing AI doesn't work as well as claimed.
One X commenter wrote:
"I would be way more bullish on AI if it actually worked and was actually replacing real humans at scale. Nothing is changing and we're being lied to. The tools don't work! They are expensive! And high maintenance. Dot com bubble 2.0."
This reflects growing AI skepticism as enterprises face:
- Cost explosions (Microsoft banning Claude Code due to token costs)
- Unreliable outputs (arXiv imposing one-year bans for AI-generated errors)
- Overhyped capabilities (agentic fatigue)
The Counterargument
Others countered:
- AI augmentation is real (Garry Tan's 400x productivity with Claude Code)
- AI job impact is gradual (~11,000 net U.S. jobs lost monthly, not millions)
- AI science is advancing (AlphaFold 3, drug discovery, materials science)
The debate reflects a gap between hype and reality—but also genuine progress amid unrealistic expectations.
The Neuroscience Parallel
Some replies drew parallels to neuroscience, where we:
- Understand how neurons work (action potentials, synapses)
- Can map brain regions (fMRI, EEG)
- Still don't fully understand consciousness, reasoning, or memory
One commenter noted:
"We know how AI works. We don't fully know why it works as well as it does. That's an important distinction."
This mirrors neuroscience:
- We know how neurons fire
- We don't know why consciousness emerges
Similarly:
- We know how transformers compute (matrix multiplications, attention)
- We don't know why they generalize so well
The "Practical Results Matter More" Argument
A pragmatic camp argued: Who cares about theory if it works?
One reply stated:
"The real-world deployments showed augmentation, not replacement. Humans plus AI is the winning formula."
This reflects the engineering mindset: prioritize building useful systems over understanding fundamental mechanisms.
Historically, this has worked:
- Steam engines powered the Industrial Revolution before we understood thermodynamics
- Antibiotics saved millions before we understood molecular biology
- Vaccines worked before we understood immunology at the cellular level
But in each case, theory eventually caught up and enabled:
- Better engines (internal combustion, turbines)
- Better drugs (targeted therapies)
- Better vaccines (mRNA vaccines)
Why Theory Matters for AI
Understanding why AI works isn't just academic—it has practical implications.
1. Safety and Alignment
If we don't understand why models produce certain outputs, we can't:
- Predict when models will fail
- Detect deceptive or harmful behavior
- Align models with human values reliably
Anthropic's interpretability research aims to solve this by 2027, but we're not there yet.
2. Efficiency
Understanding why scaling works could help us:
- Train smaller models that perform as well as larger ones
- Reduce compute costs (currently spiraling out of control)
- Design better architectures (moving beyond transformers)
State space models are an attempt to move beyond transformers, and Mixture of Experts layers modify them to cut compute per token, but both remain empirical experiments rather than theory-driven designs.
3. Generalization
Understanding why pre-training on internet data leads to general reasoning could help us:
- Design better training data
- Improve out-of-distribution generalization
- Reduce hallucinations and errors
4. Scientific Discovery
A theoretical understanding could:
- Accelerate AI progress (rather than trial-and-error scaling)
- Unlock new capabilities (analogous to quantum mechanics enabling semiconductors)
- Predict limits (what AI can and can't do)
The Current State of AI Theory
So where are we now?
What We Have
- Scaling laws that predict performance
- Mechanistic interpretability that maps features and circuits
- Emergent abilities that correlate with scale
- Geometric interpretations of knowledge representation
What We're Missing
- A unified theory explaining why transformers work
- First-principles understanding of emergent abilities
- Predictive models of what capabilities will appear at what scale
- Fundamental limits of current architectures
Current Research Directions
2026 breakthroughs include:
- Non-ergodic frameworks for emergence
- Unified scaling law theories connecting ICL and model size
- Circuit-level interventions for targeted control
- Geometric theories of knowledge representation
But none of these constitute a complete, first-principles theory.
What Happens Next?
Three scenarios:
Scenario 1: Theory Catches Up
AI research develops a unified theory (like statistical mechanics for thermodynamics) that explains:
- Why transformers work
- Why scaling laws hold
- How emergent abilities appear
This enables theory-driven AI design and predictable capabilities.
Likelihood: Medium. Mechanistic interpretability is making progress, but we're not close to a unified theory yet.
Scenario 2: Empirical Progress Continues Without Theory
AI systems keep improving through:
- Scaling (bigger models, more data)
- Architectural tweaks (MoE, SSMs, new attention mechanisms)
- Empirical optimization (RLHF, better training techniques)
Theory lags behind, but practical results drive adoption.
Likelihood: High. This is the current trajectory.
Scenario 3: Scaling Hits a Wall
Without theoretical understanding, we:
- Can't predict when scaling will stop working
- Can't design better architectures
- Hit diminishing returns on compute investment
AI progress slows dramatically as empirical scaling plateaus.
Likelihood: Low-Medium. Some evidence of diminishing returns emerging, but not a hard wall yet.
What This Means for Developers
If you're building with AI, here's what Lu's observation means:
1. Expect Unpredictability
Without a theory, you can't fully predict when models will:
- Fail on edge cases
- Hallucinate confidently
- Refuse valid requests
Design systems with human oversight and fallbacks.
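One way to build in oversight and fallbacks is to validate every model output and route anything that fails to a human. The sketch below assumes a hypothetical `call_model` function supplied by the caller (standing in for whatever client you use) and a caller-supplied validity check; none of these names come from a real library.

```python
import json
from typing import Callable


def is_json_object(text: str) -> bool:
    """Example validity check: the output must parse as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except (json.JSONDecodeError, TypeError):
        return False


def answer_with_fallback(prompt: str,
                         call_model: Callable[[str], str],
                         is_valid: Callable[[str], bool],
                         max_attempts: int = 2) -> dict:
    """Try the model a few times; escalate to human review if no output validates."""
    for _ in range(max_attempts):
        try:
            output = call_model(prompt)  # hypothetical model client
        except Exception:
            continue  # treat a failed call like an invalid output and retry
        if is_valid(output):
            return {"status": "ok", "output": output}
    return {"status": "needs_human_review", "output": None}


if __name__ == "__main__":
    def fake_model(prompt: str) -> str:  # stand-in for a real model call
        return '{"answer": 42}'

    print(answer_with_fallback("What is 6 x 7?", fake_model, is_json_object))
```

The validity check here only verifies structure. In practice you would add checks specific to your domain, since a well-formed answer can still be wrong.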
2. Empirical Testing is Critical
Since theory can't predict behavior, test extensively:
- Red-teaming
- Adversarial examples
- Real-world deployments
3. Stay Updated on Interpretability
Mechanistic interpretability tools are improving. Follow:
- Anthropic's research (leading the field)
- OpenAI's safety work
- Academic conferences (NeurIPS, ICML)
4. Prepare for Paradigm Shifts
If a theoretical breakthrough happens, it could:
- Make current architectures obsolete (as transformers did to RNNs)
- Unlock new capabilities (as attention did for sequence modeling)
- Change cost structures (making current systems cheaper or obsolete)
Related Reading
- Anthropic Natural Language Autoencoders — Latest interpretability research
- Gary Tan's 400x Productivity with Claude Code — AI augmentation in practice
- Sam Altman and Dario Amodei Walk Back AI Jobs Apocalypse — Reality check on AI impact
- Agentic Fatigue and the Productivity Paradox — When AI disappoints
- Scalable Oversight: RLHF, Constitutional AI, Weak-to-Strong — AI alignment techniques
Conclusion
Mandy Lu's statement—"we still have no satisfying theory for why AI works"—is both alarming and accurate.
We've built systems that:
- Pass medical exams
- Write production code
- Reason through novel problems
- Generate photorealistic images
But we can't fully explain why they work.
This isn't just an academic curiosity. Without theory, we:
- Can't predict failures
- Can't guarantee safety
- Can't optimize efficiency
- Can't anticipate limits
We're flying blind on the most powerful technology of the 21st century.
The good news: progress is happening. Mechanistic interpretability, scaling law research, and geometric theories are advancing. Anthropic aims to "reliably detect most AI model problems by 2027."
The bad news: we're not there yet. And in the meantime, billions of dollars and critical decisions depend on systems we don't fully understand.
For developers, the lesson is clear: build with humility. AI is powerful, but unpredictable. Test rigorously. Keep humans in the loop. Stay updated on interpretability research.
And watch for the breakthrough—when it comes, it could change everything.
Sources
- Mandy Lu | Stanford University
- Mandy Lu | Google Scholar
- Mechanistic interpretability: 10 Breakthrough Technologies 2026 | MIT Technology Review
- Mechanistic Interpretability Explained | Taskade
- Emergent Abilities of Large Language Models | arXiv
- Large language models and emergence: a complex systems perspective | Royal Society
- Scaling Laws and In-Context Learning: A Unified Framework | arXiv
- The Scaling Law Imperative: AI Infrastructure Demand | Science & Technology News
- A non-ergodic framework for understanding emergent capabilities in LLMs | arXiv
AI theory and interpretability research evolve rapidly. This analysis reflects the state of knowledge as of June 2026. For the latest research, follow Anthropic Research, OpenAI Research, and leading ML conferences.