Google AI Researcher Sparks Debate: We Still Don't Know Why AI Works So Well

Mandy Lu (Stanford PhD, Google AI) ignites discussion on X by stating 'we still have no satisfying theory for why AI works'—exposing the gap between transformers' empirical success and our theoretical understanding of scaling laws, emergent abilities, and mechanistic interpretability.

15 min read · Yash Thakker
Tags: AI Theory, Mechanistic Interpretability, Scaling Laws, Transformer Architecture, AI Research, Google AI

On June 3, 2026, Mandy Lu—a Google AI researcher with a Stanford PhD in computational mathematics—posted a simple statement on X (formerly Twitter) that ignited a viral debate:

"we still have no satisfying theory for why AI works"

The post struck a nerve. With over 80 reactions and hundreds of replies, it exposed an uncomfortable truth in the AI research community:

Despite transformers powering ChatGPT, Codex, and most major AI breakthroughs since 2017, no one fully understands why they work so well.

We have scaling laws that predict performance. We have mechanistic interpretability that maps features. We have emergent abilities that appear unpredictably.

But we don't have a unified theory explaining why massive models trained on internet-scale data can reason, code, and create.

It's like building a rocket to the moon using thermodynamics—but without understanding atoms.

TL;DR

| Topic | Key Facts |
| --- | --- |
| The Problem | No satisfying theory explains why transformers trained on massive datasets perform reasoning, coding, and creative tasks so effectively. |
| Scaling Laws | Predict performance gains from compute, data, and parameters, but not the underlying reasons why scaling works. |
| Emergent Abilities | Capabilities (reasoning, in-context learning) appear in large models unpredictably; researchers debate whether emergence is real or a measurement artifact. |
| Mechanistic Interpretability | Reverse-engineers neural networks to map features and pathways; named one of MIT Technology Review's 10 Breakthrough Technologies for 2026. |
| Current State | AI works empirically (thermodynamics) but lacks fundamental theory (atomic physics). Practical results are prioritized over theory. |
| Debate Context | Sparked by Mandy Lu (Google AI, Stanford PhD) on X; reflects a broader tension in the AI research community. |

Who is Mandy Lu?

Mandy Lu is an AI researcher at Google working on health and climate applications. She holds a PhD in Computational and Mathematical Engineering from Stanford University, admitted Autumn 2025.

Her academic background includes:

  • Dual bachelor's degree in Math and Computer Science from Stanford
  • Master's degree with concentration in AI from Stanford
  • Research in the Stanford Vision Lab advised by Prof. Fei-Fei Li and Prof. Juan Carlos Niebles
  • Work in the Computational Neuroscience Laboratory under Prof. Ehsan Adeli and Prof. Kilian Pohl

Her research projects focus on:

  • Using computer vision to develop systems for Parkinson's disease assessment
  • Developing statistical techniques to reduce the effect of confounding variables on ML models

Her interdisciplinary background in computational mathematics, neuroscience, and AI makes her uniquely positioned to identify gaps between empirical results and theoretical understanding.

The Statement That Sparked the Debate

On June 3, 2026, Lu posted:

"we still have no satisfying theory for why AI works"

The simplicity and directness of the statement resonated across the AI research community, tech workers, and skeptics alike.

Replies ranged from:

  • Technical explanations of scaling laws and mechanistic interpretability
  • Comparisons to thermodynamics before atomic theory
  • Debates over whether practical results matter more than theory
  • Criticisms that AI is overhyped and doesn't actually "work" as claimed

The discussion exposed a fundamental tension in AI research: we can build increasingly powerful systems, but we can't fully explain them.

What We Know: The Empirical Evidence

Before exploring what we don't understand, let's establish what we do know.

1. Transformers Work Empirically

Since the "Attention Is All You Need" paper (Vaswani et al., 2017), transformers have dominated AI:

  • Large Language Models (GPT-4, Claude, Gemini)
  • Vision Models (CLIP, SAM, DALL-E)
  • Multimodal Models (GPT-4o, Gemini 1.5)
  • Code Generation (Codex, GitHub Copilot)
  • Scientific Discovery (AlphaFold 3, protein design)

The architecture works—undeniably and reproducibly.

2. Scaling Laws are Predictable

Research from OpenAI (2020) and DeepMind's Chinchilla (2022) established that model performance follows power-law relationships with:

  • Model size (number of parameters)
  • Dataset size (number of training tokens)
  • Compute (FLOPs during training)

In practice, you can estimate the loss of, say, a 100B-parameter model trained on 2T tokens before you train it, by extrapolating from smaller training runs.
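As a rough illustration of what "predicting before training" looks like, the sketch below evaluates the parametric loss form reported in the Chinchilla paper, L(N, D) = E + A/N^α + B/D^β. The constants are the fitted values published by Hoffmann et al. (2022) for their own data and tokenizer; assuming they transfer to any other setup is exactly that, an assumption, and the function name is illustrative.

```python
# Parametric loss form from the Chinchilla paper (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# The constants below are that paper's published fit. They are specific to
# its training setup, so treat any number this prints as illustrative only.
E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28


def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Estimate pre-training loss from parameter count and token count."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA


# The example from the text: 100B parameters, 2T tokens.
loss = predicted_loss(100e9, 2e12)
print(f"estimated loss: {loss:.3f}")
```

Note what the formula does not contain: any term describing what the model has learned. It is a curve fitted to observed losses.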

3. Emergent Abilities Appear

As models scale, new capabilities appear that weren't present in smaller models:

  • In-context learning (learning from examples in the prompt)
  • Chain-of-thought reasoning (step-by-step problem solving)
  • Instruction following (understanding and executing complex requests)
  • Multi-step planning (breaking down tasks)

These abilities emerge at certain scale thresholds—but we can't predict exactly when or why.

4. Mechanistic Interpretability is Advancing

Researchers can now:

  • Identify features corresponding to recognizable concepts (Anthropic's "dictionary learning")
  • Trace pathways a model takes from prompt to response
  • Intervene on specific attention heads to control model behavior
  • Map geometric representations of knowledge in high-dimensional space

MIT Technology Review named mechanistic interpretability one of its 10 Breakthrough Technologies for 2026.

What We Don't Know: The Theory Gap

Despite these advances, fundamental questions remain unanswered.

1. Why Does Self-Attention Work So Well?

The self-attention mechanism is the core of transformers. It allows the model to:

  • Look at every token in a sequence simultaneously
  • Compute relationships between all tokens
  • Parallelize computation across GPUs

We know how it works mathematically. We can implement it. We can optimize it.
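To make "we can implement it" concrete, here is a minimal single-head sketch of scaled dot-product attention in plain Python. It deliberately omits the learned query, key, and value projections, masking, and multiple heads that real transformers use; the toy token vectors are made up for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score this query against every key in the sequence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        dim = len(values[0])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# Three toy "token" vectors; a real model would first apply learned projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(tokens, tokens, tokens)
print(out)
```

The whole mechanism is a few dot products and a weighted average, which is part of why its effectiveness is hard to explain from the math alone.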

But why does this particular mechanism enable reasoning, creativity, and generalization?

2. Why Do Scaling Laws Hold?

Scaling laws are descriptive, not explanatory.

They tell us:

  • "If you 10x the compute, performance improves by Y%"

They don't tell us:

  • "Why does more compute lead to better reasoning?"
  • "What changes in the model's internal structure as it scales?"
  • "Why do power laws govern AI performance?"

As one researcher put it: "Scaling laws are like the ideal gas law. They predict behavior, but they're not a fundamental theory."
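The "descriptive, not explanatory" point can be seen in how such a law is obtained. The sketch below fits y = a * x^b by ordinary least squares in log-log space. The data is synthetic, generated from a made-up exponent of -0.05, so the numbers carry no empirical meaning.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b via linear regression on logs."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    slope = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly)) / sum(
        (u - mean_x) ** 2 for u in lx
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope  # (a, b)


# Synthetic (compute, loss) pairs from a hypothetical law: loss = 5 * C**-0.05
compute = [1e18, 1e19, 1e20, 1e21]
loss = [5.0 * c**-0.05 for c in compute]

a, b = fit_power_law(compute, loss)
print(f"a = {a:.3f}, b = {b:.3f}")
```

The fit recovers the exponent and lets you extrapolate, but nothing in the procedure says why the exponent takes that value.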

3. Why Do Emergent Abilities Appear?

Emergent abilities are the most mysterious phenomenon in AI.

At some scale threshold, models suddenly:

  • Learn to follow multi-step instructions
  • Perform arithmetic they weren't explicitly trained on
  • Reason through novel problems

Why?

Some researchers argue emergence is real—a phase transition in the model's internal representations.

Others argue it's a measurement artifact: we're using crude metrics that fail to capture gradual improvements in smaller models, making progress look sudden when it's actually continuous.
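The measurement-artifact argument can be illustrated with a few lines of arithmetic. Suppose per-token accuracy improves smoothly with scale, but the benchmark only awards credit when every token of a 10-token answer is correct. Assuming token errors are independent (a simplification made here for illustration), the exact-match score is the per-token accuracy raised to the answer length.

```python
def exact_match_rate(per_token_accuracy: float, answer_length: int) -> float:
    """Probability that all tokens are right, assuming independent errors."""
    return per_token_accuracy**answer_length


# Hypothetical per-token accuracies for a series of increasingly large models.
per_token = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]

for p in per_token:
    print(f"per-token {p:.2f} -> exact match {exact_match_rate(p, 10):.3f}")
```

The per-token column rises steadily, while the exact-match column stays near zero for the first few models and then climbs steeply. The same underlying progress looks gradual or sudden depending on the metric.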

A 2026 paper proposed that LLMs are non-ergodic systems where capabilities emerge through discrete transitions guided by constraint interactions—but this is still a hypothesis, not a proven theory.

4. Why Does Pre-Training on Internet Data Work?

Models are trained on:

  • Web pages (Reddit, Wikipedia, blogs)
  • Code repositories (GitHub, Stack Overflow)
  • Books (fiction, non-fiction, technical manuals)

Somehow, this leads to models that can:

  • Diagnose diseases
  • Write legal contracts
  • Prove mathematical theorems
  • Generate novel protein structures

Why does next-token prediction on internet text lead to general reasoning abilities?

As one X commenter put it:

"Tech VCs thought AI was sentient because they had never read books and hence the LLM was beyond anything they had ever seen."

The implication: LLMs might just be extremely good pattern matchers, not true reasoners. But if that's the case—why do they generalize so well?

The Thermodynamics Analogy

Multiple replies to Lu's post invoked a historical analogy:

"We're using AI like we used thermodynamics before we understood atomic theory."

In the 1800s, engineers built steam engines using thermodynamics. They could:

  • Predict how much work an engine would produce
  • Optimize efficiency
  • Build machines that powered the Industrial Revolution

But they didn't understand why thermodynamics worked until the kinetic theory of gases and statistical mechanics explained heat and pressure in terms of molecular motion.

Similarly, AI researchers today can:

  • Predict how models will perform (scaling laws)
  • Optimize architectures (transformers, MoE, SSMs)
  • Build systems that transform industries (ChatGPT, Codex)

But we lack the fundamental theory that explains why transformers work at a mechanistic, first-principles level.

The Mechanistic Interpretability Response

One camp of researchers argues we're making progress toward a theory through mechanistic interpretability.

What is Mechanistic Interpretability?

Mechanistic interpretability reverse-engineers neural networks to understand how AI actually thinks. It aims to uncover how a model computes outputs by analyzing:

  • Weights (learned parameters)
  • Neuron activations (what fires when)
  • Information pathways (how data flows through layers)

Recent Breakthroughs

Anthropic has led this field with several major advances:

1. Feature Dictionary Learning (2024)

Anthropic announced a "microscope" that identified features corresponding to recognizable concepts:

  • Features that activate for "Golden Gate Bridge"
  • Features that activate for "code syntax errors"
  • Features that activate for "sarcasm"
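Anthropic's published dictionary-learning work uses sparse autoencoders: an activation vector is encoded into a larger set of mostly-zero feature activations and then reconstructed from them. The toy sketch below shows only that forward pass. The weights are hand-picked rather than trained, and the dimensions are far smaller than anything real, so it illustrates the shape of the idea and nothing more.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def encode(activation, encoder_rows, encoder_bias):
    """Feature activations: f_i = ReLU(w_i . x + b_i)."""
    return [
        relu(sum(w * a for w, a in zip(row, activation)) + b)
        for row, b in zip(encoder_rows, encoder_bias)
    ]


def decode(features, decoder_dirs):
    """Reconstruct the activation as a weighted sum of feature directions."""
    dim = len(decoder_dirs[0])
    return [sum(f * d[j] for f, d in zip(features, decoder_dirs)) for j in range(dim)]


# Hand-picked toy dictionary: 3 features for a 2-dimensional activation.
# A real sparse autoencoder learns these weights under a sparsity penalty.
decoder_dirs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
encoder_rows = decoder_dirs  # tied weights, chosen for simplicity
encoder_bias = [0.0, 0.0, 0.0]

activation = [0.8, 0.3]
features = encode(activation, encoder_rows, encoder_bias)
reconstruction = decode(features, decoder_dirs)
print(features, reconstruction)
```

The third feature stays at zero for this input, which is the sparsity that makes individual features easier to interpret than raw neurons.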

2. Circuit Tracing (2025)

Anthropic traced whole sequences of features and the path a model takes from prompt to response, showing:

  • How models compose simpler features into complex concepts
  • How attention heads route information
  • How models perform multi-step reasoning

3. Targeted Intervention (2026)

Researchers demonstrated selective control of model behavior by:

  • Suppressing toxic outputs
  • Manipulating semantic content
  • Enhancing factual accuracy
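One common way such interventions are implemented is activation steering: adding or subtracting a scaled feature direction from a model's internal activation. The sketch below shows the vector arithmetic only. It is an assumption that this resembles the specific methods referred to above, and the vectors and function names are invented for illustration.

```python
def _norm(v):
    return sum(x * x for x in v) ** 0.5


def projection(activation, direction):
    """How strongly the activation points along the feature direction."""
    return sum(a * d for a, d in zip(activation, direction)) / _norm(direction)


def steer(activation, direction, strength):
    """Shift the activation along a unit-normalized feature direction.

    Positive strength amplifies the feature; negative strength suppresses it.
    """
    n = _norm(direction)
    return [a + strength * d / n for a, d in zip(activation, direction)]


activation = [1.0, 0.0]
feature_direction = [0.0, 2.0]  # hypothetical direction for some concept

boosted = steer(activation, feature_direction, 3.0)
suppressed = steer(boosted, feature_direction, -3.0)
print(projection(activation, feature_direction), projection(boosted, feature_direction))
```

In a real model the modified activation would be written back into the forward pass, and the effect on outputs would still have to be measured empirically.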

Anthropic has stated its goal: "Reliably detect most AI model problems by 2027 using interpretability tools."

The Geometric Foundation

Recent research suggests knowledge is encoded as geometry in high-dimensional space.

Models represent concepts as vectors, and relationships between concepts correspond to geometric relationships (distances, angles, subspaces).

This explains:

  • Word analogies (king - man + woman ≈ queen)
  • Concept composition (combining features to form new ideas)
  • Transfer learning (representations generalize across tasks)
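The word-analogy case is easy to reproduce with toy vectors. In the sketch below the two dimensions are hand-built to mean roughly "royalty" and "gender"; real embeddings are learned, have hundreds or thousands of dimensions, and only approximately satisfy such relations.

```python
import math

# Hand-built 2-d embeddings: [royalty, gender]. Purely illustrative.
EMBEDDINGS = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "prince": [0.7, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = [
        x - y + z for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    ]
    candidates = (w for w in EMBEDDINGS if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(EMBEDDINGS[w], target))


print(analogy("king", "man", "woman"))
```

Here the geometry was put in by hand. The open question in the text is why training produces comparable structure without anyone designing it.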

But why does gradient descent on next-token prediction lead to these semantically meaningful geometric structures?

That's still an open question.

The Scaling Law Debate

Another reply thread focused on scaling laws as a partial theory.

What Scaling Laws Tell Us

The OpenAI Scaling Laws paper (2020) and DeepMind's Chinchilla paper (2022) established:

  1. Loss decreases as a power law with:

    • Model size (N)
    • Dataset size (D)
    • Compute (C)
  2. Optimal allocation of compute requires balancing model size and data:

    • Training a 70B model on 1.4T tokens (Chinchilla) outperforms training a 280B model on 300B tokens (Gopher) at roughly the same compute budget
  3. Emergent abilities correlate with scale (reported in later work rather than in these two papers):

    • In-context learning improves predictably
    • Instruction following emerges at roughly 10B parameters
    • Chain-of-thought reasoning emerges at roughly 100B parameters
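The compute-allocation point can be checked with a widely used rule of thumb: training compute is approximately C ≈ 6·N·D FLOPs for N parameters and D tokens. The sketch below applies it to a 70B model on 1.4T tokens, and for comparison to 280B and 175B models on 300B tokens each (the first of these being the Gopher configuration that the Chinchilla paper compares against). The factor of 6 is an approximation, so only the rough ratios are meaningful.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute using the C ≈ 6 * N * D rule of thumb."""
    return 6.0 * n_params * n_tokens


configs = {
    "70B params, 1.4T tokens": training_flops(70e9, 1.4e12),
    "280B params, 300B tokens": training_flops(280e9, 300e9),
    "175B params, 300B tokens": training_flops(175e9, 300e9),
}

for name, flops in configs.items():
    print(f"{name}: {flops:.2e} FLOPs")
```

Under this estimate the 70B and 280B runs land within about 20% of each other, while the 175B run uses a little over half as much compute.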

What Scaling Laws Don't Tell Us

Scaling laws are phenomenological: they describe what happens, not why.

They don't explain:

  • Why power laws govern AI performance
  • Why the exponents have the values they do
  • Why emergent abilities appear at specific thresholds
  • What changes inside the model as it scales

A 2026 unified framework connected scaling laws to in-context learning emergence, showing that ICL performance follows power-law relationships with model depth, width, context length, and training data—but the exponents are determined by task structure.

This is progress toward theory, but still descriptive, not first-principles.

The "AI Doesn't Actually Work" Argument

Some replies pushed back on the premise, arguing AI doesn't work as well as claimed.

One X commenter wrote:

"I would be way more bullish on AI if it actually worked and was actually replacing real humans at scale. Nothing is changing and we're being lied to. The tools don't work! They are expensive! And high maintenance. Dot com bubble 2.0."

This reflects growing skepticism about AI among enterprises.

The Counterargument

Others in the thread pushed back on that view.

The debate reflects a gap between hype and reality—but also genuine progress amid unrealistic expectations.

The Neuroscience Parallel

Some replies drew parallels to neuroscience, where we:

  • Understand how neurons work (action potentials, synapses)
  • Can map brain regions (fMRI, EEG)
  • Still don't fully understand consciousness, reasoning, or memory

One commenter noted:

"We know how AI works. We don't fully know why it works as well as it does. That's an important distinction."

This mirrors neuroscience:

  • We know how neurons fire
  • We don't know why consciousness emerges

Similarly:

  • We know how transformers compute (matrix multiplications, attention)
  • We don't know why they generalize so well

The "Practical Results Matter More" Argument

A pragmatic camp argued: Who cares about theory if it works?

One reply stated:

"The real-world deployments showed augmentation, not replacement. Humans plus AI is the winning formula."

This reflects the engineering mindset: prioritize building useful systems over understanding fundamental mechanisms.

Historically, this has worked:

  • Steam engines powered the Industrial Revolution before we understood thermodynamics
  • Antibiotics saved millions before we understood molecular biology
  • Vaccines worked before we understood immunology at the cellular level

But in each case, theory eventually caught up and enabled:

  • Better engines (internal combustion, turbines)
  • Better drugs (targeted therapies)
  • Better vaccines (mRNA vaccines)

Why Theory Matters for AI

Understanding why AI works isn't just academic—it has practical implications.

1. Safety and Alignment

If we don't understand why models produce certain outputs, we can't:

  • Predict when models will fail
  • Detect deceptive or harmful behavior
  • Align models with human values reliably

Anthropic's interpretability research aims to solve this by 2027, but we're not there yet.

2. Efficiency

Understanding why scaling works could help us:

  • Train smaller models that perform as well as larger ones
  • Reduce compute costs (currently spiraling out of control)
  • Design better architectures (moving beyond transformers)

State Space Models are one attempt to move beyond transformers, and Mixture of Experts architectures modify the transformer's feed-forward layers to cut compute per token. Both are still empirical experiments, not theory-driven designs.

3. Generalization

Understanding why pre-training on internet data leads to general reasoning could help us:

  • Design better training data
  • Improve out-of-distribution generalization
  • Reduce hallucinations and errors

4. Scientific Discovery

A theoretical understanding could:

  • Accelerate AI progress (rather than trial-and-error scaling)
  • Unlock new capabilities (analogous to quantum mechanics enabling semiconductors)
  • Predict limits (what AI can and can't do)

The Current State of AI Theory

So where are we now?

What We Have

  1. Scaling laws that predict performance
  2. Mechanistic interpretability that maps features and circuits
  3. Emergent abilities that correlate with scale
  4. Geometric interpretations of knowledge representation

What We're Missing

  1. A unified theory explaining why transformers work
  2. First-principles understanding of emergent abilities
  3. Predictive models of what capabilities will appear at what scale
  4. Fundamental limits of current architectures

Current Research Directions

Active research directions in 2026 include:

  • Non-ergodic frameworks for emergence
  • Unified scaling law theories connecting ICL and model size
  • Circuit-level interventions for targeted control
  • Geometric theories of knowledge representation

But none of these constitute a complete, first-principles theory.

What Happens Next?

Three scenarios:

Scenario 1: Theory Catches Up

AI research develops a unified theory (like statistical mechanics for thermodynamics) that explains:

  • Why transformers work
  • Why scaling laws hold
  • How emergent abilities appear

This enables theory-driven AI design and predictable capabilities.

Likelihood: Medium. Mechanistic interpretability is making progress, but we're not close to a unified theory yet.

Scenario 2: Empirical Progress Continues Without Theory

AI systems keep improving through:

  • Scaling (bigger models, more data)
  • Architectural tweaks (MoE, SSMs, new attention mechanisms)
  • Empirical optimization (RLHF, better training techniques)

Theory lags behind, but practical results drive adoption.

Likelihood: High. This is the current trajectory.

Scenario 3: Scaling Hits a Wall

Without theoretical understanding, we:

  • Can't predict when scaling will stop working
  • Can't design better architectures
  • Hit diminishing returns on compute investment

AI progress slows dramatically as empirical scaling plateaus.

Likelihood: Low-Medium. Some evidence of diminishing returns is emerging, but there is no hard wall yet.

What This Means for Developers

If you're building with AI, here's what Lu's observation means:

1. Expect Unpredictability

Without a theory, you can't fully predict when models will:

  • Fail on edge cases
  • Hallucinate confidently
  • Refuse valid requests

Design systems with human oversight and fallbacks.
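A minimal sketch of that pattern follows: call the model, validate the answer, retry a bounded number of times, and hand off to a human when nothing passes. The `model` and `is_valid` callables are placeholders you would supply; none of the names here come from a real SDK.

```python
from typing import Callable, Optional


def answer_with_fallback(
    question: str,
    model: Callable[[str], str],
    is_valid: Callable[[str], bool],
    max_attempts: int = 2,
) -> Optional[str]:
    """Return a validated model answer, or None to signal human escalation."""
    for _ in range(max_attempts):
        try:
            answer = model(question)
        except Exception:
            # Treat any model or transport error as a failed attempt.
            continue
        if is_valid(answer):
            return answer
    return None  # Caller should route this request to a human reviewer.


# Hypothetical stand-ins for a real model client and a domain-specific check.
def toy_model(question: str) -> str:
    return "42" if "answer" in question else ""


result = answer_with_fallback("What is the answer?", toy_model, lambda a: a.isdigit())
print(result if result is not None else "escalate to human")
```

Returning `None` rather than a best guess keeps the escalation decision explicit at the call site.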

2. Empirical Testing is Critical

Since theory can't predict behavior, test extensively:

  • Red-teaming
  • Adversarial examples
  • Real-world deployments
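Even a very small regression harness helps here. The sketch below runs a model callable over a list of (prompt, check) pairs and reports the pass rate, so the same cases can be rerun whenever the model or prompt changes. The cases and the stand-in model are invented for illustration.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], cases: List[Case]) -> float:
    """Fraction of cases whose check accepts the model's output."""
    if not cases:
        raise ValueError("at least one test case is required")
    passed = sum(1 for prompt, check in cases if check(model(prompt)))
    return passed / len(cases)


# Hypothetical stand-in for a real model call.
def toy_model(prompt: str) -> str:
    return prompt.strip().upper()


cases: List[Case] = [
    ("hello", lambda out: out == "HELLO"),
    ("  padded  ", lambda out: out == "PADDED"),
    ("edge case", lambda out: out == "edge case"),  # deliberately failing case
]

rate = pass_rate(toy_model, cases)
print(f"pass rate: {rate:.2f}")
```

Tracking this number across model versions gives an empirical signal where no theory predicts the behavior in advance.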

3. Stay Updated on Interpretability

Mechanistic interpretability tools are improving. Follow:

  • Anthropic's research (leading the field)
  • OpenAI's safety work
  • Academic conferences (NeurIPS, ICML)

4. Prepare for Paradigm Shifts

If a theoretical breakthrough happens, it could:

  • Make current architectures obsolete (as transformers did to RNNs)
  • Unlock new capabilities (as attention did for sequence modeling)
  • Change cost structures (making current systems cheaper or obsolete)

Conclusion

Mandy Lu's statement—"we still have no satisfying theory for why AI works"—is both alarming and accurate.

We've built systems that:

  • Pass medical exams
  • Write production code
  • Reason through novel problems
  • Generate photorealistic images

But we can't fully explain why they work.

This isn't just an academic curiosity. Without theory, we:

  • Can't predict failures
  • Can't guarantee safety
  • Can't optimize efficiency
  • Can't anticipate limits

We're flying blind on the most powerful technology of the 21st century.

The good news: progress is happening. Mechanistic interpretability, scaling law research, and geometric theories are advancing. Anthropic aims to "reliably detect most AI model problems by 2027."

The bad news: we're not there yet. And in the meantime, billions of dollars and critical decisions depend on systems we don't fully understand.

For developers, the lesson is clear: build with humility. AI is powerful, but unpredictable. Test rigorously. Keep humans in the loop. Stay updated on interpretability research.

And watch for the breakthrough—when it comes, it could change everything.


AI theory and interpretability research evolve rapidly. This analysis reflects the state of knowledge as of June 2026. For the latest research, follow Anthropic Research, OpenAI Research, and leading ML conferences.
