Shane Legg's PhD thesis formalizing universal intelligence and the AIXI agent
Machine Super Intelligence is Shane Legg’s 2008 PhD thesis. It gives a rigorous mathematical definition of intelligence and argues that the AIXI agent is maximally intelligent under that definition.
Defining Intelligence
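Legg proposes a formal definition:
$$\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}$$
where:
- $\pi$ is an agent (policy)
- $\mu$ is an environment, drawn from the set $E$ of computable environments
- $K(\mu)$ is the Kolmogorov complexity of $\mu$
- $V^{\pi}_{\mu}$ is the expected reward of $\pi$ in $\mu$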
Intelligence is average performance across all computable environments, weighted by simplicity.
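The weighted sum described above can be sketched on a tiny, hand-picked set of environments. This is only an illustration under strong assumptions: true Kolmogorov complexity is incomputable, so the `complexity_bits` values below are hand-assigned stand-ins, and the environments, agents, and function names are all hypothetical.

```python
from typing import Callable, List, Tuple

Agent = Callable[[List[int]], int]               # history of bits -> predicted next bit
Environment = Tuple[int, Callable[[int], int]]   # (complexity proxy in bits, step t -> bit)

# Hand-assigned complexity proxies (assumption): simpler sequences get fewer bits.
ENVIRONMENTS: List[Environment] = [
    (1, lambda t: 0),        # all zeros
    (2, lambda t: 1),        # all ones
    (3, lambda t: t % 2),    # alternating 0, 1, 0, 1, ...
]

def value(agent: Agent, generator: Callable[[int], int], steps: int = 20) -> float:
    """Fraction of bits the agent predicts correctly (a reward in [0, 1])."""
    history: List[int] = []
    correct = 0
    for t in range(steps):
        bit = generator(t)
        if agent(history) == bit:
            correct += 1
        history.append(bit)
    return correct / steps

def toy_intelligence(agent: Agent, environments: List[Environment], steps: int = 20) -> float:
    """Sum of 2^-complexity * value over the toy environment set."""
    return sum(2.0 ** -k * value(agent, gen, steps) for k, gen in environments)

def always_zero(history: List[int]) -> int:
    return 0

def copy_last(history: List[int]) -> int:
    # Predict that the next bit repeats the previous one (0 if there is no history).
    return history[-1] if history else 0

if __name__ == "__main__":
    print(toy_intelligence(always_zero, ENVIRONMENTS))
    print(toy_intelligence(copy_last, ENVIRONMENTS))
```

In this toy setting `copy_last` scores about 0.744 against 0.5625 for `always_zero`: it adapts to the all-ones environment, and its poor result on the alternating sequence costs little because that environment carries the smallest weight.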
The AIXI Agent
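AIXI is the theoretically optimal agent. In Hutter's standard formulation, its action at step $k$ with horizon $m$ is:
$$a_k = \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \left[ r_k + \cdots + r_m \right] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$
Here $a_i$, $o_i$, and $r_i$ are the action, observation, and reward at step $i$, $U$ is a universal Turing machine, and $\ell(q)$ is the length in bits of program $q$.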
At each step, AIXI:
- Considers all computable environments consistent with its history, weighting simpler ones more heavily
- Computes the expected future reward of each action under that weighted mixture
- Chooses the action that maximizes expected future reward
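The three steps above can be sketched with a small Bayesian mixture agent. This is not AIXI: it assumes a finite, hand-written hypothesis class of two-armed bandits, uses hand-assigned complexity proxies in place of Kolmogorov complexity, and looks only one step ahead. All names (`HYPOTHESES`, `posterior`, `choose_action`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Each hypothesis: (complexity proxy in bits, {action: probability that reward is 1}).
Hypothesis = Tuple[int, Dict[str, float]]
History = List[Tuple[str, int]]  # list of (action taken, reward observed)

HYPOTHESES: List[Hypothesis] = [
    (1, {"left": 0.9, "right": 0.1}),
    (2, {"left": 0.1, "right": 0.9}),
    (3, {"left": 0.5, "right": 0.5}),
]

def posterior(hypotheses: Sequence[Hypothesis], history: History) -> List[float]:
    """Step 1: weight each environment by 2^-complexity times its likelihood."""
    weights = []
    for k, probs in hypotheses:
        w = 2.0 ** -k
        for action, reward in history:
            p = probs[action]
            w *= p if reward == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def expected_reward(hypotheses: Sequence[Hypothesis], history: History, action: str) -> float:
    """Step 2: expected immediate reward of an action under the mixture."""
    post = posterior(hypotheses, history)
    return sum(w * probs[action] for w, (_, probs) in zip(post, hypotheses))

def choose_action(hypotheses: Sequence[Hypothesis], history: History,
                  actions: Sequence[str] = ("left", "right")) -> str:
    """Step 3: pick the action with the highest expected reward (horizon 1 only)."""
    return max(actions, key=lambda a: expected_reward(hypotheses, history, a))

if __name__ == "__main__":
    print(choose_action(HYPOTHESES, []))                          # prior favors "left"
    print(choose_action(HYPOTHESES, [("left", 0), ("left", 0)]))  # evidence shifts to "right"
```

With no history, the simplest hypothesis dominates and the agent pulls `left`. After two unrewarded pulls of `left`, the posterior shifts toward the second hypothesis and the agent switches to `right`.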
Solomonoff Induction
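The prediction component of AIXI uses Solomonoff’s universal prior:
$$M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}$$
The probability of observing $x$ is the sum over all programs $p$ that output $x$, weighted by their brevity. Here $U$ is a universal Turing machine, $\ell(p)$ is the length of $p$ in bits, and $x*$ denotes any output that begins with $x$.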
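A brute-force sketch can make the "sum over programs" concrete. The real prior is defined over a universal machine and cannot be computed, so the code below substitutes a deliberately tiny, non-universal toy machine (an assumption made purely for illustration). A program is `n` ones, a zero, then an `n`-bit body, and its output is the body repeated forever. That encoding is prefix-free and gives each program a length of `2n + 1` bits.

```python
from itertools import product

def toy_prior(x: str, max_body_len: int = 8) -> float:
    """Sum 2^-(program length) over toy programs whose output begins with x.

    Toy machine (hypothetical, NOT universal): the program '1'*n + '0' + body
    outputs `body` repeated forever. Its length is 2n + 1 bits.
    Bodies longer than max_body_len are ignored, so this is a truncated sum.
    """
    total = 0.0
    for n in range(1, max_body_len + 1):
        for bits in product("01", repeat=n):
            body = "".join(bits)
            repeats = len(x) // n + 1          # enough copies to cover len(x)
            if (body * repeats).startswith(x):
                total += 2.0 ** -(2 * n + 1)
    return total

def predict_next(x: str, bit: str) -> float:
    """Conditional probability of `bit` following x, as a ratio of toy priors."""
    return toy_prior(x + bit) / toy_prior(x)

if __name__ == "__main__":
    print(toy_prior("0101"), toy_prior("0110"))
    print(predict_next("0101", "0"))
```

On this toy machine `0101` receives more prior mass than `0110`, because it is produced by the short body `01`, and the conditional probability that `0` follows `0101` comes out close to 1. The same ratio-of-priors construction is how the universal prior is used for prediction.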
Key Results
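Theorem (Optimality): AIXI is the most intelligent agent:
$$\Upsilon(\pi^{\text{AIXI}}) \geq \Upsilon(\pi) \quad \text{for every agent } \pi$$
where $\Upsilon(\pi)$ denotes the universal intelligence of agent $\pi$. No other agent achieves a higher simplicity-weighted expected reward across all computable environments.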
Theorem (Incomputability): AIXI cannot be computed.
The universal prior sums over every program that produces the observed history, and determining which programs do so requires solving the halting problem. Real systems must approximate.
The Compression-Intelligence Connection
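A key insight: compression and prediction are equivalent. A model that assigns probability $P(x)$ to a sequence $x$ can, via arithmetic coding, encode $x$ in about $-\log_2 P(x)$ bits, so a higher predicted probability means a shorter code.
A good predictor is a good compressor, and vice versa. This connects AIXI to practical language models, which are trained to minimize this kind of log loss.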
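The link between the two can be checked numerically. The sketch below computes the ideal code length, $-\sum_t \log_2 P(x_t \mid x_{<t})$, that an arithmetic coder would approach when driven by a simple adaptive predictor. The predictor is a Laplace (add-one) frequency estimator, chosen here only as an easy stand-in. It is an assumption of this example and not a method from the thesis.

```python
import math

def code_length_bits(bits: str) -> float:
    """Ideal code length of `bits` under an adaptive Laplace (add-one) predictor.

    At each step the predictor estimates P(next bit = b) from the counts seen
    so far, and the bit costs -log2 of that probability.
    """
    counts = {"0": 0, "1": 0}
    total_bits = 0.0
    for b in bits:
        p = (counts[b] + 1) / (counts["0"] + counts["1"] + 2)
        total_bits += -math.log2(p)
        counts[b] += 1
    return total_bits

if __name__ == "__main__":
    print(code_length_bits("0" * 31))    # predictable under this model
    print(code_length_bits("01" * 16))   # looks like fair coin flips to this model
```

For 31 zeros the predictor's probabilities multiply to 1/32, so the sequence costs about 5 bits instead of 31. The alternating 32-bit sequence costs slightly more than 32 bits, because a frequency-counting model cannot see the pattern. A predictor that conditioned on the previous bit would compress it well, which is the sense in which better prediction is better compression.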
Practical Approximations
Real AI systems approximate AIXI through:
- Bounded computation: Limited search depth
- Finite environments: Specific domain knowledge
- Learned priors: Neural networks instead of Solomonoff
Modern LLMs can be viewed, loosely, as crude approximations of AIXI's prediction component: learned sequence predictors trained on text.
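The first item in the list, bounded computation, can be illustrated with a depth-limited expectimax search. This is a minimal sketch under stated assumptions: the environment model is known and tiny (AIXI would instead average over all computable environments), and `toy_model`, `state_value`, and `best_action` are hypothetical names invented for this example.

```python
from typing import List, Tuple

ACTIONS = ("safe", "invest")

def toy_model(state: int, action: str) -> List[Tuple[float, int, float]]:
    """Return (probability, next_state, reward) outcomes; deterministic here."""
    if state == 0:
        # "safe" pays 1 now; "invest" pays nothing now but moves to state 1.
        return [(1.0, 0, 1.0)] if action == "safe" else [(1.0, 1, 0.0)]
    # In state 1, "safe" cashes out for 3 and returns to state 0.
    return [(1.0, 0, 3.0)] if action == "safe" else [(1.0, 1, 0.0)]

def action_value(state: int, action: str, depth: int) -> float:
    """Expected reward of `action`, then acting optimally for depth - 1 more steps."""
    return sum(p * (r + state_value(nxt, depth - 1))
               for p, nxt, r in toy_model(state, action))

def state_value(state: int, depth: int) -> float:
    """Best achievable expected reward within `depth` steps (0 beyond the horizon)."""
    if depth == 0:
        return 0.0
    return max(action_value(state, a, depth) for a in ACTIONS)

def best_action(state: int, depth: int) -> str:
    """Greedy choice with respect to the depth-limited values; requires depth >= 1."""
    return max(ACTIONS, key=lambda a: action_value(state, a, depth))

if __name__ == "__main__":
    print(best_action(0, depth=1))
    print(best_action(0, depth=2))
```

With a search depth of 1 the agent picks `safe`, since investing shows no payoff within its horizon. With depth 2 it picks `invest`, because the delayed reward of 3 becomes visible. Truncating the horizon trades optimality for tractability in exactly this way.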
Why Ilya Included This
This thesis provides:
- Theoretical grounding: Rigorous definition of intelligence
- Ultimate benchmark: AIXI as the theoretical ceiling
- Design principles: Compression, prediction, and universality
Understanding the theoretical optimum illuminates what practical systems are approximating.
Implications
- Intelligence can be formalized mathematically
- Optimal intelligence requires universal prediction
- Real AI must make tractability/optimality tradeoffs
- Scaling may move practical systems toward AIXI-like behavior, though this is a conjecture, not a result proved in the thesis
Key Resource
- Machine Super Intelligence — Shane Legg (PhD Thesis, 2008)
https://www.vetta.org/documents/Machine_Super_Intelligence.pdf