Kolmogorov Complexity and Algorithmic Randomness

The mathematical foundation for measuring information content and randomness

Kolmogorov Complexity and Algorithmic Randomness (by Shen, Uspensky, and Vereshchagin) is the definitive textbook on algorithmic information theory—the study of complexity, randomness, and information from a computational perspective.

Core Definition

The Kolmogorov complexity K(x) of a string x is the length of the shortest program that outputs x:

K(x) = \min \{ |p| : U(p) = x \}

where U is a universal Turing machine and |p| is the program length in bits.
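
To make the "minimum over programs" idea concrete, here is a small Python sketch (our own toy illustration, not the book's construction): descriptions come from a tiny hand-picked language of literal strings and single-character repeats rather than a universal machine, so the minimum it computes is only an upper bound on the true K(x); the cost model and the helper names (run, cost, toy_complexity) are assumptions.

# A toy "description language": a description is either
#   ("literal", s)       spell the string out, cost about len(s) bytes
#   ("repeat", c, n)     "print character c repeated n times", cost a few bytes
# This is not a universal machine, so the minimum below only upper-bounds K(x).

import math

def run(description):
    # Decode a description back into the string it denotes.
    kind = description[0]
    if kind == "literal":
        return description[1]
    if kind == "repeat":
        _, char, count = description
        return char * count
    raise ValueError(kind)

def cost(description):
    # Length of the description in bytes under a crude, assumed cost model.
    if description[0] == "literal":
        return 1 + len(description[1])                       # tag byte + raw characters
    count = description[2]
    return 2 + max(1, math.ceil(math.log2(count + 1) / 8))   # tag + char + encoded count

def toy_complexity(x):
    # Minimum cost over all descriptions in the toy language that decode to x.
    candidates = [("literal", x)]
    if x and len(set(x)) == 1:
        candidates.append(("repeat", x[0], len(x)))
    best = min(candidates, key=cost)
    assert run(best) == x
    return cost(best), best

print(toy_complexity("0" * 16))       # the repeat description wins: cost 3
print(toy_complexity("0110100110"))   # no repeat structure: falls back to literal, cost 11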

Key Properties

Invariance Theorem: The choice of universal machine affects K(x) only by a constant:

|K_U(x) - K_V(x)| \leq c_{UV}

Upper bound: Every string has complexity at most its length:

K(x) \leq |x| + O(1)

Incompressibility: Most strings are incompressible. There are fewer than 2^{n-c} programs shorter than n - c bits, and each describes at most one string, so out of the 2^n strings of length n:

|\{x : |x| = n,\ K(x) < n - c\}| < 2^{n-c}

Example

The string "0000000000000000" (sixteen zeros) can be produced by the short program print("0" * 16), so its complexity is low: roughly O(log n), since a program only needs to encode the length n. A typical string of the same length admits no such shortcut; most strings have K(x) ≈ |x|, and they are random in the sense that there is no pattern to exploit. Note also that K(x) is uncomputable: no algorithm can find the shortest program for all x.
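
In practice K(x) can only be bounded from above, for instance with an off-the-shelf compressor. The following is a minimal sketch, assuming Python's zlib as a stand-in compressor (our choice, not the book's): the compressed length serves as an upper-bound proxy for K(x), so a regular string shrinks dramatically while random bytes do not shrink at all.

import os
import zlib

def compressed_size(data):
    # Length of the zlib-compressed data: a crude upper-bound proxy for K(x).
    return len(zlib.compress(data, 9))

regular = b"0" * 4096            # highly patterned input
random_bytes = os.urandom(4096)  # almost certainly incompressible input

for name, data in [("regular", regular), ("random", random_bytes)]:
    print(name, "length:", len(data), "compressed:", compressed_size(data))

# Expected behaviour: the regular string compresses to a few dozen bytes, while
# the random bytes come out slightly larger than 4096 bytes (compression
# overhead), consistent with most strings being incompressible.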

Algorithmic Randomness

A string is algorithmically random if it’s incompressible:

K(x) \geq |x| - O(1)

Random strings have no exploitable patterns—they can’t be compressed because any shorter description would be the pattern itself.

Conditional Complexity

Given extra information y:

K(x|y) = \min \{ |p| : U(p, y) = x \}

The mutual information between x and y:

I(x:y) = K(x) - K(x|y)
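
These quantities are uncomputable, but they can be roughly approximated with a real compressor, in the spirit of compression-based similarity measures such as the normalized compression distance (an aside of ours, not something this summary develops). A minimal sketch, assuming zlib as the compressor and C(y + x) - C(y) as a stand-in for K(x|y):

import zlib

def C(data):
    # Compressed length: an upper-bound proxy for Kolmogorov complexity.
    return len(zlib.compress(data, 9))

def cond_C(x, y):
    # Approximate K(x|y) by C(y + x) - C(y): the extra compressed bytes
    # that x costs once y is already available to the compressor.
    return C(y + x) - C(y)

def mutual_info(x, y):
    # Approximate I(x:y) = K(x) - K(x|y) using the proxies above.
    return C(x) - cond_C(x, y)

x = b"the quick brown fox jumps over the lazy dog " * 20
y = x                        # y carries exactly the same content as x
z = bytes(range(256)) * 4    # unrelated data of similar size

print("I(x:y) proxy, related pair:  ", mutual_info(x, y))
print("I(x:z) proxy, unrelated pair:", mutual_info(x, z))
# The related pair should show far more shared information than the unrelated
# one; the numbers are rough, since zlib is only a stand-in for true complexity.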

Uncomputability

Fundamental result: K(x) is uncomputable.

No algorithm can compute the shortest program for every string: if K were computable, a short program could search for the first string with complexity above n and print it, thereby describing that string in only O(log n) bits, a contradiction in the style of the Berry paradox. This connects to Gödel’s incompleteness theorems and the halting problem.

Applications to AI

Kolmogorov complexity provides foundations for:

  • Minimum Description Length (MDL): Model selection by compression (see the sketch after this list)
  • Solomonoff Induction: Universal prior for prediction
  • AIXI: Theoretical optimal agent framework
  • Occam’s Razor: Formalized as preferring low-K hypotheses
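
To make the MDL bullet concrete, here is a toy sketch (our own illustration, not the book's formulation): it picks a polynomial degree by minimizing a two-part code length, with an assumed fixed 32 bits per coefficient and residuals coded under a Gaussian model quantized to a fixed precision.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)  # true degree is 2

BITS_PER_COEFF = 32.0   # assumed cost of transmitting one model parameter
PRECISION = 1e-4        # assumed quantization of residuals, keeps code lengths positive

def description_length(degree):
    # Two-part code: bits for the model plus bits for the data given the model.
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    sigma = max(residuals.std(), PRECISION)
    model_bits = BITS_PER_COEFF * (degree + 1)
    data_bits = x.size * (0.5 * np.log2(2 * np.pi * np.e * sigma**2) - np.log2(PRECISION))
    return model_bits + data_bits

for degree in range(6):
    print("degree", degree, "-> description length ~", round(description_length(degree)), "bits")

# The minimum typically lands at the true degree: higher degrees shave a few data
# bits by fitting noise but pay more model bits, which is Occam's razor via MDL.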

The Chain Rule

K(x, y) = K(x) + K(y|x) + O(\log K(x, y))

The joint complexity of a pair equals the complexity of the first string plus the complexity of the second given the first, up to a logarithmic term needed to delimit the two descriptions.

Sophistication vs Complexity

Kolmogorov complexity: Total information in a string
Sophistication: Structure/pattern complexity (separating regularity from randomness)

A random string has high complexity but low sophistication—it’s complex only because it’s random, not because it has intricate structure.

Why Ilya Included This

This book provides the theoretical foundation for understanding:

  • Why simpler models generalize better (MDL principle)
  • The nature of learning and compression
  • Information-theoretic limits on prediction

These concepts underpin much of modern machine learning theory.
