Kolmogorov Complexity and Algorithmic Randomness

Measuring how short the best description of an object can be

Kolmogorov complexity asks a surprisingly intuitive question: how short is the shortest program that can describe a given object? If a string has a short description, it is structured. If it has no shorter description than writing it out directly, it is close to random.

This page is theory-heavy. If you mainly want the ML connection, jump to Applications to AI and then read MDL Tutorial.

A Simple Example

Compare these two strings:

  • 0101010101010101
  • 0110100011101010

The first one has an obvious pattern: “print 01 eight times.” The second might have no noticeably shorter description than listing the bits exactly.

That is the core intuition behind Kolmogorov complexity.

Core Definition

The Kolmogorov complexity K(x)K(x) of a string xx is the length of the shortest program that outputs xx:

K(x)=min{p:U(p)=x}K(x) = \min \{|p| : U(p) = x\}

where UU is a universal Turing machine and p|p| is the program length in bits.

In words: complexity = shortest exact description.

What This Means Intuitively

  • Repetitive objects have low complexity
  • Highly structured objects can have low complexity even if they are long
  • Truly patternless objects have high complexity

This is different from difficulty or usefulness. A random string can have high complexity even though it has no interesting structure.

Key Properties

Most strings are incompressible

Most length-nn strings do not have short descriptions:

{x:x=n,K(x)<nc}<2nc|\{x : |x| = n, K(x) < n - c\}| < 2^{n-c}

That formalizes the idea that randomness is common and compression is special.

Complexity is machine-independent up to a constant

If you switch from one universal machine to another, the exact number changes only by a fixed constant:

KU(x)KV(x)cUV|K_U(x) - K_V(x)| \leq c_{UV}

So the concept is robust even though the exact encoding details vary.

Interactive Demo

Explore how different strings have different Kolmogorov complexities:

Kolmogorov Complexity

String:
0000000000000000
Shortest Program
print("0" * 16)
Complexity
low
K(x) ≈ O(log n)
Definition
K(x) = min { |p| : U(p) = x }
The Kolmogorov complexity of a string x is the length of the shortest program p that outputs x on a universal Turing machine U.
Incompressible Strings
Most strings have K(x) ≈ |x|. They're random—no pattern to exploit.
Uncomputability
K(x) is uncomputable! No algorithm can find the shortest program for all x.

Algorithmic Randomness

A string is algorithmically random if it cannot be compressed much:

K(x)xO(1)K(x) \geq |x| - O(1)

That does not mean “generated by a random process” in the everyday sense. It means “has no much shorter exact description.”

Why We Cannot Compute It Exactly

One of the deepest results is that Kolmogorov complexity is uncomputable.

There is no general algorithm that takes any string and always returns the true shortest program for it. This is closely related to the halting problem.

Conditional Complexity

Sometimes you already know another object yy. Then the relevant question is how many extra bits are needed to describe xx given yy:

K(xy)=min{p:U(p,y)=x}K(x|y) = \min \{|p| : U(p, y) = x\}

This leads to an information-style view of shared structure.

Applications to AI

Kolmogorov complexity motivates several important ideas:

  • MDL Tutorial: prefer models that compress the data well
  • MDL Weights: connect description length to regularization
  • Machine Super Intelligence: use algorithmic ideas in idealized intelligence
  • Occam’s razor: simpler explanations are preferred because they need shorter descriptions

Common Confusion

  • High complexity does not mean meaningful structure
  • Randomness can increase complexity
  • The quantity is theoretically powerful even though it is not exactly computable

The Chain Rule

One useful identity is:

K(x,y)=K(x)+K(yx)+O(logK(x,y))K(x, y) = K(x) + K(y|x) + O(\log K(x, y))

This says the complexity of a pair is roughly:

  • describe the first object
  • then describe the second using the first as context

What To Remember

  • Kolmogorov complexity measures the shortest exact description
  • Compression and randomness are two sides of the same idea
  • The concept is foundational for MDL, algorithmic information theory, and theoretical AI

Key Resource

Found an error or want to contribute? Edit on GitHub