Kolmogorov Complexity and Algorithmic Randomness

The mathematical foundation for measuring information content and randomness

Kolmogorov Complexity and Algorithmic Randomness (by Shen, Uspensky, and Vereshchagin) is the definitive textbook on algorithmic information theory—the study of complexity, randomness, and information from a computational perspective.

Core Definition

The Kolmogorov complexity K(x) of a string x is the length of the shortest program that outputs x:

K(x) = \min \{ |p| : U(p) = x \}

where U is a universal Turing machine and |p| is the program length in bits.
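
To make the "minimum over programs" idea concrete, here is a small Python sketch (our own toy illustration, not the book's construction): descriptions come from a tiny hand-picked language of literal strings and single-character repeats rather than a universal machine, so the minimum it computes is only an upper bound on the true K(x); the cost model and the helper names (run, cost, toy_complexity) are assumptions.

# A toy "description language": a description is either
#   ("literal", s)       spell the string out, cost about len(s) bytes
#   ("repeat", c, n)     "print character c repeated n times", cost a few bytes
# This is not a universal machine, so the minimum below only upper-bounds K(x).

import math

def run(description):
    # Decode a description back into the string it denotes.
    kind = description[0]
    if kind == "literal":
        return description[1]
    if kind == "repeat":
        _, char, count = description
        return char * count
    raise ValueError(kind)

def cost(description):
    # Length of the description in bytes under a crude, assumed cost model.
    if description[0] == "literal":
        return 1 + len(description[1])                       # tag byte + raw characters
    count = description[2]
    return 2 + max(1, math.ceil(math.log2(count + 1) / 8))   # tag + char + encoded count

def toy_complexity(x):
    # Minimum cost over all descriptions in the toy language that decode to x.
    candidates = [("literal", x)]
    if x and len(set(x)) == 1:
        candidates.append(("repeat", x[0], len(x)))
    best = min(candidates, key=cost)
    assert run(best) == x
    return cost(best), best

print(toy_complexity("0" * 16))       # the repeat description wins: cost 3
print(toy_complexity("0110100110"))   # no repeat structure: falls back to literal, cost 11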

Key Properties

Invariance Theorem: The choice of universal machine affects K(x) only by a constant:

|K_U(x) - K_V(x)| \leq c_{UV}

Upper bound: Every string has complexity at most its length:

K(x) \leq |x| + O(1)

Incompressibility: Most strings are incompressible. There are fewer than 2^{n-c} programs shorter than n - c bits, and each describes at most one string, so out of the 2^n strings of length n:

|\{x : |x| = n,\ K(x) < n - c\}| < 2^{n-c}

Example

The string "0000000000000000" (sixteen zeros) can be produced by the short program print("0" * 16), so its complexity is low: roughly O(log n), since a program only needs to encode the length n. A typical string of the same length admits no such shortcut; most strings have K(x) ≈ |x|, and they are random in the sense that there is no pattern to exploit. Note also that K(x) is uncomputable: no algorithm can find the shortest program for all x.
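
In practice K(x) can only be bounded from above, for instance with an off-the-shelf compressor. The following is a minimal sketch, assuming Python's zlib as a stand-in compressor (our choice, not the book's): the compressed length serves as an upper-bound proxy for K(x), so a regular string shrinks dramatically while random bytes do not shrink at all.

import os
import zlib

def compressed_size(data):
    # Length of the zlib-compressed data: a crude upper-bound proxy for K(x).
    return len(zlib.compress(data, 9))

regular = b"0" * 4096            # highly patterned input
random_bytes = os.urandom(4096)  # almost certainly incompressible input

for name, data in [("regular", regular), ("random", random_bytes)]:
    print(name, "length:", len(data), "compressed:", compressed_size(data))

# Expected behaviour: the regular string compresses to a few dozen bytes, while
# the random bytes come out slightly larger than 4096 bytes (compression
# overhead), consistent with most strings being incompressible.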

Algorithmic Randomness

A string is algorithmically random if it’s incompressible:

K(x) \geq |x| - O(1)

Random strings have no exploitable patterns—they can’t be compressed because any shorter description would be the pattern itself.

Conditional Complexity

Given extra information y:

K(x|y) = \min \{ |p| : U(p, y) = x \}

The mutual information between x and y:

I(x:y) = K(x) - K(x|y)
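
These quantities are uncomputable, but they can be roughly approximated with a real compressor, in the spirit of compression-based similarity measures such as the normalized compression distance (an aside of ours, not something this summary develops). A minimal sketch, assuming zlib as the compressor and C(y + x) - C(y) as a stand-in for K(x|y):

import zlib

def C(data):
    # Compressed length: an upper-bound proxy for Kolmogorov complexity.
    return len(zlib.compress(data, 9))

def cond_C(x, y):
    # Approximate K(x|y) by C(y + x) - C(y): the extra compressed bytes
    # that x costs once y is already available to the compressor.
    return C(y + x) - C(y)

def mutual_info(x, y):
    # Approximate I(x:y) = K(x) - K(x|y) using the proxies above.
    return C(x) - cond_C(x, y)

x = b"the quick brown fox jumps over the lazy dog " * 20
y = x                        # y carries exactly the same content as x
z = bytes(range(256)) * 4    # unrelated data of similar size

print("I(x:y) proxy, related pair:  ", mutual_info(x, y))
print("I(x:z) proxy, unrelated pair:", mutual_info(x, z))
# The related pair should show far more shared information than the unrelated
# one; the numbers are rough, since zlib is only a stand-in for true complexity.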

Uncomputability

Fundamental result: K(x) is uncomputable.

No algorithm can compute the shortest program for every string: if K were computable, a short program could search for the first string with complexity above n and print it, thereby describing that string in only O(log n) bits, a contradiction in the style of the Berry paradox. This connects to Gödel’s incompleteness theorems and the halting problem.

Applications to AI

Kolmogorov complexity provides foundations for:

  • Minimum Description Length (MDL): Model selection by compression (see the sketch after this list)
  • Solomonoff Induction: Universal prior for prediction
  • AIXI: Theoretical optimal agent framework
  • Occam’s Razor: Formalized as preferring low-K hypotheses
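
To make the MDL bullet concrete, here is a toy sketch (our own illustration, not the book's formulation): it picks a polynomial degree by minimizing a two-part code length, with an assumed fixed 32 bits per coefficient and residuals coded under a Gaussian model quantized to a fixed precision.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)  # true degree is 2

BITS_PER_COEFF = 32.0   # assumed cost of transmitting one model parameter
PRECISION = 1e-4        # assumed quantization of residuals, keeps code lengths positive

def description_length(degree):
    # Two-part code: bits for the model plus bits for the data given the model.
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    sigma = max(residuals.std(), PRECISION)
    model_bits = BITS_PER_COEFF * (degree + 1)
    data_bits = x.size * (0.5 * np.log2(2 * np.pi * np.e * sigma**2) - np.log2(PRECISION))
    return model_bits + data_bits

for degree in range(6):
    print("degree", degree, "-> description length ~", round(description_length(degree)), "bits")

# The minimum typically lands at the true degree: higher degrees shave a few data
# bits by fitting noise but pay more model bits, which is Occam's razor via MDL.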

The Chain Rule

K(x, y) = K(x) + K(y|x) + O(\log K(x, y))

The joint complexity of a pair equals the complexity of the first string plus the complexity of the second given the first, up to a logarithmic term needed to delimit the two descriptions.

Sophistication vs Complexity

Kolmogorov complexity: Total information in a string
Sophistication: Structure/pattern complexity (separating regularity from randomness)

A random string has high complexity but low sophistication—it’s complex only because it’s random, not because it has intricate structure.

Why Ilya Included This

This book provides the theoretical foundation for understanding:

  • Why simpler models generalize better (MDL principle)
  • The nature of learning and compression
  • Information-theoretic limits on prediction

These concepts underpin much of modern machine learning theory.
