Understanding VAEs as compression systems with a rate-distortion trade-off
Variational Lossy Autoencoder (VLAE) is a way of looking at VAEs through the lens of compression. Instead of asking only “can the model reconstruct the input?”, it asks “how many bits should the latent code use, and what reconstruction quality do we get in return?”
If you are new to the topic, read the VAE page first. This page is best understood as a second pass that explains what the VAE objective means.
A Compression Analogy
Think about compressing an image before sending it over a network.
- If you keep many details, the file is large but reconstruction is accurate
- If you compress aggressively, the file is small but details are lost
VLAE says a VAE is solving that same trade-off in learned form.
VAEs as Compressors
A VAE has two main parts:
- Encoder: turns the input into a latent code
- Decoder: reconstructs the input from that code
This suggests two competing goals:
- Rate: how many bits are needed to describe the latent
- Distortion: how much reconstruction error we allow
The Rate-Distortion Objective
The VAE objective (the ELBO, to be maximized) can be written as:

E_{q(z|x)}[ log p(x|z) ] - KL( q(z|x) || p(z) )
The first term rewards accurate reconstruction. The second term discourages the model from storing too much information in the latent code.
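To make the two terms concrete, here is a minimal sketch of the loss for a single example, using pure-Python math, a standard-normal prior with a diagonal-Gaussian posterior (so the KL is analytic), and a Bernoulli decoder. All the numbers are made-up toy values, not a trained model.

```python
import math

def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, 1) ), summed over dimensions, in nats."""
    return sum(0.5 * (math.exp(lv) + m * m - 1.0 - lv)
               for m, lv in zip(mu, logvar))

def bernoulli_nll(x, x_hat, eps=1e-7):
    """Reconstruction distortion: negative Bernoulli log-likelihood, in nats."""
    return -sum(xi * math.log(max(p, eps)) + (1 - xi) * math.log(max(1 - p, eps))
                for xi, p in zip(x, x_hat))

# Toy numbers standing in for one encoded example (hypothetical values).
mu, logvar = [0.5, -0.2], [0.0, -1.0]
x, x_hat = [1.0, 0.0, 1.0], [0.9, 0.2, 0.8]

rate = gaussian_kl(mu, logvar)        # information spent on the latent
distortion = bernoulli_nll(x, x_hat)  # reconstruction error
neg_elbo = distortion + rate          # minimizing this = maximizing the ELBO
```

Written as a loss, the objective is literally distortion + rate, which is why the compression reading is more than an analogy.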
Why This View Is Useful
Once you think of a VAE as a compressor, several behaviors make more sense:
- a stronger decoder can sometimes ignore the latent
- changing β (as in β-VAE) changes how much information the latent is allowed to carry
- “posterior collapse” becomes easier to interpret
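The β knob from the list above can be sketched in one line: β scales the rate term and so sets the latent's information budget. The numbers below are hypothetical, just to show how the budget shifts.

```python
def beta_vae_loss(distortion, rate, beta):
    """beta-VAE objective: beta scales the rate (KL) term.
    beta > 1 makes latent bits expensive; beta < 1 makes them cheap."""
    return distortion + beta * rate

# Same reconstruction/KL numbers, different budgets (made-up values):
d, r = 55.0, 12.0
tight = beta_vae_loss(d, r, beta=4.0)  # rate is costly -> model will shrink it
loose = beta_vae_loss(d, r, beta=0.5)  # rate is cheap -> latent can carry more
```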
Bits-Back Coding
The famous bits-back argument says that the naive cost of encoding the latent under the prior overstates its true cost: the random bits used to sample the latent from the posterior can be recovered by the receiver “for free,” so the net cost of the latent is exactly the KL term.
That is the technical idea behind why VAEs connect so naturally to information theory and compression.
For a first pass, the most important takeaway is simpler: the latent cost is not just a regularizer; it is part of a coding trade-off.
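The bits-back accounting can be checked numerically with discrete toy distributions (the probabilities below are made up): coding the latent under the prior costs the cross-entropy, the receiver recovers the posterior's entropy, and the difference is exactly the KL.

```python
import math

# Toy posterior q(z|x) and prior p(z) over 3 latent symbols (made-up numbers).
q = [0.7, 0.2, 0.1]
p = [1 / 3, 1 / 3, 1 / 3]

naive_cost = sum(qi * -math.log2(pi) for qi, pi in zip(q, p))  # code z under p
refund = sum(qi * -math.log2(qi) for qi in q)                  # bits gotten back
kl = sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p))

# Bits-back: net latent cost = naive cost - refund = KL(q || p).
assert abs((naive_cost - refund) - kl) < 1e-12
```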
Posterior Collapse
If the decoder becomes so strong that it can reconstruct well without using the latent, the model may push q(z|x) toward the prior p(z), driving the KL term to zero.
In that case, the latent carries very little information. This is called posterior collapse.
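In practice, collapse is often diagnosed per latent dimension: a dimension whose posterior matches the prior has near-zero KL and is effectively unused. A minimal sketch, again assuming a diagonal-Gaussian posterior and a standard-normal prior, with made-up numbers:

```python
import math

def kl_per_dim(mu, logvar):
    """Per-dimension KL( N(mu_i, exp(logvar_i)) || N(0, 1) ), in nats."""
    return [0.5 * (math.exp(lv) + m * m - 1.0 - lv) for m, lv in zip(mu, logvar)]

def collapsed_dims(mu, logvar, threshold=0.01):
    """Indices of latent dimensions with near-zero KL: likely collapsed.
    The threshold is an arbitrary illustrative choice."""
    return [i for i, kl in enumerate(kl_per_dim(mu, logvar)) if kl < threshold]

# Dim 0 is active; dims 1 and 2 match the prior almost exactly (collapsed).
mu, logvar = [1.2, 0.001, 0.0], [-0.5, 0.0, 0.0]
print(collapsed_dims(mu, logvar))  # → [1, 2]
```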
Common Fixes
| Method | Goal |
|---|---|
| KL annealing | Let reconstruction stabilize before strongly penalizing rate |
| Free bits | Reserve a minimum amount of information in the latent |
| β-VAE | Directly control the rate-distortion trade-off |
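Of the fixes in the table, free bits is the easiest to sketch: each latent dimension's KL is floored at a fixed budget, so the optimizer gains nothing by pushing a dimension's KL all the way to zero. The floor value below is a common but arbitrary choice.

```python
def free_bits_kl(kls, free_bits=0.5):
    """Free-bits rate term: each dimension's KL contributes at least
    `free_bits` nats, removing the incentive to collapse it to zero."""
    return sum(max(kl, free_bits) for kl in kls)

# One active dimension and two (nearly) collapsed ones (toy per-dim KLs).
kls = [0.8, 0.02, 0.0]
loss_term = free_bits_kl(kls)  # 0.8 + 0.5 + 0.5 = 1.8
```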
What To Remember
- VLAE interprets VAEs as lossy compression systems
- The KL term is about information budget, not just regularization
- Posterior collapse means the model has stopped using the latent effectively
Key Paper
- Variational Lossy Autoencoder - Chen et al. (2016)
https://arxiv.org/abs/1611.02731