Probabilistic generative model with structured latent space
A Variational Autoencoder (VAE) is a generative model that learns a distribution over latent variables, enabling smooth interpolation, sampling of novel data, and a principled treatment of uncertainty.
Core Idea
Instead of encoding an input x to a single latent vector, a VAE's encoder outputs the parameters of an approximate posterior distribution over the latent variable z:

q(z|x) = N(μ(x), σ²(x) I)
Sampling from this distribution allows the model to generate diverse yet coherent outputs.
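As a minimal sketch (assuming PyTorch; the layer sizes, the 784-dimensional flattened input, and the class name are illustrative), an encoder that outputs μ and log σ² might look like:

```python
import torch.nn as nn

class GaussianEncoder(nn.Module):
    """Maps an input x to the parameters (mean, log-variance) of q(z|x)."""

    def __init__(self, input_dim=784, hidden_dim=400, latent_dim=20):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)

    def forward(self, x):
        h = self.backbone(x)
        return self.fc_mu(h), self.fc_logvar(h)
```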
Evidence Lower Bound (ELBO)
VAEs are trained by maximizing the ELBO, a lower bound on the log-likelihood log p(x):

ELBO(x) = E_{q(z|x)}[log p(x|z)] − KL(q(z|x) || p(z))
- Reconstruction term E_{q(z|x)}[log p(x|z)]: encourages the decoder to reconstruct x accurately from latents sampled from the posterior
- KL divergence KL(q(z|x) || p(z)): regularizes the posterior toward the prior p(z) = N(0, I), shaping a smooth latent space
This tradeoff balances reconstruction fidelity against a latent space regular enough to sample from and interpolate in.
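In practice the negative ELBO is minimized as a training loss. A minimal PyTorch sketch, assuming a Bernoulli decoder over pixel values (so the reconstruction term becomes binary cross-entropy) and the closed-form KL between a diagonal Gaussian posterior and the standard normal prior:

```python
import torch
import torch.nn.functional as F

def negative_elbo(x, x_recon, mu, logvar):
    """Negative ELBO = reconstruction loss + KL(q(z|x) || N(0, I))."""
    # Reconstruction term: binary cross-entropy (a Bernoulli log-likelihood),
    # summed over pixels and batch.
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # KL term in closed form for a diagonal Gaussian posterior vs. N(0, I).
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```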
Reparameterization Trick
Directly sampling z from q(z|x) is a non-differentiable operation, so gradients cannot flow back into the encoder. The solution is to rewrite sampling as:

z = μ + σ · ε,  with ε ~ N(0, I)
Randomness is isolated in ε, allowing gradients to flow through μ and σ.
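A minimal sketch in PyTorch, assuming the encoder outputs log σ² as in the encoder sketch above:

```python
import torch

def reparameterize(mu, logvar):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), keeping the graph differentiable."""
    std = torch.exp(0.5 * logvar)   # sigma = exp(log(sigma^2) / 2)
    eps = torch.randn_like(std)     # randomness lives only in eps
    return mu + eps * std           # gradients flow through mu and std
```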
VAE vs. Autoencoder
| Autoencoder | Variational Autoencoder |
|---|---|
| Deterministic latent | Probabilistic latent |
| No prior on z | Explicit prior p(z) |
| No principled sampling of new data | Generate by sampling z ~ p(z) and decoding |
| Optimizes reconstruction | Optimizes ELBO |
Extensions
- β-VAE – stronger disentanglement via an increased weight on the KL term (see the sketch after this list)
- VQ-VAE – discrete latent codes via vector quantization
- Conditional VAE – generation conditioned on labels
- Hierarchical VAE – multi-level latent variables
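To illustrate the β-VAE variant referenced above: relative to the ELBO loss sketched earlier, the only change is a scalar weight β on the KL term (β = 1 recovers the plain VAE; the default value below is just an example).

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """beta-VAE objective: reconstruction + beta * KL (beta = 1 is the plain VAE)."""
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```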
Key Papers
- Auto-Encoding Variational Bayes – Kingma & Welling (2013)
- β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework – Higgins et al. (2017)
- Neural Discrete Representation Learning – van den Oord et al. (2017)
VAEs remain foundational to many modern generative models, for example providing the learned latent spaces and priors used by latent diffusion models.