2 pages
The algorithm that enables neural networks to learn by computing gradients efficiently
Normalizing each example across its features
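
As a rough illustration of the second description, here is a minimal sketch of normalizing each example across its own features, the computation usually called layer normalization. The function name `normalize_per_example` and the stabilizer `eps=1e-5` are assumptions made for this sketch and do not come from the listed page. The learnable scale and shift that usually follow this step are left out.

```python
import math

def normalize_per_example(batch, eps=1e-5):
    """Normalize each example (row) to roughly zero mean and unit variance.

    Statistics are computed across the features of a single example,
    not across the batch. `eps` is an assumed small constant that
    avoids division by zero when all features are equal.
    """
    normalized = []
    for example in batch:
        n = len(example)
        mean = sum(example) / n
        # Population (biased) variance over this example's features.
        var = sum((x - mean) ** 2 for x in example) / n
        scale = math.sqrt(var + eps)
        normalized.append([(x - mean) / scale for x in example])
    return normalized

# Two examples whose features differ only by a factor of 10.
batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
out = normalize_per_example(batch)
```

Because each row is normalized with its own mean and variance, the two rows of `out` come out nearly identical even though the inputs differ in scale by a factor of ten.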