Adam Optimizer
Adaptive learning rates with momentum for deep learning
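A minimal sketch of one Adam update in plain Python (no framework; the hyperparameter defaults follow the original paper). It keeps moving averages of the gradient (`m`, momentum) and squared gradient (`v`, per-parameter scale), bias-corrects both, and steps each parameter by an adaptively scaled amount:

```python
def adam_step(params, grads, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update over parallel lists of scalars.

    m: first-moment (momentum) estimates; v: second-moment estimates;
    t: 1-based step count, used for bias correction."""
    new_params, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(params, grads, m, v):
        mi = b1 * mi + (1 - b1) * g        # momentum: moving average of g
        vi = b2 * vi + (1 - b2) * g * g    # moving average of g^2
        m_hat = mi / (1 - b1 ** t)         # bias correction (both averages
        v_hat = vi / (1 - b2 ** t)         #  start at zero)
        new_params.append(p - lr * m_hat / (v_hat ** 0.5 + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_params, new_m, new_v
```

Minimizing f(x) = x² with this step drives x toward 0; the adaptive scaling makes early steps roughly `lr` in size regardless of the gradient's magnitude.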
Backpropagation
The algorithm that enables neural networks to learn by computing gradients efficiently
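A tiny worked example of the chain rule behind backpropagation, on a 1-1-1 network h = tanh(w1·x), y = w2·h. The gradient pushed back through each layer can be checked against finite differences:

```python
import math

def forward(x, w1, w2):
    """Tiny network: h = tanh(w1 * x), y = w2 * h."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def backward(x, w1, w2, h, dy):
    """Backprop: propagate dL/dy back through both layers via the chain rule."""
    dw2 = dy * h                    # y = w2 * h  =>  dy/dw2 = h
    dh = dy * w2                    # dy/dh = w2
    dw1 = dh * (1 - h * h) * x      # tanh'(z) = 1 - tanh(z)^2, z = w1 * x
    return dw1, dw2
```

For a squared-error loss L = ½(y − target)², the upstream gradient is simply `dy = y - target`, and the analytic gradients agree with numerical ones to high precision.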
Batch Normalization
Normalizing layer inputs to accelerate deep network training
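The core normalization, sketched for a single feature over a batch (the trainable scale `gamma` and shift `beta`, and the running statistics used at inference time, are simplified away here):

```python
def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of activations to zero mean / unit variance,
    then apply a learnable scale (gamma) and shift (beta)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in xs]
```

With `gamma=1, beta=0` the output batch has mean ≈ 0 and variance ≈ 1, which keeps layer inputs in a well-conditioned range as training proceeds.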
Dropout: Regularization for Neural Networks
Randomly dropping units during training to prevent overfitting
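A sketch of the standard "inverted dropout" formulation: units are zeroed with probability p during training, and the survivors are scaled by 1/(1−p) so the expected activation is unchanged and no rescaling is needed at test time:

```python
import random

def dropout(xs, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each unit with probability p during training
    and scale survivors by 1/(1-p); identity at evaluation time."""
    if not training or p == 0.0:
        return list(xs)
    return [0.0 if rng.random() < p else x / (1 - p) for x in xs]
```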
Maximum Likelihood Reinforcement Learning (MaxRL)
A recent idea for training models on pass/fail tasks in which the sampling distribution matters
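The source gives no algorithmic detail, so the following is only a loose illustration of one reading of "maximum likelihood" on pass/fail tasks: sample from the model, and take a maximum-likelihood (cross-entropy) step only on samples that pass, ignoring failures rather than pushing them down as a vanilla policy gradient would. The toy softmax "policy" below is an assumption for illustration, not the published method:

```python
import math, random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def maxrl_step(logits, passes, lr=0.5, rng=random):
    """Hedged MaxRL-style update (an assumption, not the published method):
    sample an action from the softmax policy; if it passes, take a
    maximum-likelihood gradient step toward it; if it fails, do nothing."""
    probs = softmax(logits)
    a = rng.choices(range(len(logits)), weights=probs)[0]
    if passes(a):
        # grad of log p(a) wrt logits is one-hot(a) - probs
        return [l + lr * ((1.0 if i == a else 0.0) - p)
                for i, (l, p) in enumerate(zip(logits, probs))], a
    return logits, a
```

Under this reading, probability mass concentrates on actions that pass while failed samples leave the model untouched.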
Policy Gradient Methods
Directly optimizing policies through gradient ascent on expected returns
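A minimal REINFORCE sketch on a multi-armed bandit, assuming a softmax policy over discrete actions: sample an action, observe its reward, and move the logits along reward × ∇ log π(a):

```python
import math, random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_step(logits, reward_fn, lr=0.1, rng=random):
    """One REINFORCE update: sample from the softmax policy, then step
    the logits along reward * grad log pi(a)."""
    probs = softmax(logits)
    a = rng.choices(range(len(logits)), weights=probs)[0]
    r = reward_fn(a)
    # grad of log pi(a) wrt logits is one-hot(a) - probs
    return [l + lr * r * ((1.0 if i == a else 0.0) - p)
            for i, (l, p) in enumerate(zip(logits, probs))]
```

Repeated updates shift probability mass toward the higher-reward arm, which is gradient ascent on the expected return in this one-step setting.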
Proximal Policy Optimization (PPO)
A stable, sample-efficient policy gradient algorithm for reinforcement learning
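The heart of PPO's stability is its clipped surrogate objective, shown here for a single action (the policy ratio is π_new(a)/π_old(a)): taking the minimum of the unclipped and clipped terms means updates that push the ratio outside [1−ε, 1+ε] earn no extra objective, which discourages destructively large policy steps:

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample: min of the unclipped and
    clipped policy-ratio terms, so the ratio is not rewarded for moving
    outside [1 - eps, 1 + eps]."""
    clipped = max(min(ratio, 1 + eps), 1 - eps)
    return min(ratio * advantage, clipped * advantage)
```

With ε = 0.2, a ratio of 1.5 on a positive advantage is credited as if it were 1.2, and a ratio of 0.5 on a negative advantage is penalized as if it were 0.8.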
Stable Marriage Problem
Finding a stable matching with the Gale-Shapley deferred acceptance algorithm
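A compact sketch of Gale-Shapley deferred acceptance, with preferences given as rank-ordered lists (indices 0..n−1 on both sides): free proposers work down their lists, and each reviewer holds the best offer seen so far, releasing the previous holder when a better one arrives:

```python
def gale_shapley(proposer_prefs, reviewer_prefs):
    """Deferred acceptance: returns a proposer -> reviewer matching that is
    stable (no proposer-reviewer pair prefer each other to their partners)."""
    n = len(proposer_prefs)
    next_choice = [0] * n          # next index to try in each proposer's list
    match = {}                     # reviewer -> current proposer
    # Precompute each reviewer's rank of every proposer for O(1) comparisons.
    rank = [{p: i for i, p in enumerate(prefs)} for prefs in reviewer_prefs]
    free = list(range(n))
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in match:
            match[r] = p                       # reviewer holds first offer
        elif rank[r][p] < rank[r][match[r]]:
            free.append(match[r])              # reviewer trades up
            match[r] = p
        else:
            free.append(p)                     # offer rejected; try next
    return {p: r for r, p in match.items()}
```

The proposing side gets its best achievable stable partner; the algorithm always terminates after at most n² proposals.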