Neural networks augmented with external memory and attention-based read/write heads
Neural Turing Machines (NTMs) augment neural networks with external memory, allowing them to learn algorithms that require explicit storage and retrieval. They represent a step toward neural networks that can reason like computers.
Motivation
Standard neural networks store information implicitly in weights and activations, which limits how much they can explicitly remember and later retrieve. Computers, by contrast, have explicit addressable memory. NTMs bridge this gap while remaining trainable end to end.
Architecture
An NTM consists of:
- Controller: Neural network (LSTM or feedforward)
- Memory Bank: An N × M matrix of memory locations (N slots, each a vector of width M)
- Read Head: Retrieves information from memory
- Write Head: Modifies memory contents
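A minimal sketch of this state in NumPy (the sizes and variable names below are illustrative, not taken from the paper):

```python
import numpy as np

N, M = 128, 20                     # N memory locations, each a vector of width M
memory = np.zeros((N, M))          # the memory bank
read_w = np.full(N, 1.0 / N)       # attention weights of a read head
write_w = np.full(N, 1.0 / N)      # attention weights of a write head

# The controller (an LSTM or feedforward network, omitted here) consumes the
# external input together with the previous read vectors and emits each head's
# addressing parameters (key, key strength, gate, shift distribution, sharpening)
# plus the write head's erase and add vectors.
```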
Reading from Memory
The read head produces attention weights over memory locations:
The read vector is a weighted sum of memory rows.
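In the paper's notation, with weights $w_t(i)$ that sum to one over the $N$ rows of the memory matrix $M_t$:

$$
r_t = \sum_i w_t(i)\, M_t(i)
$$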
Writing to Memory
Writing combines erase and add operations:
The erase vector selectively clears memory cells; the add vector then writes new content.
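With erase vector $e_t \in [0, 1]^M$ and add vector $a_t$, each memory row is first erased and then updated:

$$
\tilde{M}_t(i) = M_{t-1}(i)\,\big[\mathbf{1} - w_t(i)\, e_t\big], \qquad
M_t(i) = \tilde{M}_t(i) + w_t(i)\, a_t
$$

Both steps are differentiable, so the composite write can be trained by gradient descent. A minimal NumPy sketch of the read and write operations, assuming `memory` is an `N × M` array and `w` is a head's attention vector over the `N` rows (the function names are illustrative):

```python
import numpy as np

def read(memory, w):
    """Read vector: attention-weighted sum of memory rows."""
    return w @ memory                           # shape (M,)

def write(memory, w, erase, add):
    """Erase then add, both scaled by the write head's attention."""
    memory = memory * (1 - np.outer(w, erase))  # pull attended rows toward zero
    return memory + np.outer(w, add)            # blend in the new content
```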
Addressing Mechanisms
NTMs build each head's attention weights in four stages, combining content-based and location-based addressing:
1. Content Addressing
Compare a key vector to memory rows using cosine similarity:
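$$
w^c_t(i) = \frac{\exp\big(\beta_t\, K[k_t, M_t(i)]\big)}{\sum_j \exp\big(\beta_t\, K[k_t, M_t(j)]\big)}
$$

Here $k_t$ is the key emitted by the head, $K[\cdot,\cdot]$ is cosine similarity, and the key strength $\beta_t$ controls how sharply the softmax focuses on the best match.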
2. Interpolation
Blend content-based weights with previous weights:
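$$
w^g_t = g_t\, w^c_t + (1 - g_t)\, w_{t-1}
$$

The scalar gate $g_t \in (0, 1)$ decides how much of the previous step's weighting survives; with $g_t = 0$ the content lookup is ignored and the head continues from wherever it pointed before.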
3. Convolutional Shift
Shift attention to neighboring locations via circular convolution with a learned shift distribution:
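$$
\tilde{w}_t(i) = \sum_{j=0}^{N-1} w^g_t(j)\, s_t(i - j)
$$

Here $s_t$ is a distribution over the allowed offsets (for example $\{-1, 0, +1\}$) and indices are taken modulo $N$, so a head can step forward or backward through memory.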
4. Sharpening
Focus attention with a sharpening parameter γ ≥ 1:
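$$
w_t(i) = \frac{\tilde{w}_t(i)^{\gamma_t}}{\sum_j \tilde{w}_t(j)^{\gamma_t}}
$$

Sharpening counteracts the blurring introduced by the shift convolution. The four stages compose into one differentiable addressing function per head; below is a minimal NumPy sketch under illustrative names, assuming a shift distribution over $\{-1, 0, +1\}$ (a sketch, not the authors' code):

```python
import numpy as np

def address(memory, w_prev, key, beta, gate, shift, gamma):
    """Compute a head's attention weights from its addressing parameters."""
    # 1. Content addressing: softmax of key-strength-scaled cosine similarity.
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    w = np.exp(beta * sim)
    w /= w.sum()
    # 2. Interpolation with the previous weighting.
    w = gate * w + (1 - gate) * w_prev
    # 3. Circular convolution with shift = [s(-1), s(0), s(+1)].
    w = sum(shift[k] * np.roll(w, off) for k, off in enumerate((-1, 0, 1)))
    # 4. Sharpening and renormalization.
    w = w ** gamma
    return w / w.sum()

# Example: a key matching row 3 of a small memory gives weights peaked at index 3.
memory = np.eye(8, 20)
w = address(memory, np.full(8, 1 / 8), key=memory[3], beta=10.0,
            gate=1.0, shift=np.array([0.0, 1.0, 0.0]), gamma=2.0)
```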
What NTMs Can Learn
The paper demonstrated learning of:
- Copy: Reproduce an input sequence
- Associative Recall: Retrieve values by key
- Dynamic N-Grams: Adapt predictions to the statistics of recently seen symbols
- Priority Sort: Emit inputs ordered by their attached priority values
Legacy
NTMs pioneered ideas now central to modern AI:
- External memory augmentation
- Differentiable attention mechanisms
- Content-based retrieval
These ideas carried forward into Memory Networks and the Differentiable Neural Computer (DNC), and the differentiable, content-based attention at their core influenced Transformer attention.
Key Paper
- Neural Turing Machines — Graves, Wayne, Danihelka (2014): https://arxiv.org/abs/1410.5401