Step-through animations, made with Manim.

Each one is a small film: a single idea, frame by frame. Most have a step-through viewer so you can stop, read the caption, and move on.


№ 01 Featured

Step-through

optimization · beginner

Gradient Descent Visualization

Watch gradient descent step toward a minimum of a loss function by repeatedly moving in the direction of steepest descent, the negative gradient.

45s

Step-through

neural nets · beginner

Neural Network Forward Pass

See how input data flows through a neural network layer by layer, with activations lighting up as data is transformed.

40s

Step-through

linear algebra · beginner

Matrix Transformations

Visualize how matrices transform 2D space — stretching, rotating, and shearing the coordinate grid.

35s

Step-through

backpropagation · intermediate

Backpropagation Explained

The chain rule in action — watch gradients flow backward through a computation graph to update weights during training.

50s

№ 02 All animations · 6

Step-through

optimization · beginner

Gradient Descent Visualization

Watch gradient descent step toward a minimum of a loss function by repeatedly moving in the direction of steepest descent, the negative gradient.

45s
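
The update rule behind this animation can be sketched in a few lines. This is a minimal illustration, not the code used to render the film: the loss f(x) = (x - 3)^2, the starting point, the learning rate, and the helper name `gradient_descent` are all assumptions chosen for the example.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeat x <- x - lr * grad(x) and record every iterate."""
    x = x0
    path = [x]
    for _ in range(steps):
        x = x - lr * grad(x)  # step against the gradient (downhill)
        path.append(x)
    return path

# Hypothetical loss f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
path = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
x_final = path[-1]
print(f"start={path[0]}, after 1 step={path[1]:.3f}, final={x_final:.6f}")
```

With this loss each step shrinks the distance to the minimum at x = 3 by a constant factor of 0.8, so the recorded path is the kind of sequence a step-through viewer would show frame by frame.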

Step-through

neural nets · beginner

Neural Network Forward Pass

See how input data flows through a neural network layer by layer, with activations lighting up as data is transformed.

40s
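
As a rough sketch of what "layer by layer" means, the snippet below pushes one input through a tiny network and keeps every intermediate activation. The layer sizes, weights, ReLU choice, and function names are assumptions made for illustration, not values taken from the animation.

```python
def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]

def dense(W, b, x):
    """Compute W @ x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(layers, x):
    """Apply each (W, b) in turn, with ReLU on every layer except the last."""
    activations = [x]
    for i, (W, b) in enumerate(layers):
        z = dense(W, b, activations[-1])
        is_last = i == len(layers) - 1
        activations.append(z if is_last else relu(z))
    return activations

# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[2.0, 1.0]], [0.5]),
]
acts = forward(layers, [1.0, 2.0])
for depth, a in enumerate(acts):
    print(f"layer {depth}: {a}")
```

Returning the full list of activations, rather than only the output, mirrors what the viewer displays: the state of every layer at each step.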

Step-through

linear algebra · beginner

Matrix Transformations

Visualize how matrices transform 2D space — stretching, rotating, and shearing the coordinate grid.

35s
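
To make the three transformations concrete, here is a small sketch that applies a stretch, a rotation, and a shear to the corners of the unit square. The specific matrices (a 2x horizontal stretch, a 90 degree rotation, a unit shear) and the helper name `apply` are assumptions picked for the example.

```python
import math

def apply(M, p):
    """Multiply the 2x2 matrix M by the point p = (x, y)."""
    (a, b), (c, d) = M
    x, y = p
    return (a * x + b * y, c * x + d * y)

theta = math.pi / 2  # rotate by 90 degrees counter-clockwise
stretch = ((2.0, 0.0), (0.0, 1.0))  # doubles x, leaves y unchanged
rotate = ((math.cos(theta), -math.sin(theta)),
          (math.sin(theta), math.cos(theta)))
shear = ((1.0, 1.0), (0.0, 1.0))  # slides points right in proportion to y

corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for name, M in [("stretch", stretch), ("rotate", rotate), ("shear", shear)]:
    moved = [tuple(round(v, 3) for v in apply(M, p)) for p in corners]
    print(name, moved)
```

The columns of each matrix are where the basis vectors (1, 0) and (0, 1) land, which is why watching the grid deform is enough to read off the matrix.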

Step-through

backpropagation · intermediate

Backpropagation Explained

The chain rule in action — watch gradients flow backward through a computation graph to update weights during training.

50s
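
The backward flow of gradients can be sketched by hand on a very small graph. The example below assumes a single neuron with a tanh activation and a squared-error loss; the graph, the input values, the learning rate, and the name `forward_backward` are illustrative choices, not the graph shown in the film.

```python
import math

def forward_backward(w, b, x, y):
    """One forward and backward pass through loss = (tanh(w*x + b) - y)^2."""
    # Forward pass: keep each intermediate node of the graph.
    z = w * x + b
    a = math.tanh(z)
    loss = (a - y) ** 2
    # Backward pass: multiply local derivatives from the loss back to w and b.
    dloss_da = 2.0 * (a - y)
    da_dz = 1.0 - a * a          # derivative of tanh(z)
    dloss_dz = dloss_da * da_dz  # chain rule through the activation
    dloss_dw = dloss_dz * x      # dz/dw = x
    dloss_db = dloss_dz          # dz/db = 1
    return loss, dloss_dw, dloss_db

# Hypothetical parameters, data point, and learning rate.
w, b, x, y, lr = 0.5, 0.1, 2.0, 0.2, 0.1
loss_before, dw, db = forward_backward(w, b, x, y)
w, b = w - lr * dw, b - lr * db  # gradient step on the weights
loss_after, _, _ = forward_backward(w, b, x, y)
print(f"loss before={loss_before:.4f}, after={loss_after:.4f}")
```

Each backward line uses only the gradient arriving from the node after it and one local derivative, which is the pattern that scales to larger graphs.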

Step-through

activations · beginner

Activation Functions Compared

Compare ReLU, Sigmoid, Tanh, and GELU — understand why non-linearity is essential for neural networks.

35s
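
For reference, the four functions can be written with the standard library alone. This sketch uses the exact erf-based form of GELU (many libraries also offer a tanh approximation); the sample inputs and the closing two-layer example are assumptions added to illustrate the non-linearity point.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    return math.tanh(x)

def gelu(x):
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.3f}  sigmoid={sigmoid(x):.3f}  "
          f"tanh={tanh(x):+.3f}  gelu={gelu(x):+.3f}")

# Without an activation, two linear layers collapse into one:
# w2 * (w1 * x) equals (w2 * w1) * x for every x.
w1, w2 = 3.0, -2.0
collapsed = all(w2 * (w1 * x) == (w2 * w1) * x for x in (-1.0, 0.0, 2.5))
print("two linear layers equal one linear layer:", collapsed)
```

The last lines show the reason non-linearity matters: stacking purely linear layers adds no expressive power, because their composition is itself linear.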

Step-through

transformers · advanced

Self-Attention in Transformers

A detailed walkthrough of scaled dot-product self-attention, including Q/K/V projections, score scaling, softmax weights, and multi-head intuition.

120s
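
The steps named in the description (Q/K/V projections, score scaling, softmax weights) can be sketched for a single head as follows. The token embeddings, the identity projection matrices, and the helper names are assumptions for illustration; a real model learns the projections.

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the rows of X."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    # Scale scores by sqrt(d_k) so their magnitude does not grow with dimension.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return weights, matmul(weights, V)

# Hypothetical input: 3 tokens with 2-dimensional embeddings.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
weights, output = self_attention(X, identity, identity, identity)
for w_row, o_row in zip(weights, output):
    print([round(v, 3) for v in w_row], "->", [round(v, 3) for v in o_row])
```

Multi-head attention runs several copies of this computation, each with its own projection matrices, and concatenates the outputs before a final projection.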