Linear Transformations in Practice
3 of 36 · Mathematics for Machine Learning
Time to see linear algebra instead of just computing it. In this notebook-style lesson we apply concrete matrices to 2D points and watch what happens geometrically. By the end, you will have a geometric feel for rotations, scalings, shears, and projections, the building blocks of the linear map inside every neural network layer.
1. Drawing a Shape We Can Watch
We'll create a little house shape and transform it. Each column of shape is a 2D point.
import numpy as np
import matplotlib.pyplot as plt
# A closed outline of a house: bottom-left, bottom-right, top-right,
# roof-peak, top-left, back to bottom-left
shape = np.array([
    [0, 2, 2, 1, 0, 0],  # x-coords
    [0, 0, 1, 2, 1, 0],  # y-coords
])
def plot_shapes(original, transformed, title):
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.plot(original[0], original[1], 'b-', alpha=0.4, label='original')
    ax.plot(transformed[0], transformed[1], 'r-', linewidth=2, label='transformed')
    ax.axhline(0, color='gray', lw=0.5)
    ax.axvline(0, color='gray', lw=0.5)
    ax.set_aspect('equal')
    ax.legend()
    ax.set_title(title)
    ax.set_xlim(-3, 4)
    ax.set_ylim(-3, 4)
    plt.show()
2. Scaling
A diagonal matrix stretches each axis independently. Here we stretch x by a factor of 2 and compress y by a factor of 0.5:
S = np.array([[2.0, 0.0],
              [0.0, 0.5]])
transformed = S @ shape
plot_shapes(shape, transformed, "Scaling: S = diag(2, 0.5)")
The house becomes wider and shorter. This is the scaling step of a per-feature normalization layer: each feature is multiplied by its own scalar.
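To check the "each axis gets its own scalar" reading numerically, here is a minimal sketch. The array points and the names factors, same, and area_factor are illustrative and not part of the lesson's code.

```python
import numpy as np

S = np.array([[2.0, 0.0],
              [0.0, 0.5]])

# Three arbitrary 2D points stored as columns (illustrative data)
points = np.array([[1.0, 2.0, -3.0],
                   [4.0, 0.0,  2.0]])

scaled = S @ points

# Same result via broadcasting: multiply each row (axis) by its own factor
factors = np.diag(S).reshape(2, 1)
same = np.allclose(scaled, factors * points)
print("matrix product equals per-axis scaling:", same)

# The determinant is the area scale factor: 2 * 0.5 = 1
area_factor = np.linalg.det(S)
print("area scale factor:", area_factor)
```

With these particular factors the house changes proportions but keeps its area, since the stretch in x is exactly cancelled by the compression in y.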
3. Rotation
A rotation by angle $\theta$ (counter-clockwise) is
$$R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
theta = np.pi / 4 # 45 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
transformed = R @ shape
plot_shapes(shape, transformed, "Rotation by 45°")
Rotations are orthogonal: $R^\top R = I$. They preserve all lengths and angles, so the house has the same shape, just tilted. Orthogonal matrices are prized in ML weight initialization for exactly this reason: they don't inflate or shrink signal magnitudes.
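The orthogonality and length-preservation claims can be verified directly. This is a small sketch; the test array points and the result names are illustrative assumptions.

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Orthogonality: R^T R should equal the identity (up to rounding)
is_orthogonal = np.allclose(R.T @ R, np.eye(2))

# Arbitrary points stored as columns (illustrative data)
points = np.array([[2.0, 0.0, 1.0],
                   [0.0, 1.0, 2.0]])
rotated = R @ points

# Lengths are preserved: column norms match before and after
lengths_preserved = np.allclose(np.linalg.norm(points, axis=0),
                                np.linalg.norm(rotated, axis=0))

# Angles are preserved: all pairwise dot products are unchanged
dots_preserved = np.allclose(points.T @ points, rotated.T @ rotated)

print(is_orthogonal, lengths_preserved, dots_preserved)
```

Comparing the full matrix of dot products covers both lengths (the diagonal) and angles (the off-diagonal entries) in one check.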
4. Shear
A shear slides each point parallel to one axis by an amount proportional to its coordinate along the other axis:
Sh = np.array([[1.0, 1.0], # x becomes x + y
               [0.0, 1.0]])
transformed = Sh @ shape
plot_shapes(shape, transformed, "Horizontal shear")
The house's vertical walls lean over, turning its rectangular body into a parallelogram. Shears don't preserve angles, but they do preserve area (determinant = 1). They appear in data augmentation for images.
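Both claims, that area survives and angles do not, can be checked on a plain rectangle. The helpers polygon_area (shoelace formula) and angle_deg are hypothetical names written for this sketch, not part of the lesson.

```python
import numpy as np

def polygon_area(vertices):
    """Shoelace formula; vertices is a 2 x n array of corners in order."""
    x, y = vertices
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

Sh = np.array([[1.0, 1.0],
               [0.0, 1.0]])

# Corners of a 2 x 1 rectangle, listed counter-clockwise
rect = np.array([[0.0, 2.0, 2.0, 0.0],
                 [0.0, 0.0, 1.0, 1.0]])
sheared = Sh @ rect

area_before = polygon_area(rect)
area_after = polygon_area(sheared)

# Angle at the bottom-left corner, between the floor and the left wall
corner_before = angle_deg(rect[:, 1] - rect[:, 0], rect[:, 3] - rect[:, 0])
corner_after = angle_deg(sheared[:, 1] - sheared[:, 0], sheared[:, 3] - sheared[:, 0])

print("area:", area_before, "->", area_after)
print("corner angle:", corner_before, "->", corner_after)
```

The area stays at 2 while the right angle at the corner closes to 45 degrees, which matches a determinant of 1 for a matrix that is not orthogonal.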
5. Projection onto a Line
Here is a rank-deficient transformation: projecting every point in $\mathbb{R}^2$ onto the x-axis:
P = np.array([[1.0, 0.0],
              [0.0, 0.0]]) # kills the y-coordinate
transformed = P @ shape
plot_shapes(shape, transformed, "Projection onto x-axis")
print("rank:", np.linalg.matrix_rank(P)) # 1
print("det:", np.linalg.det(P)) # 0
The entire 2D house flattens to a 1D line segment. Information is lost and the transformation cannot be inverted — this is exactly what "rank deficient" means geometrically.
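Why the projection cannot be undone is easy to see in code: distinct inputs collide on the same output. A minimal sketch follows; the points a and b and the flag names are illustrative assumptions.

```python
import numpy as np

P = np.array([[1.0, 0.0],
              [0.0, 0.0]])

# Two different points that share an x-coordinate
a = np.array([1.0, 2.0])
b = np.array([1.0, -5.0])

# Both land on the same point of the x-axis, so no inverse can tell them apart
collide = np.array_equal(P @ a, P @ b)

# Projecting a second time changes nothing further
idempotent = np.array_equal(P @ P, P)

# NumPy raises LinAlgError when asked to invert a singular matrix
try:
    np.linalg.inv(P)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False

print(collide, idempotent, invertible)
```

Once a and b have been mapped to the same point, the y-coordinate is gone for good, which is the information loss described above.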
6. Composition: Stacking Transformations
Apply a rotation, then a scaling — in that order. Matrix multiplication composes right-to-left:
M = S @ R # first R, then S
transformed = M @ shape
plot_shapes(shape, transformed, "R then S (read M right to left)")
# Reverse order — different result!
M2 = R @ S # first S, then R
transformed2 = M2 @ shape
plot_shapes(shape, transformed2, "S then R")
The two composed transformations produce visibly different houses. That's the non-commutativity of matrix multiplication made geometric. A neural network is a long chain of such compositions, with nonlinearities in between.
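The right-to-left reading and the order dependence can both be confirmed on a single vector. This is a sketch; the vector v and the result names are illustrative.

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.array([[2.0, 0.0],
              [0.0, 0.5]])

v = np.array([1.0, 0.0])

# (S @ R) @ v is the same as rotating first, then scaling
step_by_step = S @ (R @ v)
composed = (S @ R) @ v
matches = np.allclose(step_by_step, composed)

# Swapping the order gives a different matrix and a different image of v
commute = np.allclose(S @ R, R @ S)
rotate_then_scale = (S @ R) @ v
scale_then_rotate = (R @ S) @ v

print("composition matches step-by-step:", matches)
print("S @ R equals R @ S:", commute)
print(rotate_then_scale, scale_then_rotate)
```

Rotating first puts part of v on the y-axis, where it is compressed by 0.5; scaling first doubles v while it still lies on the x-axis, so the two results differ.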
7. Eigenvectors: Directions That Don't Rotate
Here's a preview of Lesson 4. Take the shear matrix and look at its eigenvectors:
Sh = np.array([[1.0, 1.0],
               [0.0, 1.0]])
vals, vecs = np.linalg.eig(Sh)
print("eigenvalues:", vals) # [1. 1.]
print("eigenvectors:\n", vecs)
# [[1. -1.]
# [0. ... tiny number]]
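The x-axis direction $(1, 0)$ is an eigenvector with eigenvalue 1: applying the shear leaves it unchanged. Both columns printed above point along the x-axis (up to sign and rounding error), because this shear has only one independent eigenvector direction. Every vector off the x-axis has its direction changed by the shear. Eigenvectors are the stable directions of a transformation. Lesson 4 will develop this properly.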
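The defining property can be tested by hand without calling the eigen-solver. A minimal sketch; e1, v, and the 2D cross product used as a parallelism test are illustrative choices.

```python
import numpy as np

Sh = np.array([[1.0, 1.0],
               [0.0, 1.0]])

# The x-axis direction satisfies Sh @ e1 = 1 * e1
e1 = np.array([1.0, 0.0])
fixed = np.allclose(Sh @ e1, 1.0 * e1)

# A vector off the x-axis is sent to a non-parallel vector
v = np.array([0.0, 1.0])
w = Sh @ v

# 2D cross product: zero exactly when v and w are parallel
cross = v[0] * w[1] - v[1] * w[0]
direction_changed = not np.isclose(cross, 0.0)

print("e1 unchanged:", fixed)
print("v =", v, "-> w =", w, "direction changed:", direction_changed)
```

Here v points straight up and its image w points along the diagonal, so v is not an eigenvector, while e1 comes back unchanged.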