Linear Transformations in Practice

45 min read · notebook · Linear Algebra Foundations

Lesson 3 of 36 · Mathematics for Machine Learning

Time to see linear algebra instead of just computing it. In this notebook-style lesson we apply concrete matrices to 2D points and watch what happens geometrically. By the end, you will have a physical feel for rotations, scalings, shears, and projections, the building blocks of the linear maps inside every neural-network layer.

1. Drawing a Shape We Can Watch

We'll create a little house-shape and transform it. Each column of shape is a 2D point.

code

import numpy as np
import matplotlib.pyplot as plt

# A closed outline of a house: bottom-left, bottom-right, top-right,
# roof-peak, top-left, back to bottom-left
shape = np.array([
    [0, 2, 2, 1, 0, 0],   # x-coords
    [0, 0, 1, 2, 1, 0],   # y-coords
])

def plot_shapes(original, transformed, title):
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.plot(original[0],    original[1],    'b-',  alpha=0.4, label='original')
    ax.plot(transformed[0], transformed[1], 'r-',  linewidth=2, label='transformed')
    ax.axhline(0, color='gray', lw=0.5); ax.axvline(0, color='gray', lw=0.5)
    ax.set_aspect('equal'); ax.legend(); ax.set_title(title)
    ax.set_xlim(-3, 4); ax.set_ylim(-3, 4)
    plt.show()
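
Because each column holds one point, a single matrix product transforms every point at once. The sketch below checks that claim on a small example; the matrix `A` and the three sample points are arbitrary values chosen for illustration, not part of the lesson's house.

```python
import numpy as np

# Same layout as the lesson: row 0 holds x-coords, row 1 holds y-coords.
points = np.array([
    [0.0, 2.0, 1.0],
    [0.0, 0.0, 2.0],
])

# An arbitrary 2x2 matrix (illustrative assumption).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Transform all points with one matrix product...
all_at_once = A @ points

# ...or transform one column at a time and stack the results as columns.
one_by_one = np.column_stack(
    [A @ points[:, j] for j in range(points.shape[1])]
)

print(np.allclose(all_at_once, one_by_one))  # True
```

This is why the rest of the lesson can write `M @ shape` without looping over points.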

2. Scaling

A diagonal matrix stretches each axis independently. Stretch $x$ by 2 and $y$ by 0.5:

code

S = np.array([[2.0, 0.0],
              [0.0, 0.5]])

transformed = S @ shape
plot_shapes(shape, transformed, "Scaling: S = diag(2, 0.5)")

The house becomes wider and shorter. This is the multiplicative part of a per-feature normalization layer: each feature is multiplied by its own scalar.
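
To make the "independent stretch per axis" reading concrete, here is a small check on a single point. The point `p` is an arbitrary choice for illustration; the matrix is the same diag(2, 0.5) used above.

```python
import numpy as np

S = np.array([[2.0, 0.0],
              [0.0, 0.5]])

p = np.array([3.0, 4.0])          # arbitrary sample point

scaled = S @ p                    # matrix form of the scaling
elementwise = np.array([2.0, 0.5]) * p   # same thing, axis by axis

print(scaled)                             # [6. 2.]
print(np.allclose(scaled, elementwise))   # True

# The determinant is the factor by which areas change: 2 * 0.5 = 1,
# so this particular scaling reshapes the house without changing its area.
area_factor = np.linalg.det(S)
print(np.isclose(area_factor, 1.0))       # True
```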

3. Rotation

A rotation by angle $θ$ (counter-clockwise) is

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

code

theta = np.pi / 4  # 45 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

transformed = R @ shape
plot_shapes(shape, transformed, "Rotation by 45°")

Rotations are orthogonal: $R^{T} R = I$. They preserve all lengths and angles, so the house keeps its shape and is merely tilted. Orthogonal matrices are prized in ML weight initialization for exactly this reason: they don't inflate or shrink signal magnitudes.
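
The orthogonality claim is easy to verify numerically. The following sketch checks $R^{T} R = I$ and confirms that a length and a dot product (and hence an angle) survive the rotation; the test vectors `u` and `v` are arbitrary illustrative choices.

```python
import numpy as np

theta = np.pi / 4  # 45 degrees, as in the lesson
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([3.0, 4.0])    # arbitrary test vectors
v = np.array([-1.0, 2.0])

# Orthogonality: the transpose acts as the inverse.
is_orthogonal = np.allclose(R.T @ R, np.eye(2))

# Lengths are preserved...
keeps_length = np.isclose(np.linalg.norm(R @ u), np.linalg.norm(u))

# ...and so are dot products, which together with lengths fix angles.
keeps_dot = np.isclose((R @ u) @ (R @ v), u @ v)

print(is_orthogonal, keeps_length, keeps_dot)  # True True True
```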

4. Shear

A shear slides one axis along another:

code

Sh = np.array([[1.0, 1.0],    # x becomes x + y
               [0.0, 1.0]])

transformed = Sh @ shape
plot_shapes(shape, transformed, "Horizontal shear")

The house's rectangular body becomes a parallelogram. Shears don't preserve angles, but they do preserve area (determinant = 1). They appear in data augmentation for images.
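
Both properties can be checked directly. In this sketch the helper `angle_deg` is a name introduced here for illustration; it measures the angle between two vectors before and after the shear is applied to the two coordinate axes.

```python
import numpy as np

Sh = np.array([[1.0, 1.0],
               [0.0, 1.0]])

def angle_deg(a, b):
    """Angle between vectors a and b in degrees (illustrative helper)."""
    cos = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

angle_before = angle_deg(e1, e2)            # the axes start perpendicular
angle_after = angle_deg(Sh @ e1, Sh @ e2)   # e2 is pushed over to (1, 1)

print(f"{angle_before:.1f} -> {angle_after:.1f}")  # 90.0 -> 45.0

# Area factor: the unit square maps to a parallelogram of the same area.
area_factor = np.linalg.det(Sh)
print(np.isclose(area_factor, 1.0))                # True
```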

5. Projection onto a Line

Here is a rank-deficient transformation, one that projects every point in $\mathbb{R}^{2}$ onto the x-axis:

code

P = np.array([[1.0, 0.0],
              [0.0, 0.0]])      # kills the y-coordinate

transformed = P @ shape
plot_shapes(shape, transformed, "Projection onto x-axis")
print("rank:", np.linalg.matrix_rank(P))     # 1
print("det:", np.linalg.det(P))               # 0

The entire 2D house flattens to a 1D line segment. Information is lost and the transformation cannot be inverted — this is exactly what "rank deficient" means geometrically.
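
A short check of what "information is lost" means in practice: two different points land on the same image, so no inverse can tell them apart. The points `a` and `b` are arbitrary illustrative choices that share an x-coordinate.

```python
import numpy as np

P = np.array([[1.0, 0.0],
              [0.0, 0.0]])

a = np.array([1.0, 2.0])    # arbitrary point
b = np.array([1.0, -5.0])   # different point, same x-coordinate

# Distinct inputs, identical outputs: the map is not one-to-one.
same_image = np.array_equal(P @ a, P @ b)

# Projecting twice changes nothing more: P @ P equals P.
idempotent = np.array_equal(P @ P, P)

# NumPy refuses to invert an exactly singular matrix.
try:
    np.linalg.inv(P)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False

print(same_image, idempotent, invertible)  # True True False
```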

6. Composition: Stacking Transformations

Apply a rotation, then a scaling — in that order. Matrix multiplication composes right-to-left:

code

M = S @ R      # first R, then S
transformed = M @ shape
plot_shapes(shape, transformed, "R then S (read M right to left)")

# Reverse order — different result!
M2 = R @ S     # first S, then R
transformed2 = M2 @ shape
plot_shapes(shape, transformed2, "S then R")

The two composed transformations produce visibly different houses. That's the non-commutativity of matrix multiplication made geometric. A neural network is a long chain of such compositions, with nonlinear activations between the linear steps.
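
The right-to-left reading can be confirmed on a single vector. This sketch rebuilds `S` and `R` with the same values as earlier sections and follows one arbitrary test vector `v` through both orders.

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.array([[2.0, 0.0],
              [0.0, 0.5]])

v = np.array([1.0, 0.0])         # arbitrary test vector

step_by_step = S @ (R @ v)       # rotate first, then scale
composed = (S @ R) @ v           # same thing via the product matrix
other_order = (R @ S) @ v        # scale first, then rotate

print(np.allclose(step_by_step, composed))   # True
print(np.allclose(composed, other_order))    # False
print(np.allclose(S @ R, R @ S))             # False
```

Here rotate-then-scale sends `v` to roughly (1.41, 0.35), while scale-then-rotate sends it to roughly (1.41, 1.41).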

7. Eigenvectors: Directions That Don't Rotate

Here's a preview of Lesson 4. Take the shear matrix Sh and look at its eigenvectors:

code

Sh = np.array([[1.0, 1.0],
               [0.0, 1.0]])

vals, vecs = np.linalg.eig(Sh)
print("eigenvalues:", vals)   # [1. 1.]
print("eigenvectors:\n", vecs)
# [[ 1.  -1.    ]
#  [ 0.   ~2e-16]]   <- rounding noise: both columns lie along the x-axis

The x-axis direction ($[1, 0]^{T}$) is an eigenvector: applying the shear leaves it unchanged. Every vector off the x-axis has its direction changed by the shear, which is why both columns returned by eig point (up to rounding) along the x-axis. Eigenvectors are the stable directions of a transformation. Lesson 4 will develop this properly.
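
As a quick sanity check on that statement, the sketch below applies the shear to the x-axis direction and to one off-axis vector. The vector `w` and the 2D cross-product test for parallelism are illustrative choices made here, not something the lesson prescribes.

```python
import numpy as np

Sh = np.array([[1.0, 1.0],
               [0.0, 1.0]])

e1 = np.array([1.0, 0.0])
fixed = np.array_equal(Sh @ e1, e1)   # eigenvector with eigenvalue 1

w = np.array([1.0, 1.0])              # arbitrary off-axis vector
image = Sh @ w                        # becomes (2, 1)

# 2D cross product: zero exactly when the two vectors are parallel.
cross = w[0] * image[1] - w[1] * image[0]
direction_changed = not np.isclose(cross, 0.0)

# Sh - I has rank 1, so its null space (the eigenvectors for
# eigenvalue 1) is only one-dimensional: just the x-axis.
eigenspace_dim = 2 - np.linalg.matrix_rank(Sh - np.eye(2))

print(fixed, direction_changed, eigenspace_dim)  # True True 1
```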

8. The Exercise
