Activation Functions Compared

Compare ReLU, Sigmoid, Tanh, and GELU — understand why non-linearity is essential for neural networks.

\text{Without an activation function: } f(x) = W_2(W_1 x) = (W_2 W_1)x = W'x
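The collapse in the equation above can be checked numerically. The following is a minimal sketch, assuming NumPy and small hand-picked matrices. The values of `W1`, `W2`, and `x`, and the helper names `relu` and `f_relu`, are illustrative and not from the original. It shows that two stacked linear layers equal the single layer `W_prime = W2 @ W1`, and that placing a ReLU between them breaks linearity.

```python
import numpy as np

# Illustrative weights and input (assumed values, chosen for easy arithmetic).
W1 = np.array([[1.0, -1.0],
               [0.5,  2.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, 2.0])


def relu(z):
    """Element-wise ReLU: max(z, 0)."""
    return np.maximum(z, 0.0)


# Without an activation: W2 (W1 x) equals (W2 W1) x.
stacked = W2 @ (W1 @ x)      # two layers applied in sequence -> [3.5]
W_prime = W2 @ W1            # the single equivalent matrix W'
collapsed = W_prime @ x      # one layer -> [3.5]
linear_collapses = np.allclose(stacked, collapsed)


# With ReLU between the layers, the network is no longer a linear map.
def f_relu(v):
    return W2 @ relu(W1 @ v)


# Any linear map g satisfies g(v) + g(-v) = g(0) = 0. Check whether f_relu does.
lhs = f_relu(x) + f_relu(-x)            # [4.5] + [1.0] = [5.5]
rhs = f_relu(np.zeros_like(x))          # [0.0]
relu_behaves_linearly = np.allclose(lhs, rhs)

print("linear layers collapse:", linear_collapses)
print("ReLU network behaves linearly:", relu_behaves_linearly)
```

Because `f_relu(x) + f_relu(-x)` is not zero, no single matrix can reproduce the ReLU network, which is the extra expressive power the non-linearity provides.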

Why Activation Functions?