Compare ReLU, Sigmoid, Tanh, and GELU, and understand why non-linearity is essential for neural networks.
Why Activation Functions?
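As a rough illustration of why non-linearity matters, the sketch below defines the four activations named above in plain Python and compares two stacked layers with and without an activation between them. It assumes a single scalar neuron per layer, and the weights `w1`, `b1`, `w2`, `b2` and all function names are made up for the example. GELU is written in its exact form, `x * Phi(x)`, using the standard normal CDF; libraries often use a tanh-based approximation instead.

```python
import math


def relu(x: float) -> float:
    # Passes positive inputs through unchanged, clamps negatives to zero.
    return max(0.0, x)


def sigmoid(x: float) -> float:
    # Squashes any real input into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x: float) -> float:
    # Squashes any real input into the open interval (-1, 1).
    return math.tanh(x)


def gelu(x: float) -> float:
    # Exact GELU: x times the standard normal CDF evaluated at x.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Illustrative (hypothetical) weights for two scalar layers.
w1, b1, w2, b2 = 2.0, 1.0, -3.0, 0.5


def two_linear_layers(x: float) -> float:
    # Two layers with no activation in between.
    return w2 * (w1 * x + b1) + b2


def collapsed_layer(x: float) -> float:
    # The same mapping rewritten as ONE linear layer:
    # weight = w2 * w1, bias = w2 * b1 + b2.
    return (w2 * w1) * x + (w2 * b1 + b2)


def two_layers_with_relu(x: float) -> float:
    # Inserting ReLU between the layers breaks the collapse.
    return w2 * relu(w1 * x + b1) + b2


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, two_linear_layers(x), collapsed_layer(x), two_layers_with_relu(x))
```

Without an activation, the two layers agree with the single collapsed layer at every input, so the extra depth adds no expressive power. With ReLU in between, the output changes slope across the input range, which no single linear layer can reproduce.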