Lesson 12 of 15

Activation Functions

Activation functions introduce non-linearity into neural networks. Without them, stacking linear layers is equivalent to a single linear layer — the network could not learn complex patterns.

ReLU

Rectified Linear Unit — the most widely used activation in modern deep learning:

\text{ReLU}(x) = \max(0, x)

It is cheap to compute and, because its gradient is 1 for all positive inputs, it helps avoid the vanishing gradient problem.
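As a minimal sketch, a scalar ReLU follows directly from the definition:

```python
def relu(x):
    # max(0, x): pass positive inputs through unchanged, zero out the rest
    return max(0.0, x)

print(relu(3.5))   # 3.5
print(relu(-2.0))  # 0.0
```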

Leaky ReLU

Fixes the "dying ReLU" problem by allowing a small gradient for negative inputs:

\text{LeakyReLU}(x) = \begin{cases} x & x > 0 \\ \alpha x & x \leq 0 \end{cases}

where \alpha is a small constant (default 0.01).
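A sketch of the piecewise definition, with `alpha` defaulting to 0.01 as above:

```python
def leaky_relu(x, alpha=0.01):
    # Negative inputs keep a small slope alpha instead of being zeroed out
    return x if x > 0 else alpha * x

print(leaky_relu(5.0))   # 5.0
print(leaky_relu(-2.0))  # -0.02
```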

Tanh

The hyperbolic tangent squashes inputs to (-1, 1):

\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}
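A direct transcription of this formula might look like the following sketch; note that `math.exp` overflows for very large inputs, so it assumes moderate x (the standard library's `math.tanh` handles the full range):

```python
import math

def tanh_activation(x):
    # (e^{2x} - 1) / (e^{2x} + 1), computing e^{2x} once
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)

print(tanh_activation(0.0))  # 0.0
```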

Softmax

Used in the output layer for multi-class classification. Converts a vector of logits into a probability distribution:

\text{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}

All outputs are in (0, 1) and sum to 1.
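A sketch over a list of logits. Subtracting the maximum logit before exponentiating is not part of the formula itself; it is a standard numerical-stability trick that leaves the result unchanged while preventing overflow on large inputs:

```python
import math

def softmax(xs):
    # Shifting by max(xs) is mathematically a no-op (the factor cancels
    # in the ratio) but keeps math.exp from overflowing on large logits
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)       # largest logit gets the largest probability
print(sum(probs))  # sums to 1 (up to floating-point error)
```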

Your Task

Implement:

  • relu(x) → \max(0, x)
  • leaky_relu(x, alpha=0.01)
  • tanh_activation(x) → computed via (e^{2x} - 1)/(e^{2x} + 1)
  • softmax(x) → list of probabilities