Lesson 12 of 15
Activation Functions
Activation functions introduce non-linearity into neural networks. Without them, stacking linear layers is equivalent to a single linear layer — the network could not learn complex patterns.
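The collapse of stacked linear layers can be checked directly with a small numeric sketch (the weights and inputs here are arbitrary illustrative values):

```python
# Two stacked linear layers with no activation collapse to one linear map:
# y = w2*(w1*x + b1) + b2  ==  (w2*w1)*x + (w2*b1 + b2)
w1, b1 = 2.0, 1.0   # first layer (illustrative values)
w2, b2 = 3.0, -0.5  # second layer

x = 4.0
stacked = w2 * (w1 * x + b1) + b2          # pass through both layers
single = (w2 * w1) * x + (w2 * b1 + b2)    # one equivalent linear layer
assert stacked == single                   # identical: no extra expressive power
```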
ReLU
Rectified Linear Unit, the most widely used activation in modern deep learning:

ReLU(x) = max(0, x)

It is cheap to compute, and its gradient does not shrink for positive inputs, which mitigates the vanishing gradient problem.
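A minimal scalar sketch of ReLU:

```python
def relu(x):
    # Pass positive inputs through unchanged; zero out negatives.
    return max(0.0, x)
```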
Leaky ReLU
Fixes the "dying ReLU" problem by allowing a small gradient for negative inputs:

LeakyReLU(x) = x if x > 0, else αx

where α is a small constant (default α = 0.01).
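The same sketch with the negative branch scaled by alpha instead of zeroed:

```python
def leaky_relu(x, alpha=0.01):
    # Negative inputs keep a small slope alpha, so their gradient is nonzero.
    return x if x > 0 else alpha * x
```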
Tanh
The hyperbolic tangent squashes inputs to the range (-1, 1):

tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
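A direct translation of the exponential form (in practice `math.tanh` computes the same value):

```python
import math

def tanh_activation(x):
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x)
    ex, enx = math.exp(x), math.exp(-x)
    return (ex - enx) / (ex + enx)
```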
Softmax
Used in the output layer for multi-class classification. Converts a vector of logits into a probability distribution:

softmax(x)_i = e^(x_i) / Σ_j e^(x_j)

All outputs are in (0, 1) and sum to 1.
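A list-based sketch; subtracting the maximum logit before exponentiating is a standard trick that avoids overflow without changing the result:

```python
import math

def softmax(logits):
    # Shift by the max logit for numerical stability (result is unchanged).
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```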
Your Task
Implement:

- relu(x)
- leaky_relu(x, alpha=0.01)
- tanh_activation(x), computed via the exponential definition
- softmax(x), returning a list of probabilities