Lesson 8 of 15

Multi-Layer Backpropagation

Backprop Through Two Layers

Now extend backpropagation to a two-layer network. The architecture:

z_1 = w_1 x + b_1 \qquad a_1 = \text{ReLU}(z_1)

z_2 = w_2 a_1 + b_2 \qquad a_2 = \sigma(z_2)

\mathcal{L} = (a_2 - y)^2

Forward Pass

Compute z_1 \to a_1 \to z_2 \to a_2 \to \mathcal{L} in order.
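The forward pass can be sketched as a small scalar function (the helper names `relu`, `sigmoid`, and `two_layer_forward` are illustrative, not part of the lesson's starter code):

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def two_layer_forward(x, w1, b1, w2, b2, y):
    # Compute and return every intermediate value, in order,
    # so the backward pass can reuse them.
    z1 = w1 * x + b1
    a1 = relu(z1)
    z2 = w2 * a1 + b2
    a2 = sigmoid(z2)
    loss = (a2 - y) ** 2
    return z1, a1, z2, a2, loss
```

Caching every intermediate value here is deliberate: the backward pass needs z_1, a_1, and a_2, so they should not be recomputed or discarded.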

Backward Pass

Work backwards using the chain rule. For this loss and output activation, \frac{\partial \mathcal{L}}{\partial a_2} = 2(a_2 - y) and \frac{\partial a_2}{\partial z_2} = \sigma(z_2)\,(1 - \sigma(z_2)):

\frac{\partial \mathcal{L}}{\partial w_2} = \frac{\partial \mathcal{L}}{\partial a_2} \cdot \frac{\partial a_2}{\partial z_2} \cdot a_1

\frac{\partial \mathcal{L}}{\partial b_2} = \frac{\partial \mathcal{L}}{\partial a_2} \cdot \frac{\partial a_2}{\partial z_2}

To reach layer 1, we propagate the error signal through w_2:

\frac{\partial \mathcal{L}}{\partial a_1} = \frac{\partial \mathcal{L}}{\partial z_2} \cdot w_2

\frac{\partial \mathcal{L}}{\partial w_1} = \frac{\partial \mathcal{L}}{\partial a_1} \cdot \text{ReLU}'(z_1) \cdot x

\frac{\partial \mathcal{L}}{\partial b_1} = \frac{\partial \mathcal{L}}{\partial a_1} \cdot \text{ReLU}'(z_1)

This is the backpropagation algorithm: a forward pass to compute and cache activations, then a backward pass to propagate error signals from the loss back to every parameter.
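As a reference, the equations above translate directly into scalar Python. This is one possible sketch (helper names and the intermediate-variable naming are illustrative), not the lesson's official solution:

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def two_layer_backward(x, w1, b1, w2, b2, y):
    # Forward pass: cache every intermediate value.
    z1 = w1 * x + b1
    a1 = relu(z1)
    z2 = w2 * a1 + b2
    a2 = sigmoid(z2)

    # Backward pass: apply the chain rule from the loss inward.
    dL_da2 = 2.0 * (a2 - y)            # dL/da2 for squared error
    da2_dz2 = a2 * (1.0 - a2)          # sigma'(z2), using a2 = sigma(z2)
    dL_dz2 = dL_da2 * da2_dz2

    dw2 = dL_dz2 * a1                  # layer-2 gradients
    db2 = dL_dz2

    dL_da1 = dL_dz2 * w2               # propagate the error through w2
    dL_dz1 = dL_da1 * (1.0 if z1 > 0 else 0.0)  # ReLU'(z1)

    dw1 = dL_dz1 * x                   # layer-1 gradients
    db1 = dL_dz1
    return dw1, db1, dw2, db2
```

A quick sanity check for any backward pass is to compare each analytic gradient against a finite-difference estimate of the loss; they should agree to several decimal places.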

Your Task

Implement two_layer_backward(x, w1, b1, w2, b2, y) returning (dw1, db1, dw2, db2).
