Lesson 8 of 15

Multi-Layer Backpropagation

Backprop Through Two Layers

Now extend backpropagation to a two-layer network. The architecture:

z_1 = w_1 x + b_1 \qquad a_1 = \text{ReLU}(z_1)

z_2 = w_2 a_1 + b_2 \qquad a_2 = \sigma(z_2)

\mathcal{L} = (a_2 - y)^2

Forward Pass

Compute z_1 \to a_1 \to z_2 \to a_2 \to \mathcal{L} in order.
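The forward pass can be sketched as a small scalar function (the helper names `relu`, `sigmoid`, and `two_layer_forward` are illustrative, not part of the lesson's starter code):

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def two_layer_forward(x, w1, b1, w2, b2, y):
    # Compute and return every intermediate value, in order,
    # so the backward pass can reuse them.
    z1 = w1 * x + b1
    a1 = relu(z1)
    z2 = w2 * a1 + b2
    a2 = sigmoid(z2)
    loss = (a2 - y) ** 2
    return z1, a1, z2, a2, loss
```

Caching every intermediate value here is deliberate: the backward pass needs z_1, a_1, and a_2, so they should not be recomputed or discarded.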

Backward Pass

Work backwards using the chain rule. For this loss and output activation, \frac{\partial \mathcal{L}}{\partial a_2} = 2(a_2 - y) and \frac{\partial a_2}{\partial z_2} = \sigma(z_2)\,(1 - \sigma(z_2)):

\frac{\partial \mathcal{L}}{\partial w_2} = \frac{\partial \mathcal{L}}{\partial a_2} \cdot \frac{\partial a_2}{\partial z_2} \cdot a_1

\frac{\partial \mathcal{L}}{\partial b_2} = \frac{\partial \mathcal{L}}{\partial a_2} \cdot \frac{\partial a_2}{\partial z_2}

To reach layer 1, we propagate the error signal through w_2:

\frac{\partial \mathcal{L}}{\partial a_1} = \frac{\partial \mathcal{L}}{\partial z_2} \cdot w_2

\frac{\partial \mathcal{L}}{\partial w_1} = \frac{\partial \mathcal{L}}{\partial a_1} \cdot \text{ReLU}'(z_1) \cdot x

\frac{\partial \mathcal{L}}{\partial b_1} = \frac{\partial \mathcal{L}}{\partial a_1} \cdot \text{ReLU}'(z_1)

This is the backpropagation algorithm: a forward pass to compute and cache activations, then a backward pass to propagate error signals from the loss back to every parameter.
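As a reference, the equations above translate directly into scalar Python. This is one possible sketch (helper names and the intermediate-variable naming are illustrative), not the lesson's official solution:

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def two_layer_backward(x, w1, b1, w2, b2, y):
    # Forward pass: cache every intermediate value.
    z1 = w1 * x + b1
    a1 = relu(z1)
    z2 = w2 * a1 + b2
    a2 = sigmoid(z2)

    # Backward pass: apply the chain rule from the loss inward.
    dL_da2 = 2.0 * (a2 - y)            # dL/da2 for squared error
    da2_dz2 = a2 * (1.0 - a2)          # sigma'(z2), using a2 = sigma(z2)
    dL_dz2 = dL_da2 * da2_dz2

    dw2 = dL_dz2 * a1                  # layer-2 gradients
    db2 = dL_dz2

    dL_da1 = dL_dz2 * w2               # propagate the error through w2
    dL_dz1 = dL_da1 * (1.0 if z1 > 0 else 0.0)  # ReLU'(z1)

    dw1 = dL_dz1 * x                   # layer-1 gradients
    db1 = dL_dz1
    return dw1, db1, dw2, db2
```

A quick sanity check for any backward pass is to compare each analytic gradient against a finite-difference estimate of the loss; they should agree to several decimal places.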

Your Task

Implement two_layer_backward(x, w1, b1, w2, b2, y) returning (dw1, db1, dw2, db2).
