Lesson 7 of 15

The Dense Layer

Vectorizing to Multiple Neurons

A single neuron maps $\mathbb{R}^n \to \mathbb{R}$. A dense layer (fully connected layer) stacks $m$ neurons in parallel, mapping $\mathbb{R}^n \to \mathbb{R}^m$:

$$\mathbf{z} = W\mathbf{x} + \mathbf{b}$$

  • $W \in \mathbb{R}^{m \times n}$ — weight matrix; row $j$ holds the weights of neuron $j$
  • $\mathbf{b} \in \mathbb{R}^m$ — bias vector
  • $\mathbf{x} \in \mathbb{R}^n$ — input vector
  • $\mathbf{z} \in \mathbb{R}^m$ — pre-activation outputs

Each output $z_j = \sum_{k=1}^n W_{jk} x_k + b_j$ is an independent neuron computation.
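To make the formula concrete, here is a minimal sketch of the forward computation in plain Python, using small hypothetical numbers ($n = 3$ inputs, $m = 2$ neurons); the variable names mirror the math above:

```python
# Hypothetical example: m = 2 neurons, n = 3 inputs.
W = [[0.2, -0.5, 1.0],   # row 0: weights of neuron 0
     [0.7,  0.3, -0.1]]  # row 1: weights of neuron 1
b = [0.5, -1.0]          # one bias per neuron
x = [1.0, 2.0, 3.0]      # input vector

# z_j = sum_k W[j][k] * x[k] + b[j], computed independently for each neuron j
z = [sum(W[j][k] * x[k] for k in range(len(x))) + b[j]
     for j in range(len(b))]
# z is approximately [2.7, 0.0]
```

Note that each entry of `z` depends only on its own row of `W` and its own bias, which is why the layer is just $m$ single-neuron computations stacked together.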

DenseLayer Class

We represent a layer as an object with:

  • weights: list of mm rows, each of length nn
  • biases: list of mm scalars
  • forward(inputs): computes $W\mathbf{x} + \mathbf{b}$

We initialize with small Gaussian weights (standard deviation 0.1) and zero biases. Large initial weights cause saturated activations that kill gradients.
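The initialization scheme can be sketched as follows. This is not the full DenseLayer solution, just the weight/bias setup described above, with assumed example sizes (`in_features=3`, `out_features=2`):

```python
import random

random.seed(42)  # fixed seed so the random weights are reproducible

in_features, out_features = 3, 2  # assumed example sizes

# One row of in_features small Gaussian weights per neuron (std dev 0.1),
# and one zero bias per neuron.
weights = [[random.gauss(0, 0.1) for _ in range(in_features)]
           for _ in range(out_features)]
biases = [0.0] * out_features

print(len(weights), len(weights[0]))  # → 2 3
```

With a standard deviation of 0.1, the initial pre-activations stay near zero, where common activation functions have their steepest (non-saturated) region.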

Your Task

Implement the DenseLayer class with:

  • __init__(self, in_features, out_features) — initialize weights with random.gauss(0, 0.1) (set random.seed(42) before the loop) and zero biases
  • forward(self, inputs) — compute and return the output vector