Lesson 7 of 15

The Dense Layer

Vectorizing to Multiple Neurons

A single neuron maps $\mathbb{R}^n \to \mathbb{R}$. A dense layer (fully connected layer) stacks $m$ neurons in parallel, mapping $\mathbb{R}^n \to \mathbb{R}^m$:

$$\mathbf{z} = W\mathbf{x} + \mathbf{b}$$

  • $W \in \mathbb{R}^{m \times n}$ — weight matrix; row $j$ holds the weights of neuron $j$
  • $\mathbf{b} \in \mathbb{R}^m$ — bias vector
  • $\mathbf{x} \in \mathbb{R}^n$ — input vector
  • $\mathbf{z} \in \mathbb{R}^m$ — pre-activation outputs

Each output $z_j = \sum_{k=1}^n W_{jk} x_k + b_j$ is an independent neuron computation.
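To make the formula concrete, here is a minimal sketch of the forward computation in plain Python, using small hypothetical numbers ($n = 3$ inputs, $m = 2$ neurons); the variable names mirror the math above:

```python
# Hypothetical example: m = 2 neurons, n = 3 inputs.
W = [[0.2, -0.5, 1.0],   # row 0: weights of neuron 0
     [0.7,  0.3, -0.1]]  # row 1: weights of neuron 1
b = [0.5, -1.0]          # one bias per neuron
x = [1.0, 2.0, 3.0]      # input vector

# z_j = sum_k W[j][k] * x[k] + b[j], computed independently for each neuron j
z = [sum(W[j][k] * x[k] for k in range(len(x))) + b[j]
     for j in range(len(b))]
# z is approximately [2.7, 0.0]
```

Note that each entry of `z` depends only on its own row of `W` and its own bias, which is why the layer is just $m$ single-neuron computations stacked together.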

DenseLayer Class

We represent a layer as an object with:

  • weights: list of mm rows, each of length nn
  • biases: list of mm scalars
  • forward(inputs): computes $W\mathbf{x} + \mathbf{b}$

We initialize with small Gaussian weights (standard deviation 0.1) and zero biases. Large initial weights cause saturated activations that kill gradients.
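The initialization scheme can be sketched as follows. This is not the full DenseLayer solution, just the weight/bias setup described above, with assumed example sizes (`in_features=3`, `out_features=2`):

```python
import random

random.seed(42)  # fixed seed so the random weights are reproducible

in_features, out_features = 3, 2  # assumed example sizes

# One row of in_features small Gaussian weights per neuron (std dev 0.1),
# and one zero bias per neuron.
weights = [[random.gauss(0, 0.1) for _ in range(in_features)]
           for _ in range(out_features)]
biases = [0.0] * out_features

print(len(weights), len(weights[0]))  # → 2 3
```

With a standard deviation of 0.1, the initial pre-activations stay near zero, where common activation functions have their steepest (non-saturated) region.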

Your Task

Implement the DenseLayer class with:

  • __init__(self, in_features, out_features) — initialize weights with random.gauss(0, 0.1) (set random.seed(42) before the loop) and zero biases
  • forward(self, inputs) — compute and return the output vector