Introduction
Why Neural Networks from Scratch?
The Machine Learning course taught you perceptrons and forward passes. MicroGPT jumps straight to transformers. This course bridges the gap: you implement backpropagation and multi-layer networks from the ground up, building the foundation every modern deep learning system rests on.
Understanding backprop at the equation level transforms neural networks from black boxes into transparent computations. When you can derive the gradients by hand and implement them in Python, every deep learning paper becomes readable.
How This Course Works
Each lesson introduces one concept with its mathematical derivation, then asks you to implement it in pure Python — no NumPy, no PyTorch. Tests verify your implementation against exact expected outputs.
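To illustrate the pure-Python style (a sketch of the flavor of code you will write, not the course's actual starter code), a single sigmoid neuron needs nothing beyond the standard library:

```python
import math

def sigmoid(z):
    # Logistic activation: squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, bias, inputs):
    # Weighted sum of inputs, plus bias, passed through the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# With these weights the weighted sum is 0, so the output is exactly 0.5.
print(neuron([0.5, -0.5], 0.0, [1.0, 1.0]))  # → 0.5
```

Because everything is explicit list-and-loop Python, tests can compare your output against exact expected values with no tolerance for hidden library behavior.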
What You Will Build
- Foundations — Neurons, activation functions (sigmoid, ReLU, tanh), and loss functions (MSE, binary cross-entropy)
- Gradients — Numerical gradient estimation, activation derivatives, and single-layer backpropagation
- Backpropagation — Multi-layer backprop, dense layers, and network forward passes
- Training — Gradient descent updates, the training loop, Xavier initialization, and L2 regularization
- Advanced Techniques — The Adam optimizer and a two-layer network that solves XOR
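As a taste of how the gradient material fits together (a hedged sketch; the course's own exercises will define their own function names), a central-difference estimate can check an analytic activation derivative:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    # Analytic derivative: sigmoid(z) * (1 - sigmoid(z)).
    s = sigmoid(z)
    return s * (1.0 - s)

def numerical_gradient(f, x, h=1e-5):
    # Central-difference estimate: (f(x+h) - f(x-h)) / (2h).
    return (f(x + h) - f(x - h)) / (2.0 * h)

z = 0.7
analytic = sigmoid_prime(z)
numeric = numerical_gradient(sigmoid, z)
print(abs(analytic - numeric) < 1e-9)  # → True
```

This same check scales up: once backprop computes gradients for a whole network, numerical estimation is the standard way to verify them.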
By the end, you will have implemented every component needed to train a real neural network.