Lesson 3 of 15

Loss Functions

Measuring Error

A loss function (also called a cost function or objective) measures how wrong our network's predictions are. Training adjusts the network's parameters to reduce this number. The choice of loss function depends on the task.

Mean Squared Error (Regression)

For regression, we want predictions close to continuous target values:

$$\mathcal{L}_{\text{MSE}} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$$

MSE penalizes large errors quadratically — a prediction off by 2 incurs 4× the loss of one off by 1.
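As a sketch (assuming NumPy is available, as in most numeric Python runtimes), MSE is a one-liner over arrays:

```python
import numpy as np

def mse(predictions, targets):
    # Mean of squared differences; broadcasting handles matching shapes.
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.mean((predictions - targets) ** 2))

# A prediction off by 2 contributes 4x the loss of one off by 1:
print(mse([1.0, 2.0], [0.0, 0.0]))  # (1 + 4) / 2 = 2.5
```

Note the quadratic penalty in action: the second prediction's error of 2 accounts for four fifths of the total loss.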

Binary Cross-Entropy (Classification)

For binary classification, where $\hat{y}_i \in (0, 1)$ is a predicted probability:

$$\mathcal{L}_{\text{BCE}} = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i) \right]$$

This is derived from maximum likelihood. When $y_i = 1$, only the $\log \hat{y}_i$ term matters, so minimizing the loss pushes $\hat{y}_i \to 1$. When $y_i = 0$, only $\log(1 - \hat{y}_i)$ matters, pushing $\hat{y}_i \to 0$.

In practice, clipping predictions into $[\varepsilon, 1 - \varepsilon]$ with $\varepsilon = 10^{-15}$ prevents $\log(0) = -\infty$ when a prediction saturates to exactly 0 or 1.
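A possible NumPy sketch, with the clipping step called out (again assuming NumPy is available):

```python
import numpy as np

def binary_cross_entropy(predictions, targets, eps=1e-15):
    # Clip into [eps, 1 - eps] so log never receives exactly 0 or 1.
    p = np.clip(np.asarray(predictions, dtype=float), eps, 1 - eps)
    y = np.asarray(targets, dtype=float)
    # Negative mean log-likelihood of the true labels.
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Confident, correct predictions give a small loss:
print(binary_cross_entropy([0.9, 0.1], [1, 0]))  # -log(0.9) ≈ 0.105
# Without clipping, this would be -inf; with it, the loss stays finite:
print(binary_cross_entropy([1.0, 0.0], [0, 1]))
```

Without the clip, a single fully confident wrong prediction would make the loss infinite and break gradient-based training.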

Your Task

Implement:

  • mse(predictions, targets) — mean squared error
  • binary_cross_entropy(predictions, targets) — binary cross-entropy with $\varepsilon = 10^{-15}$ clipping