Lesson 18 of 18

Gradient Descent

Gradient descent is the most widely used optimization algorithm in machine learning. It minimizes a function by repeatedly stepping in the direction of the negative gradient (steepest descent):

x_{n+1} = x_n - \alpha \cdot f'(x_n)

where \alpha is the learning rate (step size).

Intuition

Imagine standing on a hilly landscape, trying to reach the lowest point. At each step, you look at the slope (gradient) under your feet and take a step downhill. With a small enough step size, you converge to a local minimum.

Convergence

For the convex function f(x) = (x - x^*)^2, the update rule gives:

x_{n+1} = x_n - \alpha \cdot 2(x_n - x^*) = (1 - 2\alpha)(x_n - x^*) + x^*

The error x_n - x^* shrinks by a factor of |1 - 2\alpha| at each step. For \alpha = 0.1, the factor is 0.8, so the error roughly halves every 3 steps (0.8^3 \approx 0.51).

Example

Minimize f(x) = (x - 3)^2, whose derivative is f'(x) = 2(x - 3):

#include <stdio.h>

/* Derivative of f(x) = (x - 3)^2 */
double df(double x) { return 2.0 * (x - 3.0); }

int main(void) {
    double x = 0.0;                  /* starting point */
    for (int i = 0; i < 100; i++) {
        x = x - 0.1 * df(x);         /* step against the gradient */
    }
    printf("%.4f\n", x);             /* prints 3.0000 */
    return 0;
}

Choosing the Learning Rate

  • Too large: diverges (overshoots the minimum)
  • Too small: converges very slowly
  • Just right: converges reliably

Your Task

Implement double grad_descent(double (*df)(double), double start, double lr, int steps) that runs steps iterations of gradient descent on the function with derivative df, starting from start with learning rate lr, and returns the final value of x.
