Least Squares

Least Squares: Fitting a Line

When you have data points and want to fit a line $y = mx + c$, the least squares method finds the slope $m$ and intercept $c$ that minimize the sum of squared residuals.
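To make "sum of squared residuals" concrete, here is a minimal sketch that scores a candidate line against some points. The helper name sum_squared_residuals is an assumption for illustration, not part of the lesson's API, and the sample points are chosen to lie exactly on y = 2x + 1.

```python
def sum_squared_residuals(x_vals, y_vals, m, c):
    """Sum of squared vertical gaps between each point and the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(x_vals, y_vals))

x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]  # illustrative points lying exactly on y = 2x + 1

print(sum_squared_residuals(x, y, 2, 1))  # 0  (the line passes through every point)
print(sum_squared_residuals(x, y, 2, 0))  # 5  (each of the 5 points is off by 1)
```

Least squares picks the (m, c) pair for which this score is smallest.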

Normal Equations

The slope and intercept can be computed directly using the normal equations:

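def fit_line(x_vals, y_vals):
    n = len(x_vals)
    # Sums that appear in the normal equations
    sx = sum(x_vals)                                 # sum of x
    sy = sum(y_vals)                                 # sum of y
    sxy = sum(x*y for x, y in zip(x_vals, y_vals))   # sum of x*y
    sxx = sum(x**2 for x in x_vals)                  # sum of x^2

    # The denominator is zero when all x values are identical (vertical line)
    slope = (n*sxy - sx*sy) / (n*sxx - sx**2)
    # The best-fit line passes through the mean point (sx/n, sy/n)
    intercept = (sy - slope*sx) / n
    return [round(slope, 1), round(intercept, 1)]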

x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]  # y = 2x + 1

print(fit_line(x, y))  # [2.0, 1.0]
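The slope and intercept expressions in fit_line can be traced back to a short derivation. The sketch below uses S(m, c) as an assumed name for the sum of squared residuals, and all sums run over the n data points.

```latex
S(m, c) = \sum_{i=1}^{n} (y_i - m x_i - c)^2

\frac{\partial S}{\partial m} = -2 \sum_{i} x_i (y_i - m x_i - c) = 0
\qquad
\frac{\partial S}{\partial c} = -2 \sum_{i} (y_i - m x_i - c) = 0

% Rearranging gives the two normal equations:
m \sum_{i} x_i^2 + c \sum_{i} x_i = \sum_{i} x_i y_i
\qquad
m \sum_{i} x_i + n c = \sum_{i} y_i

% Solving the pair for m and c:
m = \frac{n \sum_{i} x_i y_i - \sum_{i} x_i \sum_{i} y_i}{n \sum_{i} x_i^2 - \left(\sum_{i} x_i\right)^2}
\qquad
c = \frac{\sum_{i} y_i - m \sum_{i} x_i}{n}
```

These match the code term by term: sxy, sx, sy, and sxx are the four sums, and the intercept line is the second normal equation solved for c.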

When Data Is Noisy

Even with noisy data, least squares finds the best-fit line: the one that minimizes the sum of the squared vertical distances from the points to the line.
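As a rough illustration, the sketch below fits points that scatter around y = 2x + 1. The y values are made up for this example, and fit_line_unrounded and sum_squared_residuals are hypothetical helpers (the first is the same formula as fit_line, without rounding).

```python
def fit_line_unrounded(x_vals, y_vals):
    """Closed-form least squares fit; returns (slope, intercept) without rounding."""
    n = len(x_vals)
    sx = sum(x_vals)
    sy = sum(y_vals)
    sxy = sum(x * y for x, y in zip(x_vals, y_vals))
    sxx = sum(x ** 2 for x in x_vals)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = (sy - slope * sx) / n
    return slope, intercept

def sum_squared_residuals(x_vals, y_vals, m, c):
    """Sum of squared vertical gaps between the points and the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(x_vals, y_vals))

x = [0, 1, 2, 3, 4]
y = [1.1, 2.9, 5.2, 6.8, 9.1]  # made-up noisy samples near y = 2x + 1

slope, intercept = fit_line_unrounded(x, y)
print(round(slope, 2), round(intercept, 2))  # 1.99 1.04

# The fitted line scores no worse than the line that generated the data.
fitted_error = sum_squared_residuals(x, y, slope, intercept)
reference_error = sum_squared_residuals(x, y, 2, 1)
print(fitted_error < reference_error)  # True
```

The fitted line does not pass through any of the points exactly, yet no other line gives a smaller sum of squared residuals for this data.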

The Normal Equations (Matrix Form)

For overdetermined systems with matrix $\mathbf{A}$ and vector $\mathbf{b}$, least squares minimizes $\lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert^2$. The solution satisfies:

$$\mathbf{A}^T \mathbf{A} \mathbf{x} = \mathbf{A}^T \mathbf{b}$$
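For line fitting, the matrix form reduces to a 2x2 system. The sketch below assumes each row of A is [x_i, 1], b holds the y values, and the unknown vector is [slope, intercept]. The function name solve_normal_equations is hypothetical, and Cramer's rule is used here only because the system is 2x2; a linear algebra library would be the usual choice for larger problems.

```python
def solve_normal_equations(x_vals, y_vals):
    """Fit y = m*x + c by forming and solving A^T A [m, c] = A^T b in pure Python."""
    # Each row of A is [x_i, 1]; b is the vector of observed y values.
    A = [[x, 1.0] for x in x_vals]
    b = list(y_vals)

    # A^T A is a 2x2 matrix and A^T b is a length-2 vector.
    ata = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
    atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(2)]

    # Solve the 2x2 system with Cramer's rule.
    det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
    if det == 0:
        raise ValueError("A^T A is singular (for example, all x values are identical)")
    slope = (atb[0] * ata[1][1] - ata[0][1] * atb[1]) / det
    intercept = (ata[0][0] * atb[1] - ata[1][0] * atb[0]) / det
    return slope, intercept

print(solve_normal_equations([0, 1, 2, 3, 4], [1, 3, 5, 7, 9]))  # (2.0, 1.0)
```

Written out, the entries of A^T A are the sums of x^2, x, and 1, and the entries of A^T b are the sums of x*y and y, so this is the same computation as the scalar formulas in fit_line.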

Your Task

Implement fit_line(x_vals, y_vals) so that it returns [slope, intercept], with each value rounded to 1 decimal place.
