Lesson 15 of 18

Bootstrap Resampling

Resampling to Estimate Uncertainty

Bootstrap resampling estimates the uncertainty of a statistic directly from the data, without deriving an analytic formula for its standard error.

The idea: repeatedly draw samples of the same size as your data, with replacement, compute your statistic on each one, and look at the distribution of the results.

import random

data = [1, 2, 3, 4, 5]
rng = random.Random(42)

# Draw 1000 bootstrap samples and compute means
boot_means = [
    sum(rng.choices(data, k=len(data))) / len(data)
    for _ in range(1000)
]
boot_means.sort()

# 95% bootstrap confidence interval (percentile method)
def pct(p):
    i = (p / 100) * (len(boot_means) - 1)
    lo = int(i)
    hi = min(lo + 1, len(boot_means) - 1)
    return boot_means[lo] + (i - lo) * (boot_means[hi] - boot_means[lo])

print(round(pct(2.5), 1), round(pct(97.5), 1))

Why Bootstrap?

  • Works for any statistic (median, σ, correlation, custom metrics)
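  • No parametric assumptions about the underlying distribution (the sample still has to be representative of it)
  • Especially useful for unusual statistics with no standard-error formula; with very small samples, treat the interval as approximate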
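
To illustrate the first point, here is a minimal sketch that reuses one resampling loop for two different statistics. The helper name bootstrap_stat, its default arguments, and the sample data are assumptions made for this example, not part of the lesson's required API.

```python
import random
import statistics

def bootstrap_stat(data, stat, n_samples=1000, seed=0):
    """Return sorted bootstrap replicates of stat(data) (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    # Each replicate: resample n values with replacement, then apply the statistic.
    return sorted(stat(rng.choices(data, k=n)) for _ in range(n_samples))

data = [2, 3, 5, 7, 11, 13, 17, 19]  # made-up sample

# The same loop works for the median and for the (population) standard deviation.
medians = bootstrap_stat(data, statistics.median)
spreads = bootstrap_stat(data, statistics.pstdev)

print(medians[0], medians[-1])
print(round(spreads[0], 2), round(spreads[-1], 2))
```

Only the stat argument changes between the two calls; the resampling logic stays the same.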

With Replacement

Sampling with replacement means every draw comes from the full original dataset, so within one resample some values appear several times and others not at all. This mimics the randomness of collecting a new sample from the same population.
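
A small sketch of the difference, assuming a toy dataset of five labels: random.choices draws with replacement, while random.sample draws without replacement and only returns a permutation when k equals the dataset size.

```python
import random
from collections import Counter

rng = random.Random(0)
data = ["a", "b", "c", "d", "e"]  # toy data for illustration

resample = rng.choices(data, k=len(data))  # with replacement: repeats allowed
shuffled = rng.sample(data, k=len(data))   # without replacement: a permutation

counts = Counter(resample)                       # how often each value was drawn
missing = [x for x in data if x not in counts]   # values never drawn

print(resample, dict(counts), missing)
print(shuffled)
```

A resample without replacement would always reproduce the original data in a different order, so every statistic that ignores order would come out identical each time. Replacement is what makes the bootstrap replicates vary.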

Percentile Method

The percentile method extracts the CI directly from the bootstrap distribution:

  • Lower bound: 2.5th percentile of bootstrap means
  • Upper bound: 97.5th percentile of bootstrap means

The 95% bootstrap CI is $[\hat{\theta}_{2.5\%},\, \hat{\theta}_{97.5\%}]$, where $\hat{\theta}$ is the statistic computed on each bootstrap sample.
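
To make the percentile lookup easy to check by hand, the sketch below applies linear interpolation to a tiny sorted list. The function name percentile and the five stand-in values are assumptions for illustration; a real run would pass the sorted bootstrap statistics instead.

```python
def percentile(sorted_values, p):
    """Linearly interpolated p-th percentile (0 <= p <= 100) of a sorted list."""
    i = (p / 100) * (len(sorted_values) - 1)   # fractional index into the list
    lo = int(i)                                # neighbour at or below i
    hi = min(lo + 1, len(sorted_values) - 1)   # neighbour above, clamped to the end
    return sorted_values[lo] + (i - lo) * (sorted_values[hi] - sorted_values[lo])

replicates = [10, 20, 30, 40, 50]  # stand-in for sorted bootstrap statistics

lower = percentile(replicates, 2.5)    # fractional index 0.1, between 10 and 20
upper = percentile(replicates, 97.5)   # fractional index 3.9, between 40 and 50
print(lower, upper)
```

With 1000 replicates the fractional indices are 24.975 and 974.025, so each bound falls between two neighbouring sorted values and interpolation has only a small effect.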

Your Task

Implement bootstrap_ci(data, n_samples, seed) that returns a tuple (lower, upper) representing the 95% bootstrap confidence interval of the mean, rounded to 2 decimal places.
