Confidence Intervals
Estimating the True Mean
A confidence interval (CI) gives a range of plausible values for the population mean, based on a sample.
$$\bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}}$$
import math, statistics
data = [1, 2, 3, 4, 5]
n = len(data)
mean = statistics.fmean(data)
sem = statistics.stdev(data) / math.sqrt(n)
# 95% CI: mean +/- t_critical * sem
# t_critical for df=4, 95% ~= 2.7764
t_crit = 2.7764
print(round(mean - t_crit * sem, 2)) # 1.04
print(round(mean + t_crit * sem, 2)) # 4.96
Interpretation
A 95% CI means: if you repeated this sampling process 100 times, you would expect about 95 of the resulting CIs to contain the true population mean $\mu$.
Common misconception: It does NOT mean "there's a 95% chance the true mean is in this interval." The true mean is fixed; the interval is random.
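The repeated-sampling interpretation can be checked with a small simulation. This is a minimal sketch, not part of the original lesson: the function name `coverage`, the normal population with mean 10 and standard deviation 2, and the 2000 trials are all assumptions chosen for illustration. It reuses the df=4 critical value from the example above.

```python
import math
import random
import statistics


def coverage(true_mean=10.0, true_sd=2.0, n=5, t_crit=2.7764, trials=2000, seed=0):
    """Return the fraction of simulated 95% CIs that contain true_mean.

    All defaults are illustrative assumptions; t_crit must match df = n - 1.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample from the (known) population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        sem = statistics.stdev(sample) / math.sqrt(n)
        lower = mean - t_crit * sem
        upper = mean + t_crit * sem
        # The true mean is fixed; only the interval changes between trials.
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


print(coverage())  # close to 0.95; the exact value depends on the seed
```

Each individual interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds across repetitions.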
Width of the CI
The CI gets narrower (more precise) when:
- Sample size increases
- Data variability decreases
- Confidence level decreases (e.g., 90% CI is narrower than 95%)
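The three effects above can be seen by varying one input at a time in the width formula, 2 * t* * s / sqrt(n). The sketch below is illustrative only: `ci_width` and `T_TABLE` are hypothetical names, the sample standard deviations are made-up inputs, and the critical values are copied from a standard two-sided t-table rather than computed.

```python
import math

# Two-sided t critical values from a standard t-table, keyed by (df, confidence).
# Hardcoded for this sketch; only these combinations are supported.
T_TABLE = {
    (4, 0.90): 2.1318,
    (4, 0.95): 2.7764,
    (9, 0.95): 2.2622,
    (29, 0.95): 2.0452,
}


def ci_width(s, n, confidence):
    """Full width of the CI: 2 * t* * s / sqrt(n), with df = n - 1."""
    t_crit = T_TABLE[(n - 1, confidence)]
    return 2 * t_crit * s / math.sqrt(n)


# Baseline: s = 2.0, n = 5, 95% confidence.
print(round(ci_width(2.0, 5, 0.95), 2))   # 4.97

# Larger sample size, same s and confidence.
print(round(ci_width(2.0, 10, 0.95), 2))  # 2.86
print(round(ci_width(2.0, 30, 0.95), 2))  # 1.49

# Lower variability, same n and confidence.
print(round(ci_width(1.0, 5, 0.95), 2))   # 2.48

# Lower confidence level, same s and n.
print(round(ci_width(2.0, 5, 0.90), 2))   # 3.81
```

Note that a larger n helps twice: it shrinks s / sqrt(n) and it lowers t* through the larger degrees of freedom.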
t vs z
We use the t-distribution (not the normal) because we estimate the population standard deviation from the sample, using $s$ in place of $\sigma$. As $n \to \infty$, the t-distribution approaches the normal distribution.
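To make the convergence concrete, the sketch below compares 95% t critical values against the normal critical value. The z value comes from the standard library's `statistics.NormalDist`; the t values are hardcoded from a standard t-table (an assumption of this sketch, since the standard library has no t-distribution), and the names `T_CRIT_95` and `gaps` are illustrative.

```python
from statistics import NormalDist

# Two-sided 95% critical value for the standard normal, about 1.96.
z_crit = NormalDist().inv_cdf(0.975)

# Two-sided 95% t critical values by degrees of freedom (from a t-table).
T_CRIT_95 = {4: 2.7764, 9: 2.2622, 29: 2.0452, 99: 1.9842, 999: 1.9623}

# Gap between t* and z* for each df; it shrinks as df grows.
gaps = [t_crit - z_crit for t_crit in T_CRIT_95.values()]

for (df, t_crit), gap in zip(T_CRIT_95.items(), gaps):
    print(f"df={df:>3}  t*={t_crit:.4f}  t* - z*={gap:.4f}")
```

For small samples the gap is large (2.7764 versus about 1.96 at df=4), so substituting z for t would produce intervals that are too narrow.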
Your Task
Implement confidence_interval(data, confidence) that prints the lower bound and upper bound of the confidence interval, each rounded to 2 decimal places.