One-Way ANOVA

Analysis of Variance (ANOVA) tests whether the means of three or more groups are equal. It works by comparing the variance between groups to the variance within groups.

Hypotheses:

  • $H_0$: all group means are equal ($\mu_1 = \mu_2 = \cdots = \mu_k$)
  • $H_1$: at least one group mean differs

The F-Statistic

$$F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{SS_{\text{between}} / (k-1)}{SS_{\text{within}} / (N-k)}$$

where:

  • $k$ = number of groups, $N$ = total number of observations
  • $SS_{\text{between}} = \sum_i n_i (\bar{x}_i - \bar{x})^2$: variation due to group differences, where $n_i$ and $\bar{x}_i$ are the size and mean of group $i$ and $\bar{x}$ is the grand mean
  • $SS_{\text{within}} = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2$: variation within groups, where $x_{ij}$ is observation $j$ of group $i$

A large $F$ means the groups differ more than random variation would explain.
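
To make that interpretation concrete, here is a minimal sketch that computes $F$ for the same nine values grouped in two different ways. The helper name `f_statistic` and both groupings are illustrative assumptions, not part of the lesson; the arithmetic follows the formulas above.

```python
def f_statistic(groups):
    """Hypothetical helper: one-way ANOVA F for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(x for g in groups for x in g) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each observation around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# The same nine values, grouped two ways (illustrative data).
separated = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]    # group means 2, 5, 8
interleaved = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]  # group means 4, 5, 6

f_separated = f_statistic(separated)
f_interleaved = f_statistic(interleaved)
print(f_separated, f_interleaved)
```

With well-separated groups the between-group variation dominates and $F$ comes out at 27; when the same values are interleaved, most of the variation sits inside the groups and $F$ drops to about 0.33.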

Example

Groups: $[1,2,3]$, $[4,5,6]$, $[7,8,9]$

Source    SS    df    MS    F
Between   54    2     27    27.0
Within    6     6     1

$F = 27.0$
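
For readers who want to check the table by hand, the values can be reproduced step by step from the definitions (a worked sketch using the three groups above, with $k = 3$ and $N = 9$):

```latex
\begin{aligned}
\bar{x}_1 &= 2, \quad \bar{x}_2 = 5, \quad \bar{x}_3 = 8, \quad \bar{x} = 5 \\
SS_{\text{between}} &= 3(2-5)^2 + 3(5-5)^2 + 3(8-5)^2 = 27 + 0 + 27 = 54 \\
SS_{\text{within}} &= 3 \times \left[(-1)^2 + 0^2 + 1^2\right] = 6 \\
MS_{\text{between}} &= \frac{54}{3-1} = 27, \qquad MS_{\text{within}} = \frac{6}{9-3} = 1 \\
F &= \frac{27}{1} = 27.0
\end{aligned}
```

Each group has the same within-group deviations ($-1, 0, 1$), which is why $SS_{\text{within}}$ is three times a single group's sum of squares.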

Implementation

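def anova_f(groups):
    """Return the one-way ANOVA F-statistic, rounded to 4 decimal places."""
    n_total = sum(len(g) for g in groups)  # N: total observations
    k = len(groups)                        # k: number of groups
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g)/len(g) for g in groups]
    # Between-group sum of squares: group size times the squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g)*(m - grand_mean)**2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared distance of each observation
    # from its own group mean.
    ss_within = sum((x - m)**2
                    for g, m in zip(groups, group_means)
                    for x in g)
    # Mean squares: each sum of squares divided by its degrees of freedom.
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return round(ms_between / ms_within, 4)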
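
When debugging an implementation, it can help to inspect the intermediate quantities rather than only the final ratio. The following is a sketch of one possible way to do that; `anova_table` and its dictionary layout are hypothetical names chosen for illustration, and the function returns unrounded values.

```python
def anova_table(groups):
    """Hypothetical helper: return the one-way ANOVA table as a dict of rows."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(x for g in groups for x in g) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Degrees of freedom for each source of variation.
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "between": {"ss": ss_between, "df": df_between,
                    "ms": ms_between, "f": ms_between / ms_within},
        "within": {"ss": ss_within, "df": df_within, "ms": ms_within},
    }


# Rebuild the table for the groups used in the Example section.
table = anova_table([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
for source, row in table.items():
    print(source, row)
```

For the example groups, the printed rows should match the Example table: SS of 54 and 6, df of 2 and 6, MS of 27 and 1, and an F of 27.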

Your Task

Implement anova_f(groups), which computes the F-statistic from a list of groups (each a list of numbers) and returns it rounded to four decimal places, as the implementation above does.
