
Chi-Square Test

Testing Categorical Data

The chi-square goodness-of-fit test checks whether observed frequencies match expected frequencies for a categorical variable.

import math

# Roll a die 100 times. Is it fair?
observed = [20, 15, 18, 22, 17, 8]  # 6 categories
n = len(observed)
total = sum(observed)
expected = total / n  # uniform: each category equally likely

chi2 = sum((o - expected)**2 / expected for o in observed)
print(round(chi2, 4))   # chi-square statistic

How It Works

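For each category, take the squared difference between the observed and expected counts, divide by the expected count, and sum the results over all categories: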

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed count and $E_i$ is the expected count. If observed matches expected perfectly, $\chi^2 = 0$.
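To see where the statistic comes from, it can help to look at each category's contribution separately. The sketch below reuses the die-roll counts from the opening example; the variable name `contributions` is introduced here for illustration only.

```python
# Die-roll counts from the example at the top of the lesson
observed = [20, 15, 18, 22, 17, 8]
expected = sum(observed) / len(observed)  # uniform expectation: 100 / 6

# Contribution of each category: (O_i - E_i)^2 / E_i
contributions = [(o - expected) ** 2 / expected for o in observed]

for face, (o, c) in enumerate(zip(observed, contributions), start=1):
    print(f"face {face}: observed={o:2d}  contribution={c:.4f}")

# The chi-square statistic is the sum of the per-category contributions
chi2 = sum(contributions)
print("chi-square:", round(chi2, 4))
```

The last face, with only 8 rolls, is furthest from the expected count and therefore contributes the most to the total.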

Degrees of Freedom

For $k$ categories: $df = k - 1$. More categories require a larger $\chi^2$ value for significance.
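One way to see the effect of degrees of freedom is to compute the p-value of the same statistic under different values of $df$. The sketch below is one possible approach using only the standard library, not necessarily the method this lesson intends: it assumes an integer $df \ge 1$ and builds the upper-tail probability from the recurrence for the regularized upper incomplete gamma function. The helper name `chi2_sf` is hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Hypothetical helper; assumes df is an integer >= 1.
    """
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case)
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Same statistic, different degrees of freedom
stat = 7.16
for df in (2, 5, 10):
    print(f"df={df:2d}  p-value={chi2_sf(stat, df):.4f}")
```

For a fixed statistic the p-value grows as $df$ increases, so the same $\chi^2$ value is weaker evidence against the null hypothesis when there are more categories.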

Custom Expected Frequencies

You can also test against non-uniform expectations by providing your own expected counts.
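As a minimal sketch of the non-uniform case, the expected counts can be derived from hypothesized proportions scaled to the sample size. The data, the proportions, and the function name `chi_square_stat` below are made up for illustration and do not come from the lesson.

```python
def chi_square_stat(observed, expected):
    """Chi-square statistic for paired observed and expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 observations across three categories
observed = [45, 35, 20]
# Hypothesized (non-uniform) proportions; they should sum to 1
proportions = [0.5, 0.3, 0.2]

total = sum(observed)
expected = [p * total for p in proportions]  # scale proportions to counts

chi2 = chi_square_stat(observed, expected)
print(round(chi2, 4))
```

Scaling the proportions by the observed total keeps the expected counts on the same scale as the observed counts, which the statistic requires. The degrees of freedom are still $k - 1$, here 2.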

Your Task

Implement chi_square_test(observed) that tests whether the observed frequencies are uniformly distributed (equal expected frequency for each category). Print the $\chi^2$ statistic (rounded to 4 decimal places) and the $p$-value (rounded to 4 decimal places).
