Percentiles & IQR

Percentiles

The p-th percentile is the value below which $p\%$ of the data falls.

def percentile(data, p):
    s = sorted(data)
    n = len(s)
    i = (p / 100) * (n - 1)
    lo = int(i)
    hi = min(lo + 1, n - 1)
    return s[lo] + (i - lo) * (s[hi] - s[lo])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

q1 = percentile(data, 25)   # 25th percentile (Q1)
q3 = percentile(data, 75)   # 75th percentile (Q3)
iqr = q3 - q1               # Interquartile range

print(round(q1, 2))    # 3.25
print(round(q3, 2))    # 7.75
print(round(iqr, 2))   # 4.5

Interquartile Range (IQR)

The IQR is the range of the middle 50% of the data:

$$\text{IQR} = Q_3 - Q_1$$

It is resistant to outliers, making it a robust measure of spread.
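
To make the robustness claim concrete, here is a minimal sketch that compares the full range (max minus min) with the IQR before and after one extreme value is introduced. The `iqr` helper and the `with_outlier` list are illustrative assumptions, not part of the lesson; `percentile` is copied from the lesson so the block runs on its own.

```python
def percentile(data, p):
    # Same linear-interpolation rule as the lesson's percentile().
    s = sorted(data)
    n = len(s)
    i = (p / 100) * (n - 1)
    lo = int(i)
    hi = min(lo + 1, n - 1)
    return s[lo] + (i - lo) * (s[hi] - s[lo])

def iqr(data):
    # Hypothetical helper: Q3 minus Q1.
    return percentile(data, 75) - percentile(data, 25)

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]  # made-up data: 10 replaced by 1000

range_clean = max(clean) - min(clean)
range_outlier = max(with_outlier) - min(with_outlier)

print(range_clean, range_outlier)                          # 9 999
print(round(iqr(clean), 2), round(iqr(with_outlier), 2))   # 4.5 4.5
```

The range jumps from 9 to 999, while the IQR stays at 4.5 because the extreme value lies outside the middle 50% of the data.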

Outlier Detection

A common rule: a value is an outlier if it falls outside the fences:

$$Q_1 - 1.5 \times \text{IQR} \quad \text{or} \quad Q_3 + 1.5 \times \text{IQR}$$

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
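
As a rough sketch of how the fences might be applied, the snippet below collects every value that falls below the lower fence or above the upper fence. The function name `find_outliers` and the `sample` list are assumptions made for illustration; `percentile` is repeated from the lesson so the block is self-contained.

```python
def percentile(data, p):
    # Same linear-interpolation rule as the lesson's percentile().
    s = sorted(data)
    n = len(s)
    i = (p / 100) * (n - 1)
    lo = int(i)
    hi = min(lo + 1, n - 1)
    return s[lo] + (i - lo) * (s[hi] - s[lo])

def find_outliers(data):
    # Hypothetical helper: apply the 1.5 * IQR rule to every value.
    q1 = percentile(data, 25)
    q3 = percentile(data, 75)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]  # made-up data with one extreme value
print(find_outliers(sample))  # [40]
```

For `sample`, Q1 is 3.25 and Q3 is 7.75, so the fences are -3.5 and 14.5, and only 40 lies outside them.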

Median = 50th Percentile

percentile(data, 50) is equivalent to the median.
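
One way to check this is to compare against `statistics.median` from Python's standard library. This is only a quick sanity check on two small lists (the `odd` list is made up), with `percentile` copied from the lesson so the block runs by itself.

```python
import statistics

def percentile(data, p):
    # Same linear-interpolation rule as the lesson's percentile().
    s = sorted(data)
    n = len(s)
    i = (p / 100) * (n - 1)
    lo = int(i)
    hi = min(lo + 1, n - 1)
    return s[lo] + (i - lo) * (s[hi] - s[lo])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # even length: median averages 5 and 6
odd = [7, 1, 5]                          # odd length: median is the middle value

print(percentile(data, 50) == statistics.median(data))  # True
print(percentile(odd, 50) == statistics.median(odd))    # True
```

With an even number of values the interpolation lands halfway between the two middle values, which matches the usual definition of the median.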

Your Task

Implement quartiles(data) that prints $Q_1$, $Q_3$, and IQR, each rounded to 2 decimal places.
