Lesson 15 of 15

Distortion Measures

Distortion Measures

Rate-distortion theory studies the tradeoff between compression rate and reconstruction quality. Central to it are distortion measures that quantify how different two strings or sequences are.

Hamming Distance

The Hamming distance counts positions where two equal-length strings differ:

dH(x,y)=i=1n1[xiyi]d_H(x, y) = \sum_{i=1}^n \mathbf{1}[x_i \neq y_i]

Used to measure bit errors in digital communications.

Bit Error Rate

The BER normalizes Hamming distance by string length:

BER(x,y)=dH(x,y)n\text{BER}(x, y) = \frac{d_H(x, y)}{n}

Edit Distance (Levenshtein)

Edit distance is the minimum number of single-character insertions, deletions, or substitutions needed to transform string ss into string tt. Unlike Hamming distance, strings can have different lengths.

The classic dynamic programming solution:

dp[i][j]={iif j=0jif i=0dp[i1][j1]if s[i]=t[j]1+min(dp[i1][j], dp[i][j1], dp[i1][j1])otherwisedp[i][j] = \begin{cases} i & \text{if } j = 0 \\ j & \text{if } i = 0 \\ dp[i-1][j-1] & \text{if } s[i] = t[j] \\ 1 + \min(dp[i-1][j],\ dp[i][j-1],\ dp[i-1][j-1]) & \text{otherwise} \end{cases}

def edit_distance(s, t):
    m, n = len(s), len(t)
    dp = [[0] * (n+1) for _ in range(m+1)]
    for i in range(m+1): dp[i][0] = i
    for j in range(n+1): dp[0][j] = j
    for i in range(1, m+1):
        for j in range(1, n+1):
            if s[i-1] == t[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = 1 + min(dp[i-1][j], dp[i][j-1], dp[i-1][j-1])
    return dp[m][n]

print(edit_distance('kitten', 'sitting'))  # 3

Your Task

Implement:

  • hamming_distance(x, y) — count differing positions (same-length strings)
  • bit_error_rate(x, y) — Hamming distance divided by length
  • edit_distance(s, t) — Levenshtein distance via dynamic programming
Python runtime loading...
Loading...
Click "Run" to execute your code.