Lesson 10 of 15
Shannon Entropy
Information theory, founded by Claude Shannon in 1948, gives us a rigorous mathematical framework for quantifying uncertainty, randomness, and the capacity to communicate.
Shannon Entropy
Given a discrete probability distribution $p = (p_1, \dots, p_n)$, the Shannon entropy is:

$$H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i$$
- A fair coin ($p = (1/2, 1/2)$) has $H = 1$ bit
- A fair die ($p_i = 1/6$ for $i = 1, \dots, 6$) has $H = \log_2 6 \approx 2.585$ bits
- A certain outcome ($p_i = 1$ for some $i$) has $H = 0$ bits
- The maximum entropy for $n$ outcomes is $\log_2 n$ (achieved by the uniform distribution)
By convention, $0 \log_2 0 = 0$ (zero-probability outcomes are skipped).
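As a sketch of how the definition translates to code (one possible implementation, not necessarily the one you will write for the task below), the zero-probability convention becomes a simple filter:

```python
import math

def shannon_entropy(probs):
    """Entropy in bits; zero-probability entries contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))  # fair coin: 1.0 bit
print(shannon_entropy([1/6] * 6))   # fair die: ~2.585 bits
print(shannon_entropy([1.0]))       # certain outcome: 0.0 bits
```

Filtering with `if p > 0` implements $0 \log_2 0 = 0$ without special-casing.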
Joint Entropy and Mutual Information
For a joint distribution $p(x, y)$, the joint entropy is:

$$H(X, Y) = -\sum_{x}\sum_{y} p(x, y) \log_2 p(x, y)$$
The marginal distributions are obtained by summing over the other variable:

$$p(x) = \sum_y p(x, y), \qquad p(y) = \sum_x p(x, y)$$
Mutual information measures how much knowing $Y$ reduces uncertainty about $X$:

$$I(X; Y) = H(X) + H(Y) - H(X, Y)$$
If $X$ and $Y$ are independent, $H(X, Y) = H(X) + H(Y)$ and therefore $I(X; Y) = 0$.
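The identity above can be sketched directly in Python. The helper `entropy` and the marginal computations are illustrative choices, not required signatures:

```python
import math

def entropy(probs):
    # H in bits; zero entries skipped per the 0 log 0 = 0 convention
    return sum(-p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) from a 2D list of joint probabilities."""
    h_xy = entropy([p for row in joint for p in row])  # flatten for H(X,Y)
    p_x = [sum(row) for row in joint]                  # marginal: sum over y
    p_y = [sum(col) for col in zip(*joint)]            # marginal: sum over x
    return entropy(p_x) + entropy(p_y) - h_xy

# Independent uniform bits: the joint factors, so I(X;Y) = 0
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0
# Perfectly correlated bits: knowing X determines Y, so I(X;Y) = H(X) = 1 bit
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # 1.0
```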
KL Divergence
The Kullback-Leibler divergence (relative entropy) measures how different a distribution $P$ is from a reference $Q$:

$$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i p_i \log_2 \frac{p_i}{q_i}$$
Properties:
- $D_{\mathrm{KL}}(P \,\|\, Q) \ge 0$ always (equality iff $P = Q$)
- Not symmetric: $D_{\mathrm{KL}}(P \,\|\, Q) \ne D_{\mathrm{KL}}(Q \,\|\, P)$ in general
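Both properties are easy to check numerically. A minimal sketch (the example distributions are arbitrary choices):

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # ~0.737 bits (nonnegative)
print(kl_divergence(q, p))  # ~0.531 bits (a different value: not symmetric)
print(kl_divergence(p, p))  # 0.0 (identical distributions)
```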
Your Task
Implement four functions:
- `shannon_entropy(probs)` — compute $H(p)$ in bits; skip zero entries
- `joint_entropy(joint_probs)` — compute $H(X, Y)$ from a 2D list of joint probabilities
- `mutual_information(joint_probs)` — compute $I(X; Y)$ from a 2D list of joint probabilities
- `kl_divergence(p, q)` — compute $D_{\mathrm{KL}}(P \,\|\, Q)$ in bits