HyperLogLog — Approximate Counting
HyperLogLog (HLL) is a probabilistic data structure that estimates the cardinality (unique count) of a set using a fixed amount of memory — typically 12 KB — regardless of how many unique elements you add.
This makes it ideal for counting unique visitors, unique search queries, unique events, or any other "how many distinct things?" problem at scale.
The Trade-Off
HyperLogLog is approximate: it estimates cardinality with a standard error of about 0.81%. For most analytics use cases, this is perfectly acceptable.
Exact counting with a Set would require memory proportional to the number of unique items. HyperLogLog uses a fixed 12 KB no matter how many items you add.
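The 12 KB and 0.81% figures can be checked with a little arithmetic. The sketch below assumes the layout Redis documents for its dense encoding (16384 registers of 6 bits each) and the usual HyperLogLog error approximation of 1.04 / sqrt(m). Neither detail appears in the text above, so read it as an illustration, not a description of the emulator.

```python
import math

# Assumed parameters: Redis's dense HyperLogLog encoding.
num_registers = 2 ** 14        # m = 16384 registers
bits_per_register = 6          # each register stores a small rank value

memory_bytes = num_registers * bits_per_register // 8
std_error = 1.04 / math.sqrt(num_registers)   # common approximation of the error

print(memory_bytes)            # 12288 bytes, i.e. 12 KB
print(f"{std_error:.4f}")      # 0.0081, i.e. roughly 0.81%
```

Because the error depends only on the number of registers, the memory cost stays the same whether the set holds a thousand elements or a billion.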
Note: In this emulator, HyperLogLog uses an exact set internally, so counts are always precise. In real Redis, results are approximate.
PFADD — Add Elements
PFADD visitors "user1" "user2" "user3"
PFADD visitors "user1" "user2" -- duplicates ignored in counting
PFADD returns 1 if the HLL's internal state was modified (normally because a new unique element was added) and 0 if nothing changed.
PFCOUNT — Estimate Cardinality
PFADD visitors "user1" "user2" "user3" "user1"
PFCOUNT visitors -- (integer) 3 (only 3 unique)
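To give a feel for what a real HyperLogLog does behind PFADD and PFCOUNT, here is a minimal Python sketch. The class name `TinyHLL`, the choice of SHA-1 as the hash, and the simplified small-set correction are all assumptions made for illustration. Redis uses its own hash function, encodings, and bias corrections, so this is not its implementation.

```python
import hashlib
import math


class TinyHLL:
    """Illustrative HyperLogLog (hypothetical class, not Redis's encoding)."""

    def __init__(self, p: int = 14):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (16384 for p=14)
        self.registers = [0] * self.m

    def add(self, item: str) -> int:
        """Like PFADD for one element: 1 if a register changed, else 0."""
        # 64-bit hash taken from SHA-1 (an arbitrary choice for this sketch).
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank
            return 1
        return 0

    def count(self) -> int:
        """Like PFCOUNT: estimate the number of distinct items added."""
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias constant, m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small sets: linear counting on empty registers is more accurate.
            return round(m * math.log(m / zeros))
        return round(raw)


visitors = TinyHLL()
for user in ["user1", "user2", "user3"]:
    visitors.add(user)
print(visitors.add("user1"))   # 0: a repeated element changes no register
print(visitors.count())        # estimate of the distinct count
```

Each register only remembers the largest rank it has seen, which is why duplicates have no effect and why memory never grows.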
PFMERGE — Merge Multiple HLLs
Combine multiple HyperLogLogs into one:
PFADD page:home "u1" "u2" "u3"
PFADD page:about "u2" "u3" "u4" "u5"
PFMERGE site:total page:home page:about
PFCOUNT site:total -- unique visitors across both pages
Output:
(integer) 1
(integer) 1
OK
(integer) 5
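Merging works because a HyperLogLog is just an array of per-register maxima, so two sketches can be combined by taking the larger value in each position. The Python sketch below illustrates that idea under assumed details (SHA-1 hashing, 16384 registers, and the hypothetical helpers `registers` and `merge`). It is not how Redis stores its data.

```python
import hashlib

P = 14  # index bits, giving 2**14 = 16384 registers (assumed for illustration)


def registers(items):
    """Build the register array for a collection of strings."""
    regs = [0] * (1 << P)
    for item in items:
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - P)                    # top P bits pick a register
        rest = h & ((1 << (64 - P)) - 1)       # remaining bits
        rank = (64 - P) - rest.bit_length() + 1
        regs[idx] = max(regs[idx], rank)       # keep the largest rank seen
    return regs


def merge(*sketches):
    """Register-wise maximum, the operation behind PFMERGE."""
    return [max(column) for column in zip(*sketches)]


home = registers(["u1", "u2", "u3"])
about = registers(["u2", "u3", "u4", "u5"])
total = merge(home, about)

# Merging two sketches gives the same registers as sketching the union directly.
print(total == registers(["u1", "u2", "u3", "u4", "u5"]))   # True
```

Since the merged registers match those of the union, visitors who appear on both pages are counted once.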
Real-World Use Cases
Daily unique visitors:
PFADD visitors:2024-01-15 "user1" "user2" "user3"
PFCOUNT visitors:2024-01-15
-- Weekly total (merge 7 days)
PFMERGE visitors:week:3 visitors:2024-01-15 visitors:2024-01-16 ...
PFCOUNT visitors:week:3
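In practice the seven daily key names would be generated, not typed by hand. The helper below is a hypothetical sketch (`weekly_merge_command` is not part of any Redis client) that builds the PFMERGE command text for a week starting on a given Monday. It reuses the `visitors:YYYY-MM-DD` pattern from the example and assumes the week key is numbered by ISO week.

```python
from datetime import date, timedelta


def weekly_merge_command(monday: date) -> str:
    """Return the PFMERGE command text for the 7 days starting at `monday`."""
    days = [monday + timedelta(days=i) for i in range(7)]
    sources = [f"visitors:{d.isoformat()}" for d in days]
    # Assumption: the destination key is numbered by ISO week.
    dest = f"visitors:week:{monday.isocalendar()[1]}"
    return " ".join(["PFMERGE", dest, *sources])


print(weekly_merge_command(date(2024, 1, 15)))
# PFMERGE visitors:week:3 visitors:2024-01-15 visitors:2024-01-16 visitors:2024-01-17 visitors:2024-01-18 visitors:2024-01-19 visitors:2024-01-20 visitors:2024-01-21
```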
Unique search queries:
PFADD searches:today "redis tutorial" "python redis" "redis tutorial"
PFCOUNT searches:today -- 2 unique queries
Unique API callers:
PFADD api:callers:hourly "client-123" "client-456" "client-123"
PFCOUNT api:callers:hourly
Why "PF"?
The "PF" prefix honors Philippe Flajolet, the French mathematician who introduced the HyperLogLog algorithm in 2007 together with Éric Fusy, Olivier Gandouet, and Frédéric Meunier.
Your Task
Track unique visitors across two pages:
- PFADD page:home with visitors: "u1", "u2", "u3", "u4"
- PFADD page:docs with visitors: "u3", "u4", "u5"
- PFMERGE site from both pages
- PFCOUNT site to get total unique visitors