UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting
Summary: UltraLogLog (ULL) keeps HyperLogLog's (HLL's) mergeable, idempotent, and commutative properties and its O(1) inserts, while encoding equivalent distinct-count information in ~28% less space when the count is extracted by maximum likelihood estimation (MLE). With a faster estimator the space saving is still ~24% (~17% with the martingale estimator). ULL uses 8-bit registers, which compress better, and ships as a production Java implementation. (summarized by gpt-5-mini on Feb 09 2026)
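The properties named in the summary (mergeable, idempotent, commutative, constant-time inserts) can be illustrated with a minimal sketch. This is a toy HLL-style register array, not the UltraLogLog register layout or its estimators; the class name `ToySketch`, the parameter `p`, and the choice of SHA-256 as the hash are assumptions made only for illustration.

```python
import hashlib


class ToySketch:
    """Toy HLL-style register sketch (hypothetical; not the ULL encoding)."""

    def __init__(self, p=8):
        self.p = p                        # number of bits used to pick a register
        self.registers = [0] * (1 << p)   # one small integer per register

    def insert(self, item):
        # Take 64 hash bits from the item (hash choice is an assumption).
        h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                # top p bits select the register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        # Single register update: keep the maximum rank seen so far.
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Element-wise maximum; the result does not depend on argument order.
        merged = ToySketch(self.p)
        merged.registers = [max(x, y) for x, y in zip(self.registers, other.registers)]
        return merged


a, b, union = ToySketch(), ToySketch(), ToySketch()
for i in range(1000):
    a.insert(f"user-{i}")
    union.insert(f"user-{i}")
for i in range(500, 1500):
    b.insert(f"user-{i}")
    union.insert(f"user-{i}")

before = list(a.registers)
a.insert("user-1")                                  # re-insert a known item
idempotent = a.registers == before                  # state is unchanged
commutative = a.merge(b).registers == b.merge(a).registers
matches_union = a.merge(b).registers == union.registers
print(idempotent, commutative, matches_union)
```

Because every update is a maximum over register values, inserting duplicates or merging in a different order cannot change the final state. That is the property that makes such sketches usable for distributed distinct counting.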
Authors
- Otmar Ertl
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,012 | A Fast, Mergeable, and LDP Compatible Sketch for Counting the Number of Distinct Values in Fully Dynamic Tables | 2026 | SIGMOD | 4.1945683e-05 |
| 10,498 | PLM4NDV: Minimizing Data Access for Number of Distinct Values Estimation with Pre-trained Language Models | 2025 | SIGMOD | 4.1945683e-05 |
| 10,534 | AdaNDV: Adaptive Number of Distinct Value Estimation via Learning to Select and Fuse Estimators | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,805 | All-Distances Sketches, Revisited: HIP Estimators for Massive Graphs Analysis | 2014 | PODS | 8.0918347e-05 |
| 3,702 | Every Row Counts: Combining Sketches and Sampling for Accurate Group-By Result Estimates | 2019 | CIDR | 6.8295759e-05 |
| 5,361 | Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches | 2018 | VLDB | 5.547935e-05 |
| 6,244 | Approximate Distinct Counts for Billions of Datasets | 2019 | SIGMOD | 5.139669e-05 |
| 6,889 | Better Cardinality Estimators for HyperLogLog, PCSA, and Beyond | 2023 | PODS | 4.893581e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,364 | Query Log Compression for Workload Analytics | 2019 | VLDB | 4.5357797e-05 |
| 6,889 | Better Cardinality Estimators for HyperLogLog, PCSA, and Beyond | 2023 | PODS | 4.893581e-05 |
| 12,531 | Join-Distinct Aggregate Estimation over Update Streams | 2005 | PODS | 4.1945683e-05 |
| 7,430 | Adaptive Log Compression for Massive Log Data | 2013 | SIGMOD | 4.7317713e-05 |
| 4,966 | Relative Error Streaming Quantiles | 2021 | PODS | 5.7959749e-05 |
| 7,515 | Logging Every Footstep: Quantile Summaries for the Entire History | 2010 | SIGMOD | 4.7180617e-05 |
| 126 | Space-Efficient Online Computation of Quantile Summaries | 2001 | SIGMOD | 0.00044744986 |
| 5,200 | SetSketch: Filling the Gap between MinHash and HyperLogLog | 2021 | VLDB | 5.6337581e-05 |
| 5,361 | Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches | 2018 | VLDB | 5.547935e-05 |
| 6,244 | Approximate Distinct Counts for Billions of Datasets | 2019 | SIGMOD | 5.139669e-05 |