Database Paper Browser

Back to papers

An Optimal Algorithm for the Distinct Elements Problem

Summary: First algorithm achieving information-theoretically optimal space O(ε^-2 + log n) bits for (1±ε)-approximate distinct elements in data streams, closing decades of work. Also attains worst-case O(1) update and reporting time and extends to Hamming-norm estimation. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID
1506
Venue
PODS
Year
2010
Pagerank
0.00024820873
Overall Rank
383 | 97.34%
DOI
-

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 27 of 27 citing papers.

Rank Citing Paper Year Venue Pagerank
402 Mergeable Summaries 2012 PODS 0.00024196343
1,040 Graph Sketches: Sparsification, Spanners, and Subgraphs 2012 PODS 0.00014488943
1,094 Tight Bounds for Lp Samplers, Finding Duplicates in Streams, and Related Problems 2011 PODS 0.00014129658
2,805 All-Distances Sketches, Revisited: HIP Estimators for Massive Graphs Analysis 2014 PODS 8.0918347e-05
2,894 Pan-private Algorithms Via Statistics on Sketches 2011 PODS 7.9474698e-05
3,708 Is Min-Wise Hashing Optimal for Summarizing Set Intersection? 2014 PODS 6.8247903e-05
4,403 A Framework for Adversarially Robust Streaming Algorithms 2020 PODS 6.2194225e-05
4,833 MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions 2019 SIGMOD 5.8916346e-05
5,332 Persistent Bloom Filter: Membership Testing for the Entire History 2018 SIGMOD 5.5662513e-05
5,902 The Communication Complexity of Distributed Set-Joins with Applications to Matrix Multiplication 2015 PODS 5.2796864e-05
5,909 At-the-time and Back-in-time Persistent Sketches 2021 SIGMOD 5.2769377e-05
6,244 Approximate Distinct Counts for Billions of Datasets 2019 SIGMOD 5.139669e-05
6,889 Better Cardinality Estimators for HyperLogLog, PCSA, and Beyond 2023 PODS 4.893581e-05
7,145 Tight Trade-offs for the Maximum k-Coverage Problem in the General Streaming Model 2019 PODS 4.8179617e-05
7,358 Weighted Distinct Sampling: Cardinality Estimation for SPJ Queries 2021 SIGMOD 4.7529363e-05
7,496 Subspace Exploration: Bounds on Projected Frequency Estimation 2021 PODS 4.7180617e-05
8,901 On the Feasibility of Forgetting in Data Streams 2024 PODS 4.427232e-05
10,008 Representation Obliviousness and Pseudodeterminism in Streaming Algorithms 2026 PODS 4.1945683e-05
10,012 A Fast, Mergeable, and LDP Compatible Sketch for Counting the Number of Distinct Values in Fully Dynamic Tables 2026 SIGMOD 4.1945683e-05
10,358 Robust Statistical Analysis on Streaming Data with Near-Duplicates in General Metric Spaces 2025 PODS 4.1945683e-05
11,168 Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation 2023 PODS 4.1945683e-05
11,169 Applications of Sketching and Pathways to Impact 2023 PODS 4.1945683e-05
11,329 Estimation of the Size of Union of Delphic Sets: Achieving Independence from Stream Size 2022 PODS 4.1945683e-05
11,433 Model Counting meets F0 Estimation 2021 PODS 4.1945683e-05
11,442 Estimating the Size of Union of Sets in Streaming Models 2021 PODS 4.1945683e-05
11,765 Streaming Algorithms for Measuring H-Impact 2017 PODS 4.1945683e-05
11,833 Streaming Algorithms for Robust Distinct Elements 2016 SIGMOD 4.1945683e-05
Previous Page 1 / 1 Next

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Previous Page 1 / 1 Next

Semantically Similar Papers