Streaming Algorithms for Robust Distinct Elements
Summary: Estimates the number of distinct entities in a data stream under a noisy model in which multiple items (near-duplicates) may map to the same entity. Introduces bucket sampling for Euclidean spaces, extends the approach to metric spaces via locality-sensitive hashing (LSH), and shows that the estimates are resilient to small ambiguity, with practical validation. (summarized by gpt-5-nano on Feb 09 2026)
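To make the idea of counting entities rather than raw items concrete, here is a minimal Python sketch of grid-based bucket sampling in Euclidean space. It is a toy illustration under stated assumptions, not the paper's algorithm: the helper names (`cell_of`, `cell_hash`, `estimate_distinct`) and the parameters `side`, `shift`, and `sample_rate` are hypothetical, and the sketch assumes that all near-duplicates of an entity fall into a single grid cell and that distinct entities fall into different cells.

```python
import hashlib
import math


def cell_of(point, side, shift):
    """Return the integer coordinates of the shifted grid cell containing point."""
    return tuple(math.floor((x + s) / side) for x, s in zip(point, shift))


def cell_hash(cell):
    """Deterministically hash a cell to a value in [0, 1)."""
    digest = hashlib.sha256(repr(cell).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(stream, side, sample_rate, shift):
    """Estimate the number of entities in one pass over the stream.

    Assumption: each occupied grid cell stands for exactly one entity.
    A cell is kept only if its hash falls below sample_rate, so memory
    scales with the number of sampled cells, not the stream length.
    """
    sampled = set()
    for point in stream:
        cell = cell_of(point, side, shift)
        if cell_hash(cell) < sample_rate:
            sampled.add(cell)
    # Scale the sampled count back up by the sampling rate.
    return len(sampled) / sample_rate


# Three entities, two of them observed with a noisy near-duplicate.
stream = [(0.5, 0.5), (0.51, 0.49), (10.5, 10.5), (10.49, 10.52), (20.5, 20.5)]
print(estimate_distinct(stream, side=1.0, sample_rate=1.0, shift=(0.0, 0.0)))  # 3.0
```

With `sample_rate=1.0` every occupied cell is kept, so the five items collapse to three entities. A lower rate trades accuracy for space. The sketch ignores the case where a group of near-duplicates straddles a cell boundary, which a robust method has to handle.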
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,358 | Robust Statistical Analysis on Streaming Data with Near-Duplicates in General Metric Spaces | 2025 | PODS | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1 | Access Path Selection in a Relational Database Management System | 1979 | SIGMOD | 0.0040449103 |
| 322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768 |
| 383 | An Optimal Algorithm for the Distinct Elements Problem | 2010 | PODS | 0.00024820873 |
| 429 | The Aqua Approximate Query Answering System | 1999 | SIGMOD | 0.00023476494 |
| 727 | On Synopses for Distinct-Value Estimation Under Multiset Operations | 2007 | SIGMOD | 0.00017508726 |
| 2,452 | Data Fusion – Resolving Data Conflicts for Integration | 2009 | VLDB | 8.7839322e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,385 | Estimating Statistical Aggregates on Probabilistic Data Streams | 2007 | PODS | 7.1580391e-05 |
| 4,905 | Randomized Error Removal for Online Spread Estimation in Data Streaming | 2021 | VLDB | 5.8398332e-05 |
| 3,838 | Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters | 2006 | SIGMOD | 6.7134945e-05 |
| 5,117 | Sampling Algorithms in a Stream Operator | 2005 | SIGMOD | 5.6825418e-05 |
| 3,041 | Sketching Probabilistic Data Streams | 2007 | SIGMOD | 7.6697078e-05 |
| 4,403 | A Framework for Adversarially Robust Streaming Algorithms | 2020 | PODS | 6.2194225e-05 |
| 383 | An Optimal Algorithm for the Distinct Elements Problem | 2010 | PODS | 0.00024820873 |
| 12,108 | Space-Efficient Estimation of Statistics over Sub-Sampled Streams | 2012 | PODS | 4.1945683e-05 |
| 12,531 | Join-Distinct Aggregate Estimation over Update Streams | 2005 | PODS | 4.1945683e-05 |
| 10,358 | Robust Statistical Analysis on Streaming Data with Near-Duplicates in General Metric Spaces | 2025 | PODS | 4.1945683e-05 |