SetSketch: Filling the Gap between MinHash and HyperLogLog
Summary: SetSketch bridges MinHash and HyperLogLog with a commutative, idempotent insert operation and a mergeable state suited to distributed sketching. It provides fast, robust estimators for cardinality and joint quantities, supports similarity search, and its joint estimator often outperforms state-of-the-art estimators when applied to related data structures. (summarized by gpt-5-nano on Feb 09 2026)
Authors
- Otmar Ertl
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,451 | Efficient framework for operating on data sketches | 2023 | VLDB | 4.5086031e-05 |
| 9,038 | OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 2024 | VLDB | 4.4039656e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 400 | Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search | 2007 | VLDB | 0.0002427237 |
| 475 | Mining Database Structure; Or, How to Build a Data Quality Browser | 2002 | SIGMOD | 0.00022303253 |
| 2,141 | LSH Ensemble: Internet-Scale Domain Search | 2016 | VLDB | 9.4542625e-05 |
| 2,805 | All-Distances Sketches, Revisited: HIP Estimators for Massive Graphs Analysis | 2014 | PODS | 8.0918347e-05 |
| 3,702 | Every Row Counts: Combining Sketches and Sampling for Accurate Group-By Result Estimates | 2019 | CIDR | 6.8295759e-05 |
| 5,361 | Efficient Estimation of Inclusion Coefficient using HyperLogLog Sketches | 2018 | VLDB | 5.547935e-05 |