Towards Estimation Error Guarantees for Distinct Values
Summary: Prove any sublinear-sample estimator for distinct counts must suffer large error on some natural distributions unless it reads a large fraction of the data. Give an estimator matching this lower bound and practical heuristics for typical distributions, validated empirically. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
- 1. Moses Charikar
- 2. Surajit Chaudhuri
- 3. Rajeev Motwani
- 4. Vivek Narasayya
Incoming Citations (Sorted by Pagerank)
Showing 9 of 59 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 39 | Statistical Estimators for Relational Algebra Expressions | 1988 | PODS | 0.00074745564 |
| 59 | Sampling-Based Estimation of the Number of Distinct Values of an Attribute | 1995 | VLDB | 0.00064501896 |
| 64 | Improved Histograms for Selectivity Estimation of Range Predicates | 1996 | SIGMOD | 0.00063612837 |
| 134 | Processing Aggregate Relational Queries with Hard Time Constraints | 1989 | SIGMOD | 0.00042452811 |
| 237 | An Efficient, Cost-Driven Index Selection Tool for Microsoft SQL Server | 1997 | VLDB | 0.00031726304 |
Previous
Page 1 / 1
Next