On the Relative Cost of Sampling for Join Selectivity Estimation
Summary: Quantifies when sampling (t_cross) is cheaper than computing the exact star join for selectivity estimation, deriving bounds and approximations for the relative cost as functions of input relation sizes, arity, and the estimator's precision criterion. Identifies dangling tuples as a major negative factor and characterizes the mixed effects of data skew, yielding concrete regimes and thresholds that indicate when sampling is or is not cost-effective. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 11 of 11 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 39 | Statistical Estimators for Relational Algebra Expressions | 1988 | PODS | 0.00074745564 |
| 46 | Simple Random Sampling from Relational Databases | 1986 | VLDB | 0.00070894702 |
| 92 | Practical Selectivity Estimation through Adaptive Sampling | 1990 | SIGMOD | 0.00051315959 |
| 134 | Processing Aggregate Relational Queries with Hard Time Constraints | 1989 | SIGMOD | 0.00042452811 |
| 315 | Error-Constrained COUNT Query Evaluation in Relational Databases | 1991 | SIGMOD | 0.0002802103 |
| 357 | Random Sampling from B+ trees | 1989 | VLDB | 0.00026020098 |
| 367 | Sequential Sampling Procedures For Query Size Estimation | 1992 | SIGMOD | 0.00025509745 |
| 688 | Estimating the Size of Generalized Transitive Closures | 1989 | VLDB | 0.00018134733 |
| 762 | Query Size Estimation by Adaptive Sampling (Extended Abstract) | 1990 | PODS | 0.00017036868 |
| 783 | Random Sampling from Hash Files | 1990 | SIGMOD | 0.00016704834 |
| 1,255 | Fixed-Precision Estimation of Join Selectivity | 1993 | PODS | 0.00013024064 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,105 | Cardinality Estimation Done Right: Index-Based Join Sampling | 2017 | CIDR | 0.00013990395 |
| 6,493 | Joins on Samples: A Theoretical Guide for Practitioners | 2020 | VLDB | 0.000050424713 |
| 762 | Query Size Estimation by Adaptive Sampling (Extended Abstract) | 1990 | PODS | 0.00017036868 |
| 11,446 | Index-Based Join Size Estimation Using Adaptive Sampling | 2021 | SIGMOD | 0.000041945683 |
| 18 | On Random Sampling over Joins | 1999 | SIGMOD | 0.00092385438 |
| 2,254 | Two-Level Sampling for Join Size Estimation | 2017 | SIGMOD | 0.000091897043 |
| 1,193 | Join Size Estimation Subject to Filter Conditions | 2015 | VLDB | 0.00013414989 |
| 549 | Tracking Join and Self-Join Sizes in Limited Storage | 1999 | PODS | 0.00020376603 |
| 92 | Practical Selectivity Estimation through Adaptive Sampling | 1990 | SIGMOD | 0.00051315959 |
| 1,255 | Fixed-Precision Estimation of Join Selectivity | 1993 | PODS | 0.00013024064 |