Database Paper Browser

On Random Sampling over Joins

Summary: Examines whether a random sample of a join's output can be drawn without evaluating the join in full, and establishes theoretical limits on how efficiently this can be done. Proposes new join-sampling algorithms for settings where those limits do not apply; an empirical evaluation on SQL Server 7.0 shows efficiency gains. (summarized by gpt-5-nano on Feb 09 2026)
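
To make the idea of sampling a join without materializing it concrete, here is a minimal sketch of one standard approach: weight each row of the first relation by the number of rows it matches in the second, then pick a matching partner uniformly. It is an illustration only and is not taken from the paper's text; the function name `sample_join`, the dict-based row layout, and the in-memory index over the second relation are all assumptions made for this example.

```python
import random
from collections import defaultdict

def sample_join(r1, r2, key1, key2, n, rng=None):
    """Draw n pairs (with replacement) uniformly from r1 JOIN r2 on key1 = key2.

    Hypothetical helper for illustration; rows are plain dicts.
    """
    rng = rng or random.Random()
    # Index r2 by join key so matching rows can be looked up directly.
    # (Assumption: such an index or frequency statistics are available.)
    index = defaultdict(list)
    for row in r2:
        index[row[key2]].append(row)
    # Weight each r1 row by the number of r2 rows it joins with.
    weights = [len(index.get(row[key1], ())) for row in r1]
    if sum(weights) == 0:
        return []  # the join is empty, so there is nothing to sample
    # Pick left rows proportionally to their match counts ...
    picks = rng.choices(r1, weights=weights, k=n)
    # ... then pick one matching right row uniformly for each.
    return [(left, rng.choice(index[left[key1]])) for left in picks]
```

Each join pair is chosen with probability (matches of the left row / join size) × (1 / matches of the left row) = 1 / join size, so the sample is uniform over the join output even though the join itself is never built.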

Paper ID: 3105
Venue: SIGMOD
Year: 1999
Pagerank: 0.00092385438
Overall Rank: 18 | 99.88%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 36 of 86 citing papers.

Rank Citing Paper Year Venue Pagerank
5,815 StatAdvisor: Recommending Statistical Views 2009 VLDB 5.3165295e-05
5,906 Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results 2005 VLDB 5.2787348e-05
5,951 PGMJoins: Random Join Sampling with Graphical Models 2021 SIGMOD 5.2592385e-05
5,976 Responsible Data Integration: Next-generation Challenges 2022 SIGMOD 5.245976e-05
6,136 Scalable Progressive Analytics on Big Data in the Cloud 2013 VLDB 5.1928748e-05
6,161 Spatial Join Selectivity Using Power Laws 2000 SIGMOD 5.1753664e-05
6,286 A Dip in the Reservoir: Maintaining Sample Synopses of Evolving Datasets 2006 VLDB 5.1280225e-05
6,493 Joins on Samples: A Theoretical Guide for Practitioners 2020 VLDB 5.0424713e-05
6,548 Query Sampling in DB2 Universal Database 2004 SIGMOD 5.0181595e-05
6,853 On Joining and Caching Stochastic Streams 2005 SIGMOD 4.9070864e-05
6,874 ROX: Run-time Optimization of XQueries 2009 SIGMOD 4.8978984e-05
7,150 Histograms Revisited: When are histograms the best approximation method for aggregates over joins? 2005 PODS 4.8163484e-05
7,251 Learning to Sample: Counting with Complex Queries 2020 VLDB 4.7890519e-05
7,581 Synopses for Query Optimization: A Space-Complexity Perspective 2004 PODS 4.7057641e-05
7,714 Identifying Insufficient Data Coverage in Databases with Multiple Relations 2020 VLDB 4.6700455e-05
7,728 Consistent Histograms In The Presence of Distinct Value Counts 2009 VLDB 4.666214e-05
8,320 Effective Change Detection Using Sampling 2002 VLDB 4.5435639e-05
8,350 alpha to omega: The Greek Alphabet of Sampling 2020 CIDR 4.5404832e-05
8,610 Efficient Dynamic Weighted Set Sampling and Its Extension 2024 VLDB 4.4853485e-05
8,959 Reservoir Sampling over Joins 2024 SIGMOD 4.4206222e-05
9,118 Towards Observability for Production Machine Learning Pipelines 2022 VLDB 4.3928288e-05
9,621 ShadowAQP: Efficient Approximate Group-by and Join Query via Attribute-oriented Sample Size Allocation and Data Generation 2023 VLDB 4.3167167e-05
9,696 The Data Interaction Game 2018 SIGMOD 4.3023337e-05
9,798 Threshold Queries in Theory and in the Wild 2022 VLDB 4.2818172e-05
9,848 Saving Money for Analytical Workloads in the Cloud 2024 VLDB 4.2721228e-05
9,886 Scalable and Usable Relational Learning With Automatic Language Bias 2021 SIGMOD 4.2621158e-05
9,949 AB-tree: Index for Concurrent Random Sampling and Updates 2022 VLDB 4.2421586e-05
10,227 Sample-based Distinct Cardinality Estimation for Multiple Attributes in Multi-Dataset Queries 2026 VLDB 4.1945683e-05
10,254 Secure Multi-Party Sampling over Joins 2026 VLDB 4.1945683e-05
10,359 Smallest Synthetic Witnesses for Conjunctive Queries 2025 PODS 4.1945683e-05
10,497 PilotDB: Database-Agnostic Online Approximate Query Processing with A Priori Error Guarantees 2025 SIGMOD 4.1945683e-05
10,927 Computing A Well-Representative Summary of Conjunctive Query Results 2024 PODS 4.1945683e-05
10,981 Enabling Adaptive Sampling for Intra-Window Join: Simultaneously Optimizing Quantity and Quality 2024 SIGMOD 4.1945683e-05
11,453 XLJoins 2021 SIGMOD 4.1945683e-05
11,539 FlashP: An Analytical Pipeline for Real-time Forecasting of Time-Series Relational Data 2021 VLDB 4.1945683e-05
12,344 Composable, Scalable, and Accurate Weight Summarization of Unaggregated Data Sets 2009 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers

Overall Rank Paper Year Venue Pagerank
1,255 Fixed-Precision Estimation of Join Selectivity 1993 PODS 0.00013024064
4,953 On Join Sampling and the Hardness of Combinatorial Output-Sensitive Join Algorithms 2023 PODS 5.8085795e-05
8,470 Sampling Big Ideas in Query Optimization 2023 PODS 4.5038423e-05
4,694 Scalable Reservoir Sampling on Many-Core CPUs 2019 SIGMOD 5.9944898e-05
2,254 Two-Level Sampling for Join Size Estimation 2017 SIGMOD 9.1897043e-05
92 Practical Selectivity Estimation through Adaptive Sampling 1990 SIGMOD 0.00051315959
6,493 Joins on Samples: A Theoretical Guide for Practitioners 2020 VLDB 5.0424713e-05
8,959 Reservoir Sampling over Joins 2024 SIGMOD 4.4206222e-05
1,369 Random Sampling over Joins Revisited 2018 SIGMOD 0.00012339777
46 Simple Random Sampling from Relational Databases 1986 VLDB 0.00070894702