Online Maintenance of Very Large Random Samples
Summary: Presents online, single-pass algorithms for maintaining very large on-disk random samples of streaming data. The algorithms maintain a true random sample (without replacement) of all data seen so far at gigabyte-to-terabyte scales, and are also suitable for biased or unequal-probability sampling. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by PageRank)
Showing 10 of 10 citing papers.
| Overall Rank | Citing Paper | Year | Venue | PageRank |
|---|---|---|---|---|
| 1,475 | Online Maintenance of Very Large Random Samples on Flash Storage | 2008 | VLDB | 0.00011806921 |
| 1,574 | Approximate Query Processing: No Silver Bullet | 2017 | SIGMOD | 0.00011287495 |
| 3,594 | Continuous Sampling for Online Aggregation Over Multiple Queries | 2010 | SIGMOD | 6.9381343e-05 |
| 3,805 | Approximate MaxRS in Spatial Databases | 2013 | VLDB | 6.7521192e-05 |
| 4,029 | Spatial Online Sampling and Aggregation | 2016 | VLDB | 6.51315e-05 |
| 4,172 | The Adversarial Robustness of Sampling | 2020 | PODS | 6.3879072e-05 |
| 5,906 | Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results | 2005 | VLDB | 5.2787348e-05 |
| 6,190 | Maintaining Bernoulli Samples over Evolving Multisets | 2007 | PODS | 5.1645517e-05 |
| 6,286 | A Dip in the Reservoir: Maintaining Sample Synopses of Evolving Datasets | 2006 | VLDB | 5.1280225e-05 |
| 9,632 | External Memory Stream Sampling | 2015 | PODS | 4.313481e-05 |
Outgoing Citations (Sorted by PageRank)
Showing 17 of 17 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | PageRank |
|---|---|---|---|---|
| 269 | Fast Incremental Maintenance of Approximate Histograms | 1997 | VLDB | 0.00029656549 |
| 4,694 | Scalable Reservoir Sampling on Many-Core CPUs | 2019 | SIGMOD | 5.9944898e-05 |
| 8,470 | Sampling Big Ideas in Query Optimization | 2023 | PODS | 4.5038423e-05 |
| 4,350 | On Biased Reservoir Sampling in the Presence of Stream Evolution | 2006 | VLDB | 6.2645054e-05 |
| 6,190 | Maintaining Bernoulli Samples over Evolving Multisets | 2007 | PODS | 5.1645517e-05 |
| 46 | Simple Random Sampling from Relational Databases | 1986 | VLDB | 0.00070894702 |
| 443 | Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets | 1999 | SIGMOD | 0.00022996573 |
| 5,117 | Sampling Algorithms in a Stream Operator | 2005 | SIGMOD | 5.6825418e-05 |
| 1,475 | Online Maintenance of Very Large Random Samples on Flash Storage | 2008 | VLDB | 0.00011806921 |
| 6,286 | A Dip in the Reservoir: Maintaining Sample Synopses of Evolving Datasets | 2006 | VLDB | 5.1280225e-05 |