Maintaining Bernoulli Samples over Evolving Multisets
Summary: Maintains Bernoulli samples over arbitrary insertion/deletion streams on multisets (duplicates allowed) without accessing the base multiset, extending the Gibbons–Matias counting-sample technique. Tracking counters yield unbiased, lower-variance estimators for frequencies, sums, averages, and distinct counts, and enable practical subsampling and merging in memory-limited or distributed settings. (summarized by gpt-5-mini on Feb 09 2026)
Authors
1. Rainer Gemulla
2. Wolfgang Lehner
3. Peter J. Haas
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,203 | Independent Range Sampling | 2014 | PODS | 9.2981095e-05 |
| 7,415 | Efficient and Scalable Statistics Gathering for Large Databases in Oracle 11g | 2008 | SIGMOD | 4.7355557e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 46 | Simple Random Sampling from Relational Databases | 1986 | VLDB | 0.00070894702 |
| 184 | New Sampling-Based Summary Statistics for Improving Approximate Query Answers | 1998 | SIGMOD | 0.00036625711 |
| 308 | Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports | 2001 | VLDB | 0.00028142852 |
| 2,282 | Summarizing and Mining Inverse Distributions on Data Streams via Dynamic Inverse Sampling | 2005 | VLDB | 9.1073603e-05 |
| 2,368 | Online Maintenance of Very Large Random Samples | 2004 | SIGMOD | 8.9501526e-05 |
| 6,286 | A Dip in the Reservoir: Maintaining Sample Synopses of Evolving Datasets | 2006 | VLDB | 5.1280225e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,959 | Reservoir Sampling over Joins | 2024 | SIGMOD | 4.4206222e-05 |
| 5,117 | Sampling Algorithms in a Stream Operator | 2005 | SIGMOD | 5.6825418e-05 |
| 12,108 | Space-Efficient Estimation of Statistics over Sub-Sampled Streams | 2012 | PODS | 4.1945683e-05 |
| 1,369 | Random Sampling over Joins Revisited | 2018 | SIGMOD | 0.00012339777 |
| 269 | Fast Incremental Maintenance of Approximate Histograms | 1997 | VLDB | 0.00029656549 |
| 46 | Simple Random Sampling from Relational Databases | 1986 | VLDB | 0.00070894702 |
| 4,350 | On Biased Reservoir Sampling in the Presence of Stream Evolution | 2006 | VLDB | 6.2645054e-05 |
| 2,368 | Online Maintenance of Very Large Random Samples | 2004 | SIGMOD | 8.9501526e-05 |
| 4,100 | A Bi-Level Bernoulli Scheme for Database Sampling | 2004 | SIGMOD | 6.4531387e-05 |
| 6,286 | A Dip in the Reservoir: Maintaining Sample Synopses of Evolving Datasets | 2006 | VLDB | 5.1280225e-05 |