Weighted Reservoir Sampling from Distributed Streams
Summary: First message-optimal algorithm for weighted sampling without replacement (weighted SWOR) from distributed streams, achieving optimal message, space, and time complexity and a nearly-tight message lower bound (up to log(1/ε)). Uses this primitive to give the first distributed algorithms for residual heavy hitters in skewed-weight streams and to tighten/improve message complexity bounds for distributed L1 (count) tracking, yielding matching lower bounds. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,172 | The Adversarial Robustness of Sampling | 2020 | PODS | 6.3879072e-05 |
| 8,610 | Efficient Dynamic Weighted Set Sampling and Its Extension | 2024 | VLDB | 4.4853485e-05 |
| 8,993 | LightRW: FPGA Accelerated Graph Dynamic Random Walks | 2023 | SIGMOD | 4.4130611e-05 |
| 10,353 | Perfect Sampling in Turnstile Streams Beyond Small Moments | 2025 | PODS | 4.1945683e-05 |
| 11,320 | Truly Perfect Samplers for Data Streams and Sliding Windows | 2022 | PODS | 4.1945683e-05 |
Previous
Page 1 / 1
Next
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 308 | Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports | 2001 | VLDB | 0.00028142852 |
| 745 | Distributed Top-K Monitoring | 2003 | SIGMOD | 0.00017330487 |
| 1,094 | Tight Bounds for Lp Samplers, Finding Duplicates in Streams, and Related Problems | 2011 | PODS | 0.00014129658 |
| 1,392 | Sketching Streams Through the Net: Distributed Approximate Query Tracking | 2005 | VLDB | 0.00012229045 |
| 1,629 | Space-optimal Heavy Hitters with Strong Error Bounds | 2009 | PODS | 0.00011085267 |
| 1,640 | Communication-Efficient Distributed Monitoring of Thresholded Counts | 2006 | SIGMOD | 0.0001104808 |
| 2,878 | Sampling Time-Based Sliding Windows in Bounded Space | 2008 | SIGMOD | 7.9706235e-05 |
| 4,190 | Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks | 2012 | PODS | 6.3739017e-05 |
Previous
Page 1 / 1
Next
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,789 | Optimal Sampling from Sliding Windows | 2009 | PODS | 8.1249652e-05 |
| 4,350 | On Biased Reservoir Sampling in the Presence of Stream Evolution | 2006 | VLDB | 6.2645054e-05 |
| 12,344 | Composable, Scalable, and Accurate Weight Summarization of Unaggregated Data Sets | 2009 | VLDB | 4.1945683e-05 |
| 11,823 | Variability in Data Streams | 2016 | PODS | 4.1945683e-05 |
| 5,415 | Coordinated Weighted Sampling for Estimating Aggregates Over Multiple Weight Assignments | 2009 | VLDB | 5.5196338e-05 |
| 12,108 | Space-Efficient Estimation of Statistics over Sub-Sampled Streams | 2012 | PODS | 4.1945683e-05 |
| 4,905 | Randomized Error Removal for Online Spread Estimation in Data Streaming | 2021 | VLDB | 5.8398332e-05 |
| 5,117 | Sampling Algorithms in a Stream Operator | 2005 | SIGMOD | 5.6825418e-05 |
| 4,249 | Optimal Tracking of Distributed Heavy Hitters and Quantiles | 2009 | PODS | 6.3245666e-05 |
| 2,080 | Optimal Sampling From Distributed Streams | 2010 | PODS | 9.5899129e-05 |