Error-bounded Sampling for Analytics on Big Sparse Data
Summary: The paper presents error-bounded stratified sampling for analytics on big sparse data, giving end users accuracy guarantees. It leverages data distributions to drastically cut sample size (up to 99% smaller than uniform sampling) in a shared-nothing engine (SCOPE), enabling robust analytics on massive data volumes. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Ying Yan
2. Liang Jeff Chen
3. Zheng Zhang
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 11 of 11 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 14 | Online Aggregation | 1997 | SIGMOD | 0.0010801504 |
| 22 | SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets | 2008 | VLDB | 0.0008456613 |
| 739 | Congressional Samples for Approximate Answering of Group-By Queries | 2000 | SIGMOD | 0.00017401518 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 1,335 | ICICLES: Self-tuning Samples for Approximate Query Answering | 2000 | VLDB | 0.00012502131 |
| 1,464 | Online Aggregation for Large MapReduce Jobs | 2011 | VLDB | 0.00011865546 |
| 1,909 | SciBORQ: Scientific data management with Bounds On Runtime and Quality | 2011 | CIDR | 0.00010121304 |
| 2,736 | Online Aggregation and Continuous Query support in MapReduce | 2010 | SIGMOD | 0.000082043187 |
| 3,279 | Early Accurate Results for Advanced Analytics on MapReduce | 2012 | VLDB | 0.000072855494 |
| 3,594 | Continuous Sampling for Online Aggregation Over Multiple Queries | 2010 | SIGMOD | 0.000069381343 |
| 4,093 | Distributed Online Aggregations | 2009 | VLDB | 0.000064558147 |