Experiences with Approximating Queries in Microsoft’s Production Big-Data Clusters
Summary: A large-scale study of sampling-based query approximation in Microsoft's production big-data clusters. It examines deployment choices, implementation trade-offs, and real use cases, and offers data-driven insights into when sampling yields useful analytic results and where it falls short in practice. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Srikanth Kandula
2. Kukjin Lee
3. Surajit Chaudhuri
4. Marc Friedman
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,261 | The Cosmos Big Data Platform at Microsoft: Over a Decade of Progress and a Decade to Look Forward | 2021 | VLDB | 5.1350714e-05 |
| 6,740 | Combining Aggregation and Sampling (Nearly) Optimally for Approximate Query Processing | 2021 | SIGMOD | 4.944395e-05 |
| 8,393 | LAQy: Efficient and Reusable Query Approximations via Lazy Sampling | 2023 | SIGMOD | 4.5280102e-05 |
| 8,643 | One Size Does Not Fit All: A Bandit-Based Sampler Combination Framework with Theoretical Guarantees | 2022 | SIGMOD | 4.4777916e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 15 of 15 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.