Database Paper Browser

Machine Learning for Cloud Data Systems: the Progress so far and the Path Forward

Summary: A survey of ML for cloud data systems, outlining progress, practical deployments, and the gap between research promise and industry reality. Part II covers enterprise concerns: explanations, debugging, deployment, model management, data usage constraints, anonymization, and the resulting technical debt. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12529
Venue: VLDB
Year: 2021
Pagerank: 4.6872456e-05
Overall Rank: 7,655 | 46.75%
DOI: 10.14778/3476311.3476408

Authors

Incoming Citations (Sorted by Pagerank)

Showing 3 of 3 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
102 | The Case for Learned Index Structures | 2018 | SIGMOD | 0.00049545203
333 | Neo: A Learned Query Optimizer | 2019 | VLDB | 0.00027206884
1,019 | Robust Estimation of Resource Consumption for SQL Queries using Statistical Techniques | 2012 | VLDB | 0.00014625603
1,084 | Dhalion: Self-Regulating Stream Processing in Heron | 2017 | VLDB | 0.00014209714
1,254 | Selectivity Estimation for Range Predicates using Lightweight Models | 2019 | VLDB | 0.00013027411
1,703 | Are We Ready For Learned Cardinality Estimation? | 2021 | VLDB | 0.00010836769
1,855 | AI Meets AI: Leveraging Query Executions to Improve Index Recommendations | 2019 | SIGMOD | 0.00010315245
2,083 | Towards a Learning Optimizer for Shared Clouds | 2019 | VLDB | 9.5834572e-05
2,783 | Flow-Loss: Learning Cardinality Estimates That Matter | 2021 | VLDB | 8.1293383e-05
3,625 | Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings | 2020 | SIGMOD | 6.9055212e-05
3,789 | DIAMetrics: Benchmarking Query Engines at Scale | 2020 | VLDB | 6.7644737e-05
3,806 | HedgeCut: Maintaining Randomised Trees for Low-Latency Machine Unlearning | 2021 | SIGMOD | 6.7492837e-05
4,174 | Computation Reuse in Analytics Job Service at Microsoft | 2018 | SIGMOD | 6.3856219e-05
4,377 | Understanding and Benchmarking the Impact of GDPR on Database Systems | 2020 | VLDB | 6.2404627e-05
6,040 | Steering Query Optimizers: A Practical Take on Big Data Workloads | 2021 | SIGMOD | 5.2412035e-05
6,757 | KEA: Tuning an Exabyte-Scale Data Infrastructure | 2021 | SIGMOD | 4.9372134e-05
7,047 | Seagull: An Infrastructure for Load Prediction and Optimized Resource Allocation | 2021 | VLDB | 4.8521181e-05
7,684 | AutoToken: Predicting Peak Parallelism for Big Data Analytics at Microsoft | 2020 | VLDB | 4.6796855e-05
8,217 | Spur: Mitigating Slow Instances in Large-Scale Streaming Pipelines | 2020 | SIGMOD | 4.5568298e-05
8,220 | PerfGuard: Deploying ML-for-Systems without Performance Regressions, Almost! | 2021 | VLDB | 4.5557328e-05
9,194 | Phoebe: A Learning-based Checkpoint Optimizer | 2021 | VLDB | 4.3761777e-05
9,735 | SparkCruise: Handsfree Computation Reuse in Spark | 2019 | VLDB | 4.2942813e-05
