Data Management in Machine Learning: Challenges, Techniques, and Systems
Summary: A survey of data-management challenges and systems for ML workloads. It covers three lines of work: integrating ML with DBMSs; adapting database techniques to ML (queries, partitioning, compression); and combining data management with ML lifecycles. It also outlines open directions. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Arun Kumar
2. Matthias Boehm
3. Jun Yang
Incoming Citations (Sorted by Pagerank)
Showing 31 of 31 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 63 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,835 | Is Data Management the Beating Heart of AI Systems? | 2022 | SIGMOD | 4.2747054e-05 |
| 4,003 | Data Platform for Machine Learning | 2019 | SIGMOD | 6.54347e-05 |
| 939 | Data Lake Management: Challenges and Opportunities | 2019 | VLDB | 1.5187344e-04 |
| 7,655 | Machine Learning for Cloud Data Systems: the Progress so far and the Path Forward | 2021 | VLDB | 4.6872456e-05 |
| 7,020 | LLM for Data Management | 2024 | VLDB | 4.8595728e-05 |
| 8,346 | Deep Learning: Systems and Responsibility | 2021 | SIGMOD | 4.5420668e-05 |
| 10,843 | Machine Learning for Graph Data Management and Query Processing | 2025 | VLDB | 4.1945683e-05 |
| 8,637 | Machine Learning for Data Management: Problems and Solutions | 2018 | SIGMOD | 4.479892e-05 |
| 1,420 | Data Management Challenges in Production Machine Learning | 2017 | SIGMOD | 1.2057956e-04 |
| 4,906 | Machine Learning for Big Data | 2013 | SIGMOD | 5.8389053e-05 |