Database Paper Browser

Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches

Summary: Assesses deep learning (DL) on database-resident data through four canonical approaches, highlighting Model Hopper Parallelism (MOP), and finds that no single approach is best. Prototypes built on Greenplum and run on DL workloads reveal a Pareto frontier among speed, governance, and portability. The findings are meant to guide the design of DL support in data systems, and the artifacts are open source. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12362
Venue: VLDB
Year: 2021
Pagerank: 6.087611e-05
Overall Rank: 4,557 | 68.30%
DOI: 10.14778/3467861.3467867

Incoming Citations (Sorted by Pagerank)

Showing 14 of 14 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
3,407 | End-to-end Optimization of Machine Learning Prediction Queries | 2022 | SIGMOD | 7.1295646e-05
4,924 | User-Defined Operators: Efficiently Integrating Custom Algorithms into Modern Databases | 2022 | VLDB | 5.822682e-05
5,084 | In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle | 2022 | SIGMOD | 5.7091191e-05
6,327 | The Tensor Data Platform: Towards an AI-centric Database System | 2023 | CIDR | 5.1083405e-05
6,796 | InferDB: In-Database Machine Learning Inference Using Indexes | 2024 | VLDB | 4.9241624e-05
6,884 | Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines | 2023 | VLDB | 4.8955332e-05
7,656 | Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets | 2022 | SIGMOD | 4.6871575e-05
8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05
9,320 | Powering In-Database Dynamic Model Slicing for Structured Data Analytics | 2024 | VLDB | 4.3556432e-05
9,596 | Scalable Graph Convolutional Network Training on Distributed-Memory Systems | 2023 | VLDB | 4.319218e-05
9,806 | The Image Calculator: 10x Faster Image-AI Inference by Replacing JPEG with Self-designing Storage Format | 2024 | SIGMOD | 4.2805224e-05
10,976 | StarfishDB: a Query Execution Engine for Relational Probabilistic Programming | 2024 | SIGMOD | 4.1945683e-05
10,998 | Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems | 2024 | VLDB | 4.1945683e-05
13,171 | Reimagining Deep Learning Systems Through the Lens of Data Systems | 2024 | VLDB | -

Outgoing Citations (Sorted by Pagerank)

Showing 30 of 30 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
140 | The MADlib Analytics Library or MAD Skills, the SQL | 2012 | VLDB | 0.00042270404
557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988
658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577
683 | Cerebro: A Data System for Optimized Deep Learning Model Selection | 2020 | VLDB | 0.00018195476
921 | Democratizing Data Science through Interactive Curation of ML Pipelines | 2019 | SIGMOD | 0.00015337438
1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941
1,402 | Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML | 2014 | VLDB | 0.00012180605
1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361
1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05
2,350 | An Intermediate Representation for Optimizing Machine Learning Pipelines | 2019 | VLDB | 8.9788641e-05
2,440 | FlexPS: Flexible Parallelism Control in Parameter Server Architecture | 2018 | VLDB | 8.8119143e-05
2,642 | Vertica-ML: Distributed Machine Learning in Vertica Database | 2020 | SIGMOD | 8.3851878e-05
2,804 | Extending Relational Query Processing with ML Inference | 2020 | CIDR | 8.0935487e-05
2,886 | VISTA: Optimized System for Declarative Feature Transfer from Deep CNNs at Scale | 2020 | SIGMOD | 7.9612767e-05
2,934 | AIDA - Abstraction for Advanced In-Database Analytics | 2018 | VLDB | 7.8595778e-05
3,099 | DB4ML – An In-Memory Database Kernel with Machine Learning Support | 2020 | SIGMOD | 7.5642871e-05
3,363 | CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers | 2019 | VLDB | 7.1731921e-05
3,875 | Cloudy with High Chance of DBMS: A 10-year Prediction for Enterprise-Grade ML | 2020 | CIDR | 6.675257e-05
3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05
4,748 | Rafiki: Machine Learning as an Analytics Service System | 2019 | VLDB | 5.9526539e-05
4,964 | PS2: Parameter Server on Spark | 2019 | SIGMOD | 5.7965988e-05
5,294 | GLADE: Big Data Analytics Made Easy | 2012 | SIGMOD | 5.5810654e-05
5,821 | Tensor Relational Algebra for Distributed Machine Learning System Design | 2021 | VLDB | 5.3134851e-05
6,053 | Optimizing Machine Learning Workloads in Collaborative Environments | 2020 | SIGMOD | 5.2326838e-05
6,471 | Dynamic Parameter Allocation in Parameter Servers | 2020 | VLDB | 5.0511668e-05
6,538 | Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent | 2019 | SIGMOD | 5.023239e-05
7,138 | Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization | 2019 | VLDB | 4.8216981e-05
8,399 | UDA-GIST: An In-database Framework to Unify Data-Parallel and State-Parallel Analytics | 2015 | VLDB | 4.5257744e-05
8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05
9,588 | DISIMA: A Distributed and Interoperable Image Database System | 2000 | SIGMOD | 4.3210611e-05

Semantically Similar Papers