Database Paper Browser

Hybrid Parallelization Strategies for Large-Scale Machine Learning in SystemML

Summary: Presents hybrid parallelization strategies for large-scale ML on MapReduce in SystemML, combining data and task parallelism through the ParFOR construct. A cost-based optimizer automatically generates optimal parallel execution plans and adapts them to ad-hoc workloads and data characteristics. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10933
Venue: VLDB
Year: 2014
Pagerank: 0.00012180605
Overall Rank: 1,402 | 90.25%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 33 of 33 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988
683 | Cerebro: A Data System for Optimized Deep Learning Model Selection | 2020 | VLDB | 0.00018195476
834 | Learning Linear Regression Models over Factorized Joins | 2016 | SIGMOD | 0.00016135159
1,279 | Towards Linear Algebra over Normalized Data | 2017 | VLDB | 0.00012868394
1,532 | Data Management in Machine Learning: Challenges, Techniques, and Systems | 2017 | SIGMOD | 0.00011472681
1,940 | SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging | 2021 | SIGMOD | 0.00010020173
2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05
2,350 | An Intermediate Representation for Optimizing Machine Learning Pipelines | 2019 | VLDB | 8.9788641e-05
2,791 | Towards Demystifying Serverless Machine Learning Training | 2021 | SIGMOD | 8.1206618e-05
2,848 | Exploiting Matrix Dependency for Efficient Distributed Matrix Computation | 2015 | SIGMOD | 8.0208832e-05
3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05
3,948 | A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics | 2018 | VLDB | 6.5959084e-05
4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05
4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05
4,802 | Resource Elasticity for Large-Scale Machine Learning | 2015 | SIGMOD | 5.9114415e-05
5,257 | Probabilistic Demand Forecasting at Scale | 2017 | VLDB | 5.6003925e-05
5,720 | BAGUA: Scaling up Distributed Learning with System Relaxations | 2022 | VLDB | 5.3527734e-05
6,373 | DeepBase: Deep Inspection of Neural Networks | 2019 | SIGMOD | 5.0929326e-05
6,538 | Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent | 2019 | SIGMOD | 5.023239e-05
6,986 | A Cost-based Optimizer for Gradient Descent Optimization | 2017 | SIGMOD | 4.8727048e-05
7,306 | DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines | 2022 | CIDR | 4.7678574e-05
8,092 | Saga: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications | 2023 | SIGMOD | 4.587921e-05
8,444 | Not Black-Box Anymore! Enabling Analytics-Aware Optimizations in Teradata Vantage | 2021 | VLDB | 4.5118994e-05
8,514 | UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads | 2022 | VLDB | 4.4944285e-05
9,001 | The Power of Nested Parallelism in Big Data Processing – Hitting Three Flies with One Slap – | 2021 | SIGMOD | 4.4107627e-05
9,222 | Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning | 2021 | VLDB | 4.3698672e-05
9,326 | BladeDISC: Optimizing Dynamic Shape Machine Learning Workloads via Compiler Approach | 2023 | SIGMOD | 4.3556432e-05
9,332 | PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development | 2018 | SIGMOD | 4.3556432e-05
10,571 | Quantum Data Management in the NISQ Era | 2025 | VLDB | 4.1945683e-05
10,745 | Robust Recursive Query Parallelism in Graph Database Management Systems | 2025 | VLDB | 4.1945683e-05
10,998 | Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems | 2024 | VLDB | 4.1945683e-05
11,472 | Hybrid Evaluation for Distributed Iterative Matrix Computation | 2021 | SIGMOD | 4.1945683e-05
11,859 | dmapply: A functional primitive to express distributed machine learning algorithms in R | 2016 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 16 of 16 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers