Database Paper Browser

SystemML: Declarative Machine Learning on Spark

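Summary: SystemML supports declarative ML: data scientists express custom algorithms in a DSL built on linear algebra primitives, and the system applies cost-based compilation to generate execution plans that combine in-memory single-node operations with scalable distributed operations on Spark. The paper covers the Spark integration end to end, including optimizer and runtime insights; SystemML is open source and usable as a testbed for research. (summarized by gpt-5-nano on Feb 09 2026)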

Paper ID: 11251
Venue: VLDB
Year: 2016
Pagerank: 0.00020197988
Overall Rank: 557 | 96.13%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 50 of 64 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,279 | Towards Linear Algebra over Normalized Data | 2017 | VLDB | 0.00012868394
1,391 | Ease.ml: Towards Multi-tenant Resource Sharing for Machine Learning Workloads | 2018 | VLDB | 0.0001223506
1,532 | Data Management in Machine Learning: Challenges, Techniques, and Systems | 2017 | SIGMOD | 0.00011472681
1,891 | Towards Model-based Pricing for Machine Learning in a Data Marketplace | 2019 | SIGMOD | 0.00010194092
1,940 | SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging | 2021 | SIGMOD | 0.00010020173
2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05
2,163 | Elastic Machine Learning Algorithms in Amazon SageMaker | 2020 | SIGMOD | 9.3949234e-05
2,194 | Enabling and Optimizing Non-linear Feature Interactions in Factorized Linear Algebra | 2019 | SIGMOD | 9.3138337e-05
2,350 | An Intermediate Representation for Optimizing Machine Learning Pipelines | 2019 | VLDB | 8.9788641e-05
2,753 | Complaint-driven Training Data Debugging for Query 2.0 | 2020 | SIGMOD | 8.1724339e-05
2,804 | Extending Relational Query Processing with ML Inference | 2020 | CIDR | 8.0935487e-05
3,099 | DB4ML – An In-Memory Database Kernel with Machine Learning Support | 2020 | SIGMOD | 7.5642871e-05
3,215 | Fractal: A General-Purpose Graph Pattern Mining System | 2019 | SIGMOD | 7.3645742e-05
3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05
3,265 | RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! - | 2018 | VLDB | 7.3083672e-05
3,277 | A Layered Aggregate Engine for Analytics Workloads | 2019 | SIGMOD | 7.2871625e-05
3,407 | End-to-end Optimization of Machine Learning Prediction Queries | 2022 | SIGMOD | 7.1295646e-05
3,473 | AI Meets Database: AI4DB and DB4AI | 2021 | SIGMOD | 7.062864e-05
3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05
3,948 | A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics | 2018 | VLDB | 6.5959084e-05
4,003 | Data Platform for Machine Learning | 2019 | SIGMOD | 6.54347e-05
4,129 | Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers? | 2018 | VLDB | 6.428887e-05
4,505 | SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning | 2017 | CIDR | 6.1327108e-05
4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05
4,677 | Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications | 2018 | SIGMOD | 6.0047822e-05
4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05
4,833 | MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions | 2019 | SIGMOD | 5.8916346e-05
4,964 | PS2: Parameter Server on Spark | 2019 | SIGMOD | 5.7965988e-05
5,720 | BAGUA: Scaling up Distributed Learning with System Relaxations | 2022 | VLDB | 5.3527734e-05
5,806 | BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees | 2019 | SIGMOD | 5.3200643e-05
6,191 | Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra | 2021 | SIGMOD | 5.1642282e-05
6,745 | DistME: A Fast and Elastic Distributed Matrix Computation Engine using GPUs | 2019 | SIGMOD | 4.9417155e-05
6,796 | InferDB: In-Database Machine Learning Inference Using Indexes | 2024 | VLDB | 4.9241624e-05
6,986 | A Cost-based Optimizer for Gradient Descent Optimization | 2017 | SIGMOD | 4.8727048e-05
7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05
7,179 | Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning | 2023 | VLDB | 4.8078895e-05
7,311 | The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development | 2020 | SIGMOD | 4.7656884e-05
7,476 | Lachesis: Automatic Partitioning for UDF-Centric Analytics | 2021 | VLDB | 4.7188928e-05
7,892 | M2Bench: A Database Benchmark for Multi-Model Analytic Workloads | 2023 | VLDB | 4.6245179e-05
7,920 | JoinBoost: Grow Trees Over Normalized Data Using Only SQL | 2023 | VLDB | 4.6163888e-05
8,262 | FuseME: Distributed Matrix Computation Engine based on Cuboid-based Fused Operator and Plan Generation | 2022 | SIGMOD | 4.5467867e-05
8,279 | Galley: Modern Query Optimization for Sparse Tensor Programs | 2025 | SIGMOD | 4.5435639e-05
8,514 | UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads | 2022 | VLDB | 4.4944285e-05
8,620 | PreVision: An Out-of-Core Matrix Computation System with Optimal Buffer Replacement | 2024 | SIGMOD | 4.4837361e-05
8,786 | AWARE: Workload-aware, Redundancy-exploiting Linear Algebra | 2023 | SIGMOD | 4.4521262e-05
8,921 | Leveraging Similarity Joins for Signal Reconstruction | 2018 | VLDB | 4.427232e-05
8,980 | HADAD: A Lightweight Approach for Optimizing Hybrid Complex Analytics Queries | 2021 | SIGMOD | 4.4169807e-05
9,001 | The Power of Nested Parallelism in Big Data Processing – Hitting Three Flies with One Slap – | 2021 | SIGMOD | 4.4107627e-05
9,222 | Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning | 2021 | VLDB | 4.3698672e-05
9,332 | PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development | 2018 | SIGMOD | 4.3556432e-05

Outgoing Citations (Sorted by Pagerank)

Showing 12 of 12 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers