SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning
Summary: Introduces SPOOF, an automatic framework that unifies sum-product (algebraic simplification) rewrites with operator fusion and code generation for ML DAGs, exploiting linear-algebra properties and sparsity. It produces fused kernels whose performance is close to hand-tuned code, at modest compilation overhead.
(summarized by gpt-5-mini on Feb 09 2026)
- Paper ID: 298
- Venue: CIDR
- Year: 2017
- Pagerank: 6.1327108e-05
- Overall Rank: 4,505 | 68.67%
- DOI: (none listed)
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,279 | Towards Linear Algebra over Normalized Data | 2017 | VLDB | 0.00012868394 |
| 1,532 | Data Management in Machine Learning: Challenges, Techniques, and Systems | 2017 | SIGMOD | 0.00011472681 |
| 2,194 | Enabling and Optimizing Non-linear Feature Interactions in Factorized Linear Algebra | 2019 | SIGMOD | 9.3138337e-05 |
| 2,350 | An Intermediate Representation for Optimizing Machine Learning Pipelines | 2019 | VLDB | 8.9788641e-05 |
| 3,277 | A Layered Aggregate Engine for Analytics Workloads | 2019 | SIGMOD | 7.2871625e-05 |
| 3,918 | On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML | 2018 | VLDB | 6.6315176e-05 |
| 4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05 |
| 4,833 | MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions | 2019 | SIGMOD | 5.8916346e-05 |
| 5,487 | SPORES: Sum-Product Optimization via Relational Equality Saturation for Large Scale Linear Algebra | 2020 | VLDB | 5.4791501e-05 |
| 8,262 | FuseME: Distributed Matrix Computation Engine based on Cuboid-based Fused Operator and Plan Generation | 2022 | SIGMOD | 4.5467867e-05 |
| 8,514 | UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads | 2022 | VLDB | 4.4944285e-05 |
| 11,339 | Redundancy Elimination in Distributed Matrix Computation | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 23 of 23 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 51 | Including Group-By in Query Optimization | 1994 | VLDB | 0.00067123727 |
| 60 | Efficiently Compiling Efficient Query Plans for Modern Hardware | 2011 | VLDB | 0.00064439773 |
| 248 | Eager Aggregation and Lazy Aggregation | 1995 | VLDB | 0.00030785339 |
| 557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988 |
| 583 | FAQ: Questions Asked Frequently | 2016 | PODS | 0.00019717214 |
| 658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577 |
| 704 | Building Efficient Query Engines in a High-Level Language | 2014 | VLDB | 0.00017900583 |
| 834 | Learning Linear Regression Models over Factorized Joins | 2016 | SIGMOD | 0.00016135159 |
| 850 | Scaling Factorization Machines to Relational Data | 2013 | VLDB | 0.00015955971 |
| 1,076 | RIOT: I/O-Efficient Numerical Computing without SQL | 2009 | CIDR | 0.00014248449 |
| 1,167 | Learning Generalized Linear Models Over Normalized Data | 2015 | SIGMOD | 0.00013547713 |
| 1,259 | Aggregation and Ordering in Factorised Databases | 2013 | VLDB | 0.00012995821 |
| 1,263 | Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation | 2016 | SIGMOD | 0.00012982857 |
| 1,873 | An Architecture for Compiling UDF-centric Workflows | 2015 | VLDB | 0.00010253002 |
| 1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05 |
| 2,383 | How to Architect a Query Compiler | 2016 | SIGMOD | 8.9294108e-05 |
| 2,667 | Cumulon: Optimizing Statistical Data Analysis in the Cloud | 2013 | SIGMOD | 8.3413995e-05 |
| 4,326 | Fast Queries Over Heterogeneous Data Through Engine Customization | 2016 | VLDB | 6.288323e-05 |
| 4,397 | Estimating Compilation Time of a Query Optimizer | 2003 | SIGMOD | 6.2230918e-05 |
| 4,802 | Resource Elasticity for Large-Scale Machine Learning | 2015 | SIGMOD | 5.9114415e-05 |
| 6,542 | Profiling R on a Contemporary Processor | 2015 | VLDB | 5.0216639e-05 |
| 7,823 | Measuring and Optimizing Distributed Array Programs | 2016 | VLDB | 4.6419393e-05 |
| 7,878 | DBToaster: Agile Views in a Dynamic Data Management System | 2011 | CIDR | 4.6295401e-05 |
Semantically Similar Papers