Distributed Numerical and Machine Learning Computations via Two-Phase Execution of Aggregated Join Trees
Summary: The paper proposes two-phase execution for numerical and ML workloads expressed as aggregated join trees (joins followed by an aggregation). A pilot run first collects lineage, which enables record-level planning before the actual execution. Experiments show this relational two-phase approach to be an effective platform for distributed ML.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 12313
- Venue: VLDB
- Year: 2021
- Pagerank: 4.2992942e-05
- Overall Rank: 9,706 | 32.48%
- DOI: 10.14778/3450980.3450991
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 17 of 17 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1 | Access Path Selection in a Relational Database Management System | 1979 | SIGMOD | 0.0040449103 |
| 7 | Optimal Aggregation Algorithms for Middleware [Extended Abstract] | 2001 | PODS | 0.0015496097 |
| 217 | Ripple Joins for Online Aggregation | 1999 | SIGMOD | 0.00033536712 |
| 299 | Trio: A System for Data, Uncertainty, and Lineage | 2006 | VLDB | 0.00028525071 |
| 317 | Distributed Query Processing In A Relational Data Base System | 1978 | SIGMOD | 0.00027980992 |
| 454 | An Overview of Query Optimization in Relational Systems | 1998 | PODS | 0.00022734812 |
| 807 | Exploiting Inter-Operation Parallelism in XPRS | 1992 | SIGMOD | 0.00016434207 |
| 2,280 | SMOKE: Fine-grained Lineage at Interactive Speed | 2018 | VLDB | 9.1111033e-05 |
| 2,526 | Track Join: Distributed Joins with Minimal Network Traffic | 2014 | SIGMOD | 8.5968612e-05 |
| 3,099 | DB4ML – An In-Memory Database Kernel with Machine Learning Support | 2020 | SIGMOD | 7.5642871e-05 |
| 3,958 | MLog: Towards Declarative In-Database Machine Learning | 2017 | VLDB | 6.5897636e-05 |
| 4,045 | Optimizing Nested Queries with Parameter Sort Orders | 2005 | VLDB | 6.4985218e-05 |
| 5,014 | Dynamically Optimizing Queries over Large Scale Data Platforms | 2014 | SIGMOD | 5.7586174e-05 |
| 5,821 | Tensor Relational Algebra for Distributed Machine Learning System Design | 2021 | VLDB | 5.3134851e-05 |
| 6,745 | DistME: A Fast and Elastic Distributed Matrix Computation Engine using GPUs | 2019 | SIGMOD | 4.9417155e-05 |
| 8,997 | Chasing Similarity: Distribution-aware Aggregation Scheduling | 2019 | VLDB | 4.4120041e-05 |
| 9,332 | PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development | 2018 | SIGMOD | 4.3556432e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,640 | Design and Evaluation of Parallel Pipelined Join Algorithms | 1987 | SIGMOD | 8.3924401e-05 |
| 4,409 | Declarative Recursive Computation on an RDBMS | 2019 | VLDB | 6.2104034e-05 |
| 4,288 | Parallel Processing of Recursive Queries in Distributed Architectures | 1989 | VLDB | 6.2891396e-05 |
| 8,781 | Accelerate Distributed Joins with Predicate Transfer | 2025 | SIGMOD | 4.4534753e-05 |
| 6,191 | Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra | 2021 | SIGMOD | 5.1642282e-05 |
| 5,821 | Tensor Relational Algebra for Distributed Machine Learning System Design | 2021 | VLDB | 5.3134851e-05 |
| 1,939 | From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System | 2015 | SIGMOD | 0.00010025655 |
| 11,890 | Let's Rethink Join Optimization in Distributed Systems | 2015 | CIDR | 4.1945683e-05 |
| 9,581 | Sharing Aggregate Computation for Distributed Queries | 2007 | SIGMOD | 4.3227214e-05 |
| 4,132 | Advanced Join Strategies for Large-Scale Distributed Computation | 2014 | VLDB | 6.4241067e-05 |