A Layered Aggregate Engine for Analytics Workloads
Summary: LMFAO is an in-memory, layered engine for evaluating batches of group-by aggregates over joins, a workload common in analytics. It combines layered optimizations, sharing of computation across aggregates, and code specialization to accelerate ridge regression, classification and regression trees, Chow-Liu networks, and data cubes, outperforming DBMSs and ML frameworks on these tasks.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5770
- Venue: SIGMOD
- Year: 2019
- Pagerank: 7.2871625e-05
- Overall Rank: 3,277 | 77.21%
- DOI: 10.1145/3299869.3324961
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 22 of 22 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,725 | GeCo: Quality Counterfactual Explanations in Real Time | 2021 | VLDB | 5.9697637e-05 |
| 4,787 | The Relational Data Borg is Learning | 2020 | VLDB | 5.9224501e-05 |
| 5,955 | LMFAO: An Engine for Batches of Group-By Aggregates | 2020 | VLDB | 5.2572882e-05 |
| 6,156 | Optimizing Tensor Programs on Flexible Storage | 2023 | SIGMOD | 5.1802603e-05 |
| 6,247 | Optimizing In-memory Database Engine for AI-powered On-line Decision Augmentation Using Persistent Memory | 2021 | VLDB | 5.1389201e-05 |
| 6,673 | Incorporating Super-Operators in Big-Data Query Optimizers | 2020 | VLDB | 4.966799e-05 |
| 7,076 | Mining Approximate Acyclic Schemes from Relations | 2020 | SIGMOD | 4.8426354e-05 |
| 7,491 | Saibot: A Differentially Private Data Search Platform | 2023 | VLDB | 4.7180617e-05 |
| 7,714 | Identifying Insufficient Data Coverage in Databases with Multiple Relations | 2020 | VLDB | 4.6700455e-05 |
| 7,920 | JoinBoost: Grow Trees Over Normalized Data Using Only SQL | 2023 | VLDB | 4.6163888e-05 |
| 8,595 | Towards A Polyglot Framework for Factorized ML | 2021 | VLDB | 4.4889397e-05 |
| 8,680 | A Practical Approach to Groupjoin and Nested Aggregates | 2021 | VLDB | 4.4694927e-05 |
| 8,757 | An Intermediate Representation for Hybrid Database and Machine Learning Workloads | 2021 | VLDB | 4.456315e-05 |
| 8,786 | AWARE: Workload-aware, Redundancy-exploiting Linear Algebra | 2023 | SIGMOD | 4.4521262e-05 |
| 9,486 | Quantifying the Loss of Acyclic Join Dependencies | 2023 | PODS | 4.3341665e-05 |
| 9,856 | In-Database Data Imputation | 2024 | SIGMOD | 4.269353e-05 |
| 10,177 | InferF: Declarative Factorization of AI/ML Inferences over Joins | 2026 | SIGMOD | 4.1945683e-05 |
| 10,406 | GES: High-Performance Graph Processing Engine and Service in Huawei | 2025 | SIGMOD | 4.1945683e-05 |
| 10,551 | Avoiding Materialisation for Guarded Aggregate Queries | 2025 | VLDB | 4.1945683e-05 |
| 11,220 | Lightweight Materialization for Fast Dashboards Over Joins | 2023 | SIGMOD | 4.1945683e-05 |
| 11,282 | Demonstration of OpenDBML, a Framework for Democratizing In-Database Machine Learning | 2023 | VLDB | 4.1945683e-05 |
| 11,363 | Givens QR Decomposition over Relational Databases | 2022 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 24 of 24 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11 | Implementing Data Cubes Efficiently | 1996 | SIGMOD | 0.0011708144 |
| 60 | Efficiently Compiling Efficient Query Plans for Modern Hardware | 2011 | VLDB | 0.00064439773 |
| 140 | The MADlib Analytics Library or MAD Skills, the SQL | 2012 | VLDB | 0.00042270404 |
| 248 | Eager Aggregation and Lazy Aggregation | 1995 | VLDB | 0.00030785339 |
| 342 | EmptyHeaded: A Relational Engine for Graph Processing | 2016 | SIGMOD | 0.00026795977 |
| 557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988 |
| 583 | FAQ: Questions Asked Frequently | 2016 | PODS | 0.00019717214 |
| 658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577 |
| 659 | The Making of TPC-DS | 2006 | VLDB | 0.00018500853 |
| 834 | Learning Linear Regression Models over Factorized Joins | 2016 | SIGMOD | 0.00016135159 |
| 853 | Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask | 2018 | VLDB | 0.00015940507 |
| 903 | To Join or Not to Join? Thinking Twice about Joins before Feature Selection | 2016 | SIGMOD | 0.0001547016 |
| 962 | Maintenance of Data Cubes and Summary Tables in a Warehouse | 1997 | SIGMOD | 0.00014986226 |
| 1,167 | Learning Generalized Linear Models Over Normalized Data | 2015 | SIGMOD | 0.00013547713 |
| 1,259 | Aggregation and Ordering in Factorised Databases | 2013 | VLDB | 0.00012995821 |
| 1,279 | Towards Linear Algebra over Normalized Data | 2017 | VLDB | 0.00012868394 |
| 2,014 | Voodoo - A Vector Algebra for Portable Database Performance on Modern Hardware | 2016 | VLDB | 9.7904029e-05 |
| 2,383 | How to Architect a Query Compiler | 2016 | SIGMOD | 8.9294108e-05 |
| 2,838 | How to Architect a Query Compiler, Revisited | 2018 | SIGMOD | 8.0408472e-05 |
| 2,925 | Shared Workload Optimization | 2014 | VLDB | 7.888494e-05 |
| 3,082 | FDB: A Query Engine for Factorised Relational Databases | 2012 | VLDB | 7.6014248e-05 |
| 3,878 | Data Canopy: Accelerating Exploratory Statistical Analysis | 2017 | SIGMOD | 6.6731435e-05 |
| 4,505 | SPOOF: Sum-Product Optimization and Operator Fusion for Large-Scale Machine Learning | 2017 | CIDR | 6.1327108e-05 |
| 6,322 | The BUDS Language for Distributed Bayesian Machine Learning | 2017 | SIGMOD | 5.1124615e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,665 | Advancing Fact Attribution for Query Answering: Aggregate Queries and Novel Algorithms | 2025 | VLDB | 4.471975e-05 |
| 9,706 | Distributed Numerical and Machine Learning Computations via Two-Phase Execution of Aggregated Join Trees | 2021 | VLDB | 4.2992942e-05 |
| 9,587 | Low Rank Learning for Offline Query Optimization | 2025 | SIGMOD | 4.3215645e-05 |
| 9,222 | Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning | 2021 | VLDB | 4.3698672e-05 |
| 1,167 | Learning Generalized Linear Models Over Normalized Data | 2015 | SIGMOD | 0.00013547713 |
| 4,549 | Database-Agnostic Workload Management | 2019 | CIDR | 6.0926728e-05 |
| 6,330 | Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse | 2018 | VLDB | 5.1077416e-05 |
| 9,345 | LIMAO: A Framework for Lifelong Modular Learned Query Optimization | 2025 | VLDB | 4.3536343e-05 |
| 9,776 | Structure-Aware Machine Learning over Multi-Relational Databases | 2021 | SIGMOD | 4.2856106e-05 |
| 5,955 | LMFAO: An Engine for Batches of Group-By Aggregates | 2020 | VLDB | 5.2572882e-05 |