Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine
Summary: A Rust-based, embeddable analytic query engine built on Apache Arrow, with a fast, modular design. Its open architecture exposes 10+ extension APIs, and it delivers performance competitive with DuckDB while enabling bespoke data infrastructures across databases, ML pipelines, and OLAP workloads.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 6776
- Venue: SIGMOD
- Year: 2024
- Pagerank: 5.1051018e-05
- Overall Rank: 6,340 | 55.90%
- DOI: 10.1145/3626246.3653368
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,034 | Instance-Optimal Acyclic Join Processing Without Regret: Engineering the Yannakakis Algorithm in Column Stores | 2025 | VLDB | 4.6010599e-05 |
| 8,118 | Maximus: A Modular Accelerated Query Engine for Data Analytics on Heterogeneous Systems | 2025 | SIGMOD | 4.5814829e-05 |
| 9,901 | AnyBlox: A Framework for Self-Decoding Datasets | 2025 | VLDB | 4.258022e-05 |
| 10,121 | TQEx: Tensor-based Query Engine Enhanced by Bridging the Gap | 2026 | SIGMOD | 4.1945683e-05 |
| 10,248 | Active Data Lakes: Regaining Physical Data Independence Without Losing Interoperability | 2026 | VLDB | 4.1945683e-05 |
| 10,372 | Data Chunk Compaction in Vectorized Execution | 2025 | SIGMOD | 4.1945683e-05 |
| 10,404 | Dynamic Pruning for Recursive Joins | 2025 | SIGMOD | 4.1945683e-05 |
| 10,497 | PilotDB: Database-Agnostic Online Approximate Query Processing with A Priori Error Guarantees | 2025 | SIGMOD | 4.1945683e-05 |
| 10,714 | Towards Designing Future-Proof Data Processing Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,756 | Selective Late Materialization in Modern Analytical Databases | 2025 | VLDB | 4.1945683e-05 |
| 10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05 |
| 10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 17 of 17 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,059 | Adaptive and Robust Query Execution for Lakehouses at Scale | 2024 | VLDB | 4.8477825e-05 |
| 544 | Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources | 2018 | SIGMOD | 0.00020521965 |
| 1,750 | Weld: A Common Runtime for High Performance Data Analytics | 2017 | CIDR | 0.00010683647 |
| 10,491 | Intra-Query Runtime Elasticity for Cloud-Native Data Analysis | 2025 | SIGMOD | 4.1945683e-05 |
| 7,306 | DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines | 2022 | CIDR | 4.7678574e-05 |
| 4,495 | ClickHouse - Lightning Fast Analytics for Everyone | 2024 | VLDB | 6.1410277e-05 |
| 1,864 | Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last | 2018 | VLDB | 0.00010280966 |
| 5,338 | Fast In-Memory SQL Analytics on Typed Graphs | 2017 | VLDB | 5.5629772e-05 |
| 7,328 | BOSS - An Architecture for Database Kernel Composition | 2024 | VLDB | 4.7610909e-05 |
| 5,562 | A Deep Dive into Common Open Formats for Analytical DBMSs | 2023 | VLDB | 5.4331334e-05 |