Flexible Rule-Based Decomposition and Metadata Independence in Modin: A Parallel Dataframe System
Summary: Modin translates pandas functions into a small set of core operators and parallelizes them using column-wise, row-wise, and cell-wise decomposition rules. Metadata independence decouples row/column order and types from the physical layout, which enables lazy metadata maintenance and supports pandas operations that mix row- and column-oriented operators on large dataframes. Modin is reported to outperform pandas and Koalas/Dask. (summarized by gpt-5-nano on Feb 09 2026)
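The decomposition idea in the summary can be illustrated with a minimal pure-Python sketch. It is not Modin's API or implementation: the helper names (`split_rows`, `split_columns`, `cellwise_map`, `rowwise_reduce`, `columnwise_reduce`, `LazyOrderFrame`) are hypothetical, the "dataframe" is just a list of rows, and a thread pool stands in for a real parallel execution engine. The sketch assumes that a cell-wise operator can run on any partitioning, a row-wise operator needs whole rows, and a column-wise operator needs whole columns. `LazyOrderFrame` shows one way logical order could be kept as metadata apart from the stored rows.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, n_parts):
    """Row-wise decomposition: contiguous bands of whole rows."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def split_columns(rows, n_parts):
    """Column-wise decomposition: groups of whole columns."""
    columns = [list(col) for col in zip(*rows)]
    size = max(1, -(-len(columns) // n_parts))
    return [columns[i:i + size] for i in range(0, len(columns), size)]


def cellwise_map(rows, fn, n_parts=2):
    """Cell-wise operator: any partitioning works; row bands are used here."""
    parts = split_rows(rows, n_parts)
    with ThreadPoolExecutor() as pool:
        mapped = pool.map(lambda part: [[fn(v) for v in row] for row in part], parts)
        return [row for part in mapped for row in part]


def rowwise_reduce(rows, fn, n_parts=2):
    """Row-wise operator: each partition holds whole rows; one result per row."""
    parts = split_rows(rows, n_parts)
    with ThreadPoolExecutor() as pool:
        reduced = pool.map(lambda part: [fn(row) for row in part], parts)
        return [v for part in reduced for v in part]


def columnwise_reduce(rows, fn, n_parts=2):
    """Column-wise operator: each partition holds whole columns; one result per column."""
    parts = split_columns(rows, n_parts)
    with ThreadPoolExecutor() as pool:
        reduced = pool.map(lambda part: [fn(col) for col in part], parts)
        return [v for part in reduced for v in part]


class LazyOrderFrame:
    """Hypothetical illustration: logical row order kept as metadata only."""

    def __init__(self, rows):
        self.rows = rows                     # physical layout, never reordered
        self.order = list(range(len(rows)))  # logical order metadata

    def sort_by(self, col):
        # Only the order metadata changes; stored rows stay in place.
        self.order = sorted(self.order, key=lambda i: self.rows[i][col])
        return self

    def materialize(self):
        # Physical reordering is deferred until the rows are actually needed.
        return [self.rows[i] for i in self.order]


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
doubled = cellwise_map(data, lambda v: v * 2)
row_sums = rowwise_reduce(data, sum)
col_maxes = columnwise_reduce(data, max)

frame = LazyOrderFrame([[3, "c"], [1, "a"], [2, "b"]]).sort_by(0)
sorted_rows = frame.materialize()
```

Because `pool.map` returns results in submission order, concatenating the per-partition outputs reproduces the original row or column order without extra bookkeeping.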
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,307 | A Critique of Modern SQL And A Proposal Towards A Simple and Expressive Query Language | 2024 | CIDR | 5.5766594e-05 |
| 6,895 | Decentralized Actor Scheduling and Reference-based Storage in Xorbits: a Native Scalable Data Science Engine | 2025 | VLDB | 4.8925595e-05 |
| 8,514 | UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads | 2022 | VLDB | 4.4944285e-05 |
| 9,762 | QURE: AI-Assisted and Automatically Verified UDF Inlining | 2025 | SIGMOD | 4.2856106e-05 |
| 9,911 | Dias: Dynamic Rewriting of Pandas Code | 2024 | SIGMOD | 4.2565279e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,911 | Dias: Dynamic Rewriting of Pandas Code | 2024 | SIGMOD | 4.2565279e-05 |
| 2,848 | Exploiting Matrix Dependency for Efficient Distributed Matrix Computation | 2015 | SIGMOD | 8.0208832e-05 |
| 4,813 | Putting Pandas in a Box | 2021 | CIDR | 5.9049746e-05 |
| 8,078 | Meta-Dataflows: Efficient Exploratory Dataflow Jobs | 2018 | SIGMOD | 4.5914967e-05 |
| 1,750 | Weld: A Common Runtime for High Performance Data Analytics | 2017 | CIDR | 1.0683647e-04 |
| 8,915 | DQDF: Data-Quality-Aware Dataframes | 2022 | VLDB | 4.427232e-05 |
| 11,024 | SplitDF: Splitting Dataframes for Memory-Efficient Data Analysis | 2024 | VLDB | 4.1945683e-05 |
| 4,773 | PolyFrame: A Retargetable Query-based Approach to Scaling Dataframes | 2021 | VLDB | 5.9320139e-05 |
| 8,094 | Modularis: Modular Relational Analytics over Heterogeneous Distributed Platforms | 2021 | VLDB | 4.5867812e-05 |
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 1.204248e-04 |