PolyFrame: A Retargetable Query-based Approach to Scaling Dataframes
Summary: Retargets AFrame from AsterixDB to a backend-agnostic, query-based layer for scalable DataFrame analytics. Introduces PolyFrame, which preserves the Pandas API while incrementally building queries in whichever composable query language the target DBMS backend supports. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05 |
| 6,895 | Decentralized Actor Scheduling and Reference-based Storage in Xorbits: a Native Scalable Data Science Engine | 2025 | VLDB | 4.8925595e-05 |
| 9,911 | Dias: Dynamic Rewriting of Pandas Code | 2024 | SIGMOD | 4.2565279e-05 |
| 11,024 | SplitDF: Splitting Dataframes for Memory-Efficient Data Analysis | 2024 | VLDB | 4.1945683e-05 |
| 11,448 | Wisconsin Benchmark Data Generator: To JSON and Beyond | 2021 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 6 of 6 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248 |
| 1,438 | AsterixDB: A Scalable, Open Source BDMS | 2014 | VLDB | 0.00011973592 |
| 2,954 | Magpie: Python at Speed and Scale using Cloud Backends | 2021 | CIDR | 7.8262582e-05 |
| 3,535 | Scaling Spark in the Real World: Performance and Usability | 2015 | VLDB | 6.9992495e-05 |
| 4,813 | Putting Pandas in a Box | 2021 | CIDR | 5.9049746e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,416 | When sweet and cute isn't enough anymore: Solving scalability issues in Python Pandas with Grizzly | 2020 | CIDR | 4.3441378e-05 |
| 4,813 | Putting Pandas in a Box | 2021 | CIDR | 5.9049746e-05 |
| 2,954 | Magpie: Python at Speed and Scale using Cloud Backends | 2021 | CIDR | 7.8262582e-05 |
| 7,413 | On Scale Independence for Querying Big Data | 2014 | PODS | 4.7358047e-05 |
| 6,541 | ConnectorX: Accelerating Data Loading From Databases to Dataframes | 2022 | VLDB | 5.0216945e-05 |
| 3,763 | Flexible Rule-Based Decomposition and Metadata Independence in Modin: A Parallel Dataframe System | 2022 | VLDB | 6.7801795e-05 |
| 8,915 | DQDF: Data-Quality-Aware Dataframes | 2022 | VLDB | 4.427232e-05 |
| 9,361 | An IDEA: An Ingestion Framework for Data Enrichment in AsterixDB | 2019 | VLDB | 4.3506168e-05 |
| 7,794 | Large-scale Complex Analytics on Semi-structured Datasets using AsterixDB and Spark | 2016 | VLDB | 4.6482977e-05 |
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248 |