When sweet and cute isn't enough anymore: Solving scalability issues in Python Pandas with Grizzly
Summary: Grizzly addresses Pandas' scalability limits by compiling Pandas DataFrame pipelines into SQL/SparkSQL and executing them in a DBMS, so the work runs on the database's optimized storage and query engine. It retains a Pandas-friendly API while substantially reducing memory and CPU overhead. (summarized by gpt-5-mini on Feb 09 2026)
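The idea in the summary, recording DataFrame-style operations lazily and shipping them to a database as SQL, can be illustrated with a minimal sketch. This is not Grizzly's actual API: the `LazyFrame` class, its `filter`, `select`, `to_sql`, and `collect` methods, and the `events` table are all hypothetical names chosen for illustration, and SQLite stands in for whatever DBMS a real system would target.

```python
import sqlite3


class LazyFrame:
    """Hypothetical lazy frame: records operations and emits SQL on demand.

    Illustrative only. Predicates are interpolated as raw strings, so this
    sketch must not be used with untrusted input.
    """

    def __init__(self, table, columns="*", predicates=()):
        self.table = table
        self.columns = columns
        self.predicates = tuple(predicates)

    def filter(self, predicate):
        # No data is touched here; the predicate is only recorded.
        return LazyFrame(self.table, self.columns, self.predicates + (predicate,))

    def select(self, *columns):
        # Projection is recorded the same way.
        return LazyFrame(self.table, ", ".join(columns), self.predicates)

    def to_sql(self):
        # Translate the recorded pipeline into a single SQL statement.
        sql = f"SELECT {self.columns} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql

    def collect(self, conn):
        # Only now does the database execute the query and return rows.
        return conn.execute(self.to_sql()).fetchall()


# Assumed toy data: an in-memory SQLite table standing in for a real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (actor TEXT, goldstein REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a", 1.5), ("b", -2.0), ("c", 4.0)],
)

query = LazyFrame("events").filter("goldstein > 0").select("actor")
sql_text = query.to_sql()
rows = query.collect(conn)
print(sql_text)
print(rows)
```

In this sketch the client holds only the query description until `collect` is called, so filtering and projection happen inside the database rather than in Python memory. That is the general mechanism the summary attributes to Grizzly, shown here under the stated assumptions.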
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,813 | Putting Pandas in a Box | 2021 | CIDR | 5.9049746e-05 |
| 11,024 | SplitDF: Splitting Dataframes for Memory-Efficient Data Analysis | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,922 | Selecting Subexpressions to Materialize at Datacenter Scale | 2018 | VLDB | 0.00010082599 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,782 | The Best of Both Worlds: Big Data Programming with Both Productivity and Performance | 2017 | SIGMOD | 4.1945683e-05 |
| 4,238 | Panda: Performance Debugging for Databases using LLM Agents | 2024 | CIDR | 6.331901e-05 |
| 11,123 | PD-Explain: A Unified Python-native Framework for Query Explanations Over DataFrames | 2024 | VLDB | 4.1945683e-05 |
| 3,763 | Flexible Rule-Based Decomposition and Metadata Independence in Modin: A Parallel Dataframe System | 2022 | VLDB | 6.7801795e-05 |
| 6,189 | Accelerating Python UDFs in Vectorized Query Execution | 2022 | CIDR | 5.1647573e-05 |
| 6,648 | Grizzly: Efficient Stream Processing Through Adaptive Query Compilation | 2020 | SIGMOD | 4.9771723e-05 |
| 4,773 | PolyFrame: A Retargetable Query-based Approach to Scaling Dataframes | 2021 | VLDB | 5.9320139e-05 |
| 2,954 | Magpie: Python at Speed and Scale using Cloud Backends | 2021 | CIDR | 7.8262582e-05 |
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248 |
| 4,813 | Putting Pandas in a Box | 2021 | CIDR | 5.9049746e-05 |