DQDF: Data-Quality-Aware Dataframes
Summary: DQDF embeds data-quality checks directly into Python dataframes, so quality-control (QC) state no longer has to be maintained separately. Automatic metadata-change detection and per-check context reuse speed up QC on evolving data, delivering 40–80% faster quality evaluation with less than 10% memory overhead. (summarized by gpt-5-nano on Feb 09 2026)
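To make the idea in the summary concrete, here is a minimal sketch in plain Python of caching a quality-check result alongside the data and re-running the check only when a column's metadata changes. It is not DQDF's API or implementation. `QualityFrame`, `set_column`, `check`, the `completeness` check, and the per-column version counter used as the change signal are all hypothetical names and assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[List[object]], float]


def completeness(values: List[object]) -> float:
    """Fraction of non-missing (non-None) values in a column."""
    return sum(v is not None for v in values) / len(values) if values else 1.0


class QualityFrame:
    """Hypothetical toy column store that caches check results per column."""

    def __init__(self, columns: Dict[str, List[object]]) -> None:
        self._columns = {name: list(vals) for name, vals in columns.items()}
        # Assumption: a per-column version counter stands in for "metadata".
        self._versions = {name: 0 for name in self._columns}
        # (check name, column) -> (version when computed, cached result)
        self._cache: Dict[Tuple[str, str], Tuple[int, float]] = {}
        self.evaluations = 0  # number of real (non-cached) check runs

    def set_column(self, name: str, values: List[object]) -> None:
        """Replace or add a column and bump its version."""
        self._columns[name] = list(values)
        self._versions[name] = self._versions.get(name, -1) + 1

    def check(self, check_name: str, fn: Check, column: str) -> float:
        """Return the check result, reusing the cache if the column is unchanged."""
        key = (check_name, column)
        version = self._versions[column]
        cached = self._cache.get(key)
        if cached is not None and cached[0] == version:
            return cached[1]
        result = fn(self._columns[column])
        self.evaluations += 1
        self._cache[key] = (version, result)
        return result


qf = QualityFrame({"age": [31, None, 45, 28], "city": ["Oslo", "Lima", None, None]})
first = qf.check("completeness", completeness, "age")   # computed
second = qf.check("completeness", completeness, "age")  # served from the cache
qf.set_column("age", [31, 40, 45, 28])                  # bumps the column version
third = qf.check("completeness", completeness, "age")   # recomputed after the change
```

In this sketch the second call does no work because the column version has not moved, while the call after `set_column` recomputes the check. A real system would need a change signal that also covers in-place mutation, which a simple counter like this one does not detect.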
Authors
1. Phanwadee Sinthong
2. Dhaval Patel
3. Nianjun Zhou
4. Shrey Shrivastava
5. Arun Iyengar
6. Anuradha Bhamidipaty
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,867 | T-Assess: An Efficient Data Quality Assessment System Tailored for Trajectory Data | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 3,491 | TensorFlow Data Validation: Data Analysis and Validation in Continuous ML Pipelines | 2020 | SIGMOD | 7.0451276e-05 |
| 3,535 | Scaling Spark in the Real World: Performance and Usability | 2015 | VLDB | 6.9992495e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,650 | Query-Driven Learning for Next Generation Predictive Modeling & Analytics | 2019 | SIGMOD | 4.1945683e-05 |
| 1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733 |
| 11,288 | To UDFs and Beyond: Demonstration of a Fully Decomposed Data Processor for General Data Wrangling Tasks | 2023 | VLDB | 4.1945683e-05 |
| 4,773 | PolyFrame: A Retargetable Query-based Approach to Scaling Dataframes | 2021 | VLDB | 5.9320139e-05 |
| 507 | Data Quality and Data Cleaning: An Overview | 2003 | SIGMOD | 0.00021473263 |
| 11,024 | SplitDF: Splitting Dataframes for Memory-Efficient Data Analysis | 2024 | VLDB | 4.1945683e-05 |
| 4,003 | Data Platform for Machine Learning | 2019 | SIGMOD | 6.54347e-05 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 1,427 | Towards Scalable Dataframe Systems | 2020 | VLDB | 0.0001204248 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |