Messing Up with BART: Error Generation for Evaluating Data-Cleaning Algorithms
Summary: BART generates errors in clean databases to benchmark data-cleaning algorithms, giving users precise control over the error-generation process while supporting large-scale testing. The error-generation problem is NP-complete; a scalable greedy algorithm sacrifices completeness and is aided by a symmetry property of data-quality constraints, exposing the trade-off between control and scalability. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 23 of 23 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 13 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412 |
| 1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794 |
| 2,946 | BigDansing: A System for Big Data Cleansing | 2015 | SIGMOD | 0.000078372441 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 3,396 | Automatic Data Repair: Are We Ready to Deploy? | 2024 | VLDB | 0.000071455126 |
| 623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374 |
| 2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 0.000080593894 |
| 5,660 | Descriptive and Prescriptive Data Cleaning | 2014 | SIGMOD | 0.000053847321 |
| 10,026 | Minimum Change ≠ Best Cleaning: Parallel and Incremental Error Detection under Integrity Constraints | 2026 | SIGMOD | 0.000041945683 |
| 11,841 | BART in Action: Error Generation and Empirical Evaluations of Data-Cleaning Systems | 2016 | SIGMOD | 0.000041945683 |