Database Paper Browser

PIClean: A Probabilistic and Interactive Data Cleaning System

Summary: PIClean is a probabilistic, interactive data cleaning system that uses low-rank approximation to uncover cross-column relationships for joint error detection and repair. User feedback confirms or rejects probabilistic fixes, continually updating models to improve accuracy and coverage. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5722
Venue: SIGMOD
Year: 2019
Pagerank: 4.7093702e-05
Overall Rank: 7,564 | 47.38%
DOI: 10.1145/3299869.3320214

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 3 of 3 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
6,187 | Semi-Supervised Data Cleaning with Raha and Baran | 2021 | CIDR | 5.1656857e-05
7,223 | Akane: Perplexity-Guided Time Series Data Cleaning | 2024 | SIGMOD | 4.7965857e-05
10,211 | SHoTClean: Bridging Soft and Hard Constraints for Multivariate Time Series Cleaning | 2026 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093
1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905

Semantically Similar Papers