Database Paper Browser


Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks

Summary: Garf is a SeqGAN-based, self-supervised framework that extracts interpretable conditional repair rules (e.g., city→county) directly from noisy tables. It combines a generator with two discriminators: D learns dependencies in the data, and D' iteratively refines the rules and the data. The result is interpretable, high-accuracy cleaning without labeled data. (summarized by gpt-5-mini on Feb 09 2026)
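To make the idea of a conditional repair rule such as city→county concrete, here is a minimal Python sketch. It is not Garf's method: a simple majority vote stands in for the SeqGAN generator and discriminators, and the function names (`mine_rules`, `repair`), the thresholds, and the sample rows are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def mine_rules(rows, lhs, rhs, min_support=2, min_confidence=0.6):
    """Derive rules of the form lhs_value -> rhs_value by majority vote.

    Assumption: majority voting replaces the learned generator here;
    it only illustrates what a rule like city -> county looks like.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[lhs]][row[rhs]] += 1

    rules = {}
    for key, counter in counts.items():
        value, freq = counter.most_common(1)[0]
        total = sum(counter.values())
        # Keep a rule only if it is seen often enough and is dominant enough.
        if total >= min_support and freq / total >= min_confidence:
            rules[key] = value
    return rules


def repair(rows, lhs, rhs, rules):
    """Return copies of the rows with rhs overwritten wherever a rule disagrees."""
    repaired = []
    for row in rows:
        fixed = dict(row)  # leave the input rows untouched
        expected = rules.get(row[lhs])
        if expected is not None and row[rhs] != expected:
            fixed[rhs] = expected
        repaired.append(fixed)
    return repaired


# Hypothetical noisy table: the third row has a misspelled county.
rows = [
    {"city": "Madison", "county": "Dane"},
    {"city": "Madison", "county": "Dane"},
    {"city": "Madison", "county": "Dan"},
    {"city": "Austin", "county": "Travis"},
]

rules = mine_rules(rows, "city", "county")
cleaned = repair(rows, "city", "county", rules)
```

Each rule is a plain key-value pair, so every repair can be traced back to the rule that triggered it, which is the sense of "interpretable" the summary refers to. Cities seen fewer than `min_support` times produce no rule and are left unchanged.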

Paper ID: 13300
Venue: VLDB
Year: 2023
Pagerank: 5.1290457e-05
Overall Rank: 6,280 | 56.32%
DOI: 10.14778/3570690.3570694

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 6 of 6 citing papers.


Outgoing Citations (Sorted by Pagerank)

Showing 25 of 25 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412
513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342
555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908
623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374
833 | Guided Data Repair | 2011 | VLDB | 0.00016138432
881 | Don’t be SCAREd: Use SCalable Automatic REpairing with Maximal Likelihood and Bounded Changes | 2013 | SIGMOD | 0.00015661103
1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733
1,047 | Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms | 2015 | VLDB | 0.00014459715
1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813
1,277 | The Data Civilizer System | 2017 | CIDR | 0.00012879695
1,401 | Extending Dependencies with Conditions | 2007 | VLDB | 0.00012187775
1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794
1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378
2,349 | RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation | 2021 | VLDB | 8.9876423e-05
2,460 | Combining Quantitative and Logical Data Cleaning | 2016 | VLDB | 8.7617484e-05
2,968 | Raha: A Configuration-Free Error Detection System | 2019 | SIGMOD | 7.7985097e-05
3,192 | Towards Dependable Data Repairing with Fixing Rules | 2014 | SIGMOD | 7.4095761e-05
4,929 | Data Auditor: Exploring Data Quality and Semantics using Pattern Tableaux | 2010 | VLDB | 5.8217296e-05
5,153 | Horizon: Scalable Dependency-driven Data Cleaning | 2021 | VLDB | 5.6607963e-05
5,192 | Pattern Functional Dependencies for Data Cleaning | 2020 | VLDB | 5.6375087e-05
5,618 | Explaining Repaired Data with CFDs | 2018 | VLDB | 5.4079415e-05
5,729 | KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing | 2015 | VLDB | 5.3506368e-05
5,852 | Repairing Vertex Labels under Neighborhood Constraints | 2014 | VLDB | 5.3007132e-05
6,350 | NADEEF: A Generalized Data Cleaning System | 2013 | VLDB | 5.101815e-05

Semantically Similar Papers