Database Paper Browser

Learning Over Dirty Data Without Cleaning

Summary: DLearn learns directly from dirty data, without a prior cleaning step, and so avoids the data-repair bottleneck. It uses database constraints to infer relational models that summarize the patterns holding across all plausible clean versions of the data. An empirical evaluation on large real-world datasets indicates that it learns accurately and efficiently. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 5923
Venue: SIGMOD
Year: 2020
Pagerank: 4.6320452e-05
Overall Rank: 7,867 | 45.28%
DOI: 10.1145/3318464.3389708

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 20 of 20 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624
67 | The Merge/Purge Problem for Large Databases | 1995 | SIGMOD | 0.00061348205
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
199 | Declarative Data Cleaning: Language, Model, and Algorithms | 2001 | VLDB | 0.00035041015
265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412
560 | Dependencies Revisited for Improving Data Quality | 2008 | PODS | 0.00020141923
623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374
702 | Reasoning about Record Matching Rules | 2009 | VLDB | 0.00017918203
791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664
833 | Guided Data Repair | 2011 | VLDB | 0.00016138432
1,188 | On Generating Near-Optimal Tableaux for Conditional Functional Dependencies | 2008 | VLDB | 0.00013441729
1,459 | Query From Examples: An Iterative, Data-Driven Approach to Query Construction | 2015 | VLDB | 0.00011889802
1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905
2,750 | Learning and Verifying Quantified Boolean Queries by Example | 2013 | PODS | 8.176296e-05
2,982 | FastQRE: Fast Query Reverse Engineering | 2018 | SIGMOD | 7.7801984e-05
3,345 | QuickFOIL: Scalable Inductive Logic Programming | 2015 | VLDB | 7.1958815e-05
5,235 | Industry-Scale Duplicate Detection | 2008 | VLDB | 5.6115647e-05
6,042 | MDedup: Duplicate Detection with Matching Dependencies | 2020 | VLDB | 5.2405269e-05
7,664 | Schema Independent Relational Learning | 2017 | SIGMOD | 4.6857329e-05
8,637 | Machine Learning for Data Management: Problems and Solutions | 2018 | SIGMOD | 4.479892e-05

Semantically Similar Papers