Data Auditor: Exploring Data Quality and Semantics using Pattern Tableaux
Summary: Data Auditor uses pattern tableaux to summarize the subsets of a relation that mostly satisfy or mostly fail a constraint. The paper describes the system's architecture and user interface, the constraints it supports, and its tuning heuristics, and demonstrates it on real data for exploring data quality and semantics. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Lukasz Golab
2. Howard Karloff
3. Flip Korn
4. Divesh Srivastava
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,158 | Uni-Detect: A Unified Approach to Automated Error Detection in Tables | 2019 | SIGMOD | 9.4141354e-05 |
| 3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05 |
| 3,467 | Data Profiling – A Tutorial | 2017 | SIGMOD | 7.069081e-05 |
| 5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05 |
| 5,445 | QFix: Diagnosing Errors through Query Histories | 2017 | SIGMOD | 5.5020909e-05 |
| 6,280 | Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks | 2023 | VLDB | 5.1290457e-05 |
| 6,475 | Explain3D: Explaining Disagreements in Disjoint Datasets | 2019 | VLDB | 5.0497183e-05 |
| 6,817 | Error Diagnosis and Data Profiling with Data X-Ray | 2015 | VLDB | 4.9171711e-05 |
| 7,838 | Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes | 2021 | SIGMOD | 4.6377995e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036 |
| 199 | Declarative Data Cleaning: Language, Model, and Algorithms | 2001 | VLDB | 0.00035041015 |
| 475 | Mining Database Structure; Or, How to Build a Data Quality Browser | 2002 | SIGMOD | 0.00022303253 |
| 1,188 | On Generating Near-Optimal Tableaux for Conditional Functional Dependencies | 2008 | VLDB | 0.00013441729 |
| 1,401 | Extending Dependencies with Conditions | 2007 | VLDB | 0.00012187775 |
| 2,159 | Sequential Dependencies | 2009 | VLDB | 9.4130956e-05 |
| 5,803 | Semandaq: A Data Quality System Based on Conditional Functional Dependencies | 2008 | VLDB | 5.3205861e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,280 | CM-Explorer: Dissecting Data Ingestion Problems | 2023 | VLDB | 4.1945683e-05 |
| 6,510 | Efficient Auditing For Complex SQL queries | 2011 | SIGMOD | 5.0326078e-05 |
| 10,019 | Guardrail: Automated Integrity Constraint Synthesis From Noisy Data | 2026 | SIGMOD | 4.1945683e-05 |
| 8,475 | DataProf: Semantic Profiling for Iterative Data Cleansing and Business Rule Acquisition | 2018 | SIGMOD | 4.5028904e-05 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374 |
| 10,512 | Auto-Test: Learning Semantic-Domain Constraints for Unsupervised Error Detection in Tables | 2025 | SIGMOD | 4.1945683e-05 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 12,624 | Systematic Development of Data Mining-Based Data Quality Tools | 2003 | VLDB | 4.1945683e-05 |
| 11,579 | AUDITOR: A System Designed for Automatic Discovery of Complex Integrity Constraints in Relational Databases | 2020 | SIGMOD | 4.1945683e-05 |