Uni-Detect: A Unified Approach to Automated Error Detection in Tables
Summary: Uni-Detect is a unified approach to error detection in tables that uses what-if perturbations and cross-table hypothesis tests, with no per-dataset tuning required. It detects functional dependency (FD) violations, numeric outliers, and spelling mistakes, and it outperforms specialized methods.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 5667
- Venue: SIGMOD
- Year: 2019
- Pagerank: 9.4141354e-05
- Overall Rank: 2,158 | 84.99%
- DOI: 10.1145/3299869.3319855
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 16 of 16 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,894 | Baran: Effective Error Correction via a Unified Context Representation and Transfer Learning | 2020 | VLDB | 0.0001018378 |
| 2,587 | Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks | 2024 | SIGMOD | 8.4924618e-05 |
| 3,252 | Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks | 2020 | SIGMOD | 7.3178277e-05 |
| 3,299 | SCODED: Statistical Constraint Oriented Data Error Detection | 2020 | SIGMOD | 7.2546659e-05 |
| 3,396 | Automatic Data Repair: Are We Ready to Deploy? | 2024 | VLDB | 7.1455126e-05 |
| 4,110 | Learning to Validate the Predictions of Black Box Classifiers on Unseen Data | 2020 | SIGMOD | 6.4428544e-05 |
| 5,096 | Auto-Transform: Learning-to-Transform by Patterns | 2020 | VLDB | 5.7011825e-05 |
| 6,187 | Semi-Supervised Data Cleaning with Raha and Baran | 2021 | CIDR | 5.1656857e-05 |
| 7,391 | Time Series Data Validity | 2023 | SIGMOD | 4.7429293e-05 |
| 7,838 | Auto-Validate: Unsupervised Data Validation Using Data-Domain Patterns Inferred from Data Lakes | 2021 | SIGMOD | 4.6377995e-05 |
| 9,348 | GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models | 2024 | SIGMOD | 4.3526427e-05 |
| 9,389 | DataVinci: Learning Syntactic and Semantic String Repairs | 2025 | SIGMOD | 4.3441378e-05 |
| 10,019 | Guardrail: Automated Integrity Constraint Synthesis From Noisy Data | 2026 | SIGMOD | 4.1945683e-05 |
| 10,598 | Auto-Prep: Holistic Prediction of Data Preparation Steps for Self-Service Business Intelligence | 2025 | VLDB | 4.1945683e-05 |
| 10,821 | Demonstrating Matelda for Multi-Table Error Detection | 2025 | VLDB | 4.1945683e-05 |
| 11,504 | LES3: Learning-based Exact Set Similarity Search | 2021 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 27 of 27 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 107 | WebTables: Exploring the Power of Tables on the Web | 2008 | VLDB | 0.00048377684 |
| 112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036 |
| 161 | LOF: Identifying Density-Based Local Outliers | 2000 | SIGMOD | 0.00039846974 |
| 192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858 |
| 224 | CORDS: Automatic Discovery of Correlations and Soft Functional Dependencies | 2004 | SIGMOD | 0.00032746205 |
| 265 | A Cost-Based Model and Effective Heuristic for Repairing Constraints by Value Modification | 2005 | SIGMOD | 0.00029763412 |
| 475 | Mining Database Structure; Or, How to Build a Data Quality Browser | 2002 | SIGMOD | 0.00022303253 |
| 555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908 |
| 623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374 |
| 702 | Reasoning about Record Matching Rules | 2009 | VLDB | 0.00017918203 |
| 732 | Discovering Data Quality Rules | 2008 | VLDB | 0.00017465093 |
| 774 | Algorithms for Mining Distance-Based Outliers in Large Datasets | 1998 | VLDB | 0.00016865771 |
| 881 | Don’t be SCAREd: Use SCalable Automatic REpairing with Maximal Likelihood and Bounded Changes | 2013 | SIGMOD | 0.00015661103 |
| 1,469 | BlinkFill: Semi-supervised Programming By Example for Syntactic String Transformations | 2016 | VLDB | 0.00011836053 |
| 1,482 | Automating Large-Scale Data Quality Verification | 2018 | VLDB | 0.00011725533 |
| 1,546 | KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing | 2015 | SIGMOD | 0.00011446851 |
| 1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794 |
| 1,625 | Data Profiling with Metanome | 2015 | VLDB | 0.00011094926 |
| 2,506 | Auto-Detect: Data-Driven Error Detection in Tables | 2018 | SIGMOD | 8.6335464e-05 |
| 2,574 | Discovery of Genuine Functional Dependencies from Relational Data with Missing Values | 2018 | VLDB | 8.5173637e-05 |
| 2,734 | Controlling False Discoveries During Interactive Data Exploration | 2017 | SIGMOD | 8.2078306e-05 |
| 3,478 | Transform-Data-by-Example (TDE): An Extensible Search Engine for Data Transformations | 2018 | VLDB | 7.054159e-05 |
| 3,735 | Auto-Join: Joining Tables by Leveraging Transformations | 2017 | VLDB | 6.8061318e-05 |
| 3,742 | TEGRA: Table Extraction by Global Record Alignment | 2015 | SIGMOD | 6.7966898e-05 |
| 4,850 | SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora | 2015 | VLDB | 5.8768452e-05 |
| 4,929 | Data Auditor: Exploring Data Quality and Semantics using Pattern Tableaux | 2010 | VLDB | 5.8217296e-05 |
| 6,416 | Synthesizing Type-Detection Logic for Rich Semantic Data Types using Open-source Code | 2018 | SIGMOD | 5.072267e-05 |
Semantically Similar Papers