Pattern Functional Dependencies for Data Cleaning
Summary: Proposes pattern functional dependencies (PFDs), which relate partial attribute values through regex-like patterns and go beyond traditional integrity constraints (ICs). The paper presents Armstrong-like axioms, analysis, and a scalable PFD discovery algorithm; experiments on 15 real datasets show that PFDs detect errors missed by ICs.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 12266
- Venue: VLDB
- Year: 2020
- Pagerank: 5.6375087e-05
- Overall Rank: 5,192 | 63.89%
- DOI: 10.14778/3377369.3377377
Incoming Citations (Sorted by Pagerank)
Showing 13 of 13 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| 2,349 | RPT: Relational Pre-trained Transformer Is Almost All You Need towards Democratizing Data Preparation | 2021 | VLDB | 8.9876423e-05 |
| 6,280 | Self-supervised and Interpretable Data Cleaning with Sequence Generative Adversarial Networks | 2023 | VLDB | 5.1290457e-05 |
| 7,202 | Conformance Constraint Discovery: Measuring Trust in Data-Driven Systems | 2021 | SIGMOD | 4.8023314e-05 |
| 8,743 | CtxPipe: Context-aware Data Preparation Pipeline Construction for Machine Learning | 2024 | SIGMOD | 4.456315e-05 |
| 9,348 | GIDCL: A Graph-Enhanced Interpretable Data Cleaning Framework with Large Language Models | 2024 | SIGMOD | 4.3526427e-05 |
| 9,389 | DataVinci: Learning Syntactic and Semantic String Repairs | 2025 | SIGMOD | 4.3441378e-05 |
| 9,749 | Efficient Differential Dependency Discovery | 2024 | VLDB | 4.2897489e-05 |
| 9,963 | Parallel Rule Discovery from Large Datasets by Sampling | 2022 | SIGMOD | 4.2294678e-05 |
| 10,216 | The Case For Language Model Approximated LIKE Predicate | 2026 | SIGMOD | 4.1945683e-05 |
| 11,010 | Mixed Covers of Keys and Functional Dependencies for Maintaining the Integrity of Data under Updates | 2024 | VLDB | 4.1945683e-05 |
| 11,054 | Enriching Relations with Additional Attributes for ER | 2024 | VLDB | 4.1945683e-05 |
| 11,070 | Efficient Validation of SHACL Shapes with Reasoning | 2024 | VLDB | 4.1945683e-05 |
| 11,223 | Splitting Tuples of Mismatched Entities | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 17 of 17 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| 214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692 |
| 555 | Discovering Denial Constraints | 2013 | VLDB | 0.00020254908 |
| 712 | Magellan: Toward Building Entity Matching Management Systems | 2016 | VLDB | 0.00017732426 |
| 1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733 |
| 1,047 | Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms | 2015 | VLDB | 0.00014459715 |
| 1,469 | BlinkFill: Semi-supervised Programming By Example for Syntactic String Transformations | 2016 | VLDB | 0.00011836053 |
| 1,612 | Detecting Data Errors: Where are we and what needs to be done? | 2016 | VLDB | 0.00011142794 |
| 1,625 | Data Profiling with Metanome | 2015 | VLDB | 0.00011094926 |
| 1,831 | Synthesizing Entity Matching Rules by Examples | 2018 | VLDB | 0.00010384082 |
| 2,038 | The return of JedAI: End-to-End Entity Resolution for Structured and Semi-Structured Data | 2018 | VLDB | 9.7098952e-05 |
| 2,506 | Auto-Detect: Data-Driven Error Detection in Tables | 2018 | SIGMOD | 8.6335464e-05 |
| 2,574 | Discovery of Genuine Functional Dependencies from Relational Data with Missing Values | 2018 | VLDB | 8.5173637e-05 |
| 3,192 | Towards Dependable Data Repairing with Fixing Rules | 2014 | SIGMOD | 7.4095761e-05 |
| 4,744 | Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization | 2017 | VLDB | 5.957936e-05 |
| 4,904 | Temporal Rules Discovery for Web Data Cleaning | 2016 | VLDB | 5.8399195e-05 |
| 5,205 | ANMAT: Automatic Knowledge Discovery and Error Detection through Pattern Functional Dependencies | 2019 | SIGMOD | 5.630869e-05 |
| 6,111 | Why Big Data Industrial Systems Need Rules and What We Can Do About It | 2015 | SIGMOD | 5.2049579e-05 |
Semantically Similar Papers