Database Paper Browser

Parallel Discrepancy Detection and Incremental Detection

Summary: Detects discrepancies (duplicates, mismatches, and conflicts) in parallel using ML-embedded rules, unifying entity resolution (ER) and conflict handling. The problem is NP-hard and W[1]-hard; parallel and bounded incremental algorithms provide scalability, and the approach is validated on real and synthetic data. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12324
Venue: VLDB
Year: 2021
Pagerank: 4.9621556e-05
Overall Rank: 6,690 | 53.46%
DOI: 10.14778/3457390.3457400

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 15 of 15 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 34 of 34 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
49 | Consistent Query Answers in Inconsistent Databases | 1999 | PODS | 0.00067660624
192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858
229 | Reference Reconciliation in Complex Information Spaces | 2005 | SIGMOD | 0.00032242633
280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044
300 | Deep Learning for Entity Matching: A Design Space Exploration | 2018 | SIGMOD | 0.00028441466
319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866
502 | Worst-case Optimal Join Algorithms | 2012 | PODS | 0.00021526612
509 | On Active Learning of Record Matching Packages | 2010 | SIGMOD | 0.00021409518
623 | Improving Data Quality: Consistency and Accuracy | 2007 | VLDB | 0.00018996374
702 | Reasoning about Record Matching Rules | 2009 | VLDB | 0.00017918203
754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211
1,159 | Towards Certain Fixes with Editing Rules and Master Data | 2010 | VLDB | 0.00013592813
1,188 | On Generating Near-Optimal Tableaux for Conditional Functional Dependencies | 2008 | VLDB | 0.00013441729
1,337 | HoloDetect: Few-Shot Learning for Error Detection | 2019 | SIGMOD | 0.00012497164
1,411 | Communication Steps for Parallel Query Processing | 2013 | PODS | 0.0001212565
2,175 | Falcon: Scaling Up Hands-Off Crowdsourced Entity Matching to Build Cloud Services | 2017 | SIGMOD | 9.3644117e-05
2,212 | Skew in Parallel Query Processing | 2014 | PODS | 9.2771827e-05
2,231 | Dedoop: Efficient Deduplication with Hadoop | 2012 | VLDB | 9.2304499e-05
2,450 | Functional Dependencies for Graphs | 2016 | SIGMOD | 8.7882979e-05
2,823 | Interaction between Record Matching and Data Repairing | 2011 | SIGMOD | 8.0593894e-05
2,946 | BigDansing: A System for Big Data Cleansing | 2015 | SIGMOD | 7.8372441e-05
2,968 | Raha: A Configuration-Free Error Detection System | 2019 | SIGMOD | 7.7985097e-05
3,141 | ClusterJoin: A Similarity Joins Framework using Map-Reduce | 2014 | VLDB | 7.4829448e-05
3,299 | SCODED: Statistical Constraint Oriented Data Error Detection | 2020 | SIGMOD | 7.2546659e-05
3,394 | Incremental Graph Computations: Doable and Undoable | 2017 | SIGMOD | 7.1480446e-05
3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05
3,645 | Large-Scale Collective Entity Matching | 2011 | VLDB | 6.8853274e-05
3,694 | Keys for Graphs | 2015 | VLDB | 6.8345712e-05
3,773 | Cleaning Crowdsourced Labels Using Oracles for Statistical Classification | 2019 | VLDB | 6.7758649e-05
3,977 | BLAST: a Loosely Schema-aware Meta-blocking Approach for Entity Resolution | 2016 | VLDB | 6.5736268e-05
6,703 | Discovering Graph Functional Dependencies | 2018 | SIGMOD | 4.9555163e-05
6,810 | Record Linkage with Uniqueness Constraints and Erroneous Values | 2010 | VLDB | 4.9203397e-05
7,559 | Strongly Truthful Interactive Regret Minimization | 2019 | SIGMOD | 4.7107487e-05
9,564 | Catching Numeric Inconsistencies in Graphs | 2018 | SIGMOD | 4.3254416e-05

Semantically Similar Papers