Database Paper Browser

Back to papers

ERACER: A Database Approach for Statistical Inference and Data Cleaning

Summary: ERACER is a framework that infers missing values and cleans errors via belief propagation on relational networks, implementable in SQL/UDFs. Uses shrinkage to cleanse dirty data and handles cyclic dependencies, achieving Bayesian accuracy with approximate inference on synthetic data. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID
4234
Venue
SIGMOD
Year
2010
Pagerank
0.00018588729
Overall Rank
656 | 95.44%
DOI
-

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 32 of 32 citing papers.

Rank Citing Paper Year Venue Pagerank
192 HoloClean: Holistic Data Repairs with Probabilistic Inference 2017 VLDB 0.00035728858
517 Can Foundation Models Wrangle Your Data? 2023 VLDB 0.00021169035
881 Don’t be SCAREd: Use SCalable Automatic REpairing with Maximal Likelihood and Bounded Changes 2013 SIGMOD 0.00015661103
1,012 NADEEF: A Commodity Data Cleaning System 2013 SIGMOD 0.0001464733
1,532 Data Management in Machine Learning: Challenges, Techniques, and Systems 2017 SIGMOD 0.00011472681
1,546 KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing 2015 SIGMOD 0.00011446851
1,627 Data Cleaning: Overview and Emerging Challenges 2016 SIGMOD 0.00011086905
2,018 Statistical Distortion: Consequences of Data Cleaning 2012 VLDB 9.7764643e-05
2,276 Mind the Gap: An Experimental Evaluation of Imputation of Missing Values Techniques in Time Series 2020 VLDB 9.1261944e-05
2,587 Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks 2024 SIGMOD 8.4924618e-05
2,823 Interaction between Record Matching and Data Repairing 2011 SIGMOD 8.0593894e-05
2,946 BigDansing: A System for Big Data Cleansing 2015 SIGMOD 7.8372441e-05
3,192 Towards Dependable Data Repairing with Fixing Rules 2014 SIGMOD 7.4095761e-05
3,299 SCODED: Statistical Constraint Oriented Data Error Detection 2020 SIGMOD 7.2546659e-05
4,102 GoodCore: Data-effective and Data-efficient Machine Learning through Coreset Selection over Incomplete Data 2023 SIGMOD 6.4522929e-05
4,332 Missing Value Imputation on Multidimensional Time Series 2021 VLDB 6.2805243e-05
5,002 Sequential Data Cleaning: A Statistical Approach 2016 SIGMOD 5.7671075e-05
5,215 Relational Approach for Shortest Path Discovery over Large Graphs 2012 VLDB 5.6228603e-05
5,253 Enriching Data Imputation with Extensive Similarity Neighbors 2015 VLDB 5.6014916e-05
5,729 KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing 2015 VLDB 5.3506368e-05
7,407 Intermittent Query Processing 2019 VLDB 4.7373205e-05
7,634 ReStore - Neural Data Completion for Relational Databases 2021 SIGMOD 4.6911382e-05
7,766 ICARUS: Minimizing Human Effort in Iterative Data Completion 2018 VLDB 4.6564959e-05
8,005 Online Topic-Aware Entity Resolution Over Incomplete Data Streams 2021 SIGMOD 4.6081461e-05
8,092 Saga: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications 2023 SIGMOD 4.587921e-05
8,593 Wisteria: Nurturing Scalable Data Cleaning Infrastructure 2015 VLDB 4.4891474e-05
9,301 Repairing Data through Regular Expressions 2016 VLDB 4.3587281e-05
9,849 Reptile: Aggregation-level Explanations for Hierarchical Data 2022 SIGMOD 4.2721228e-05
9,924 On Saving Outliers for Better Clustering over Noisy Data 2021 SIGMOD 4.2544238e-05
10,026 Minimum Change ≠ Best Cleaning: Parallel and Incremental Error Detection under Integrity Constraints 2026 SIGMOD 4.1945683e-05
11,050 Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data 2024 VLDB 4.1945683e-05
11,536 LOCATER: Cleaning WiFi Connectivity Datasets for Semantic Localization 2021 VLDB 4.1945683e-05
Previous Page 1 / 1 Next

Outgoing Citations (Sorted by Pagerank)

Showing 5 of 5 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Previous Page 1 / 1 Next

Semantically Similar Papers