Selecting Data to Clean for Fact Checking: Minimizing Uncertainty vs. Maximizing Surprise
Summary: The paper addresses how to choose which data to clean for fact-checking, optimizing one of two objectives: minimizing uncertainty in claim metrics or maximizing the probability of a counterargument. It gives efficient algorithms that handle non-linear objectives, extends the results to broad classes of functions, and notes the risk of bias in selective cleaning. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Stavros Sintos
2. Pankaj K. Agarwal
3. Jun Yang
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,465 | To Intervene or Not To Intervene: Cost based Intervention for Combating Fake News | 2021 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 192 | HoloClean: Holistic Data Repairs with Probabilistic Inference | 2017 | VLDB | 0.00035728858 |
| 450 | The Theory Of Probabilistic Databases | 1987 | VLDB | 0.00022822073 |
| 791 | ActiveClean: Interactive Data Cleaning For Statistical Modeling | 2016 | VLDB | 0.00016629664 |
| 855 | Integrating Conflicting Data: The Role of Source Dependence | 2009 | VLDB | 0.00015906735 |
| 1,012 | NADEEF: A Commodity Data Cleaning System | 2013 | SIGMOD | 0.0001464733 |
| 2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05 |
| 3,340 | Toward Computational Fact-Checking | 2014 | VLDB | 7.2030091e-05 |
| 3,773 | Cleaning Crowdsourced Labels Using Oracles for Statistical Classification | 2019 | VLDB | 6.7758649e-05 |
| 5,537 | Cleaning Uncertain Data with Quality Guarantees | 2008 | VLDB | 5.4522327e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,648 | User Guidance for Efficient Fact Checking | 2019 | VLDB | 4.6889787e-05 |
| 10,740 | Finding Convincing Views to Endorse a Claim | 2025 | VLDB | 4.1945683e-05 |
| 2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05 |
| 2,018 | Statistical Distortion: Consequences of Data Cleaning | 2012 | VLDB | 9.7764643e-05 |
| 5,412 | Mining an "Anti-Knowledge Base" from Wikipedia Updates with Applications to Fact Checking and Beyond | 2020 | VLDB | 5.5207515e-05 |
| 5,537 | Cleaning Uncertain Data with Quality Guarantees | 2008 | VLDB | 5.4522327e-05 |
| 9,043 | Query-Guided Resolution in Uncertain Databases | 2023 | SIGMOD | 4.4039656e-05 |
| 1,627 | Data Cleaning: Overview and Emerging Challenges | 2016 | SIGMOD | 0.00011086905 |
| 7,029 | Computational Fact Checking: A Content Management Perspective | 2018 | VLDB | 4.8563777e-05 |
| 3,340 | Toward Computational Fact-Checking | 2014 | VLDB | 7.2030091e-05 |