Crowd-Based Deduplication: An Adaptive Approach
Summary: ACD adapts correlation clustering to crowd-based deduplication, adding techniques that speed up crowd work and a postprocessing step that improves accuracy. Experiments on Amazon Mechanical Turk (MTurk) show higher precision than state-of-the-art methods at moderate crowdsourcing overhead. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Sibo Wang
2. Xiaokui Xiao
3. Chun-Hee Lee
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 19 of 19 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,788 | CDB: Optimizing Queries with Crowd-Based Selections and Joins | 2017 | SIGMOD | 4.1945683e-05 |
| 4,416 | CrowdMatcher: Crowd-Assisted Schema Matching | 2014 | SIGMOD | 6.2039225e-05 |
| 263 | CrowdER: Crowdsourcing Entity Resolution | 2012 | VLDB | 2.9862413e-04 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 2.9113044e-04 |
| 6,042 | MDedup: Duplicate Detection with Matching Dependencies | 2020 | VLDB | 5.2405269e-05 |
| 908 | Fusing Data with Correlations | 2014 | SIGMOD | 1.5431241e-04 |
| 3,360 | Modeling and Querying Possible Repairs in Duplicate Detection | 2009 | VLDB | 7.1742067e-05 |
| 936 | Framework for Evaluating Clustering Algorithms in Duplicate Detection | 2009 | VLDB | 1.521549e-04 |
| 3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05 |
| 2,386 | Leveraging Aggregate Constraints For Deduplication | 2007 | SIGMOD | 8.9231895e-05 |