Database Paper Browser


Dedoop: Efficient Deduplication with Hadoop

Summary: Dedoop offers browser-based specification of complex entity resolution (ER) workflows (blocking, similarity functions, ML-generated classifiers) for MapReduce-based deduplication on Hadoop. It automatically translates these workflows into MapReduce jobs, visualizes results and workload, and uses blocking-aware load balancing to minimize comparisons and balance cluster usage. (summarized by gpt-5-nano on Feb 09 2026)
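The role of blocking in the summary above can be illustrated with a small single-process sketch. This is not Dedoop's implementation and does not use Hadoop or MapReduce; the blocking key (first three letters of the title), the `difflib`-based similarity function, the 0.8 threshold, and the sample records are all assumptions chosen for illustration. The sketch only shows how grouping records by a blocking key limits pairwise comparisons to records within the same block.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record):
    # Hypothetical blocking key: first three letters of the lowercased title.
    return record["title"].lower()[:3]


def similarity(a, b):
    # Illustrative string similarity in [0, 1]; a real workflow could use any function.
    return SequenceMatcher(None, a["title"].lower(), b["title"].lower()).ratio()


def deduplicate(records, threshold=0.8):
    # Group records by blocking key so only records sharing a key are compared.
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)

    matches = []
    comparisons = 0
    for block in blocks.values():
        # Compare every pair inside one block; pairs across blocks are skipped.
        for a, b in combinations(block, 2):
            comparisons += 1
            if similarity(a, b) >= threshold:
                matches.append((a["id"], b["id"]))
    return matches, comparisons


# Made-up sample records (assumption, for illustration only).
records = [
    {"id": 1, "title": "Dedoop: Efficient Deduplication with Hadoop"},
    {"id": 2, "title": "Dedoop - Efficient Deduplication with Hadoop"},
    {"id": 3, "title": "Evaluation of entity resolution approaches"},
    {"id": 4, "title": "Parallel sorted neighborhood blocking"},
]

matches, comparisons = deduplicate(records)
print(matches, comparisons)
```

With four records, comparing all pairs would take six comparisons; here only the two records sharing the key `ded` are compared. In a distributed setting, blocks of very different sizes would produce uneven work per node, which is the problem the blocking-aware load balancing mentioned in the summary is meant to address.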

Paper ID: 10425
Venue: VLDB
Year: 2012
Pagerank: 9.2304499e-05
Overall Rank: 2,231 | 84.49%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 14 of 14 citing papers.


Outgoing Citations (Sorted by Pagerank)

Showing 1 of 1 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
319 | Evaluation of entity resolution approaches on real-world match problems | 2010 | VLDB | 0.00027781866

Semantically Similar Papers