Duplicate Removal in Information Dissemination
Summary: Proposes a per-user, per-document Duplicate Removal Module (DRM) for SIFT to curb duplication in large-scale information dissemination. Evaluates multiple algorithms and data structures, quantifying memory, throughput, and latency to identify the best implementation and its costs. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,714 | An Efficient Query Indexing Mechanism for Filtering Geo-Textual Data | 2013 | SIGMOD | 6.8223298e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 616 | Copy Detection Mechanisms for Digital Documents | 1995 | SIGMOD | 0.00019108201 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,235 | Industry-Scale Duplicate Detection | 2008 | VLDB | 5.6115647e-05 |
| 9,065 | Adaptive Information System Design: One Query at a Time | 1985 | SIGMOD | 4.4039656e-05 |
| 3,790 | An Efficient and Resilient Approach to Filtering and Disseminating Streaming Data | 2003 | VLDB | 6.7630518e-05 |
| 5,236 | Online Deduplication for Databases | 2017 | SIGMOD | 5.611324e-05 |
| 8,196 | Caching and Database Scaling in Distributed Shared-Nothing Information Retrieval Systems | 1993 | SIGMOD | 4.5611312e-05 |
| 3,360 | Modeling and Querying Possible Repairs in Duplicate Detection | 2009 | VLDB | 7.1742067e-05 |
| 3,838 | Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters | 2006 | SIGMOD | 6.7134945e-05 |
| 8,015 | Streaming Quotient Filter: A Near Optimal Approximate Duplicate Detection Approach for Data Streams | 2013 | VLDB | 4.6051162e-05 |
| 226 | Efficient Filtering of XML Documents for Selective Dissemination of Information | 2000 | VLDB | 0.00032431532 |
| 3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05 |