DogmatiX Tracks down Duplicates in XML
Summary: DogmatiX extends duplicate detection to XML using a general framework with three components: candidate definition, duplicate definition, and duplicate detection. Its XML-aware similarity measure combines the similarity of an element's own data values with structural cues from its parents and children, and XML-specific heuristics select which of these to compare. An empirical evaluation validates the approach. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Melanie Weis
2. Felix Naumann
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 67 | The Merge/Purge Problem for Large Databases | 1995 | SIGMOD | 0.00061348205 |
| 112 | Potter's Wheel: An Interactive Data Cleaning System | 2001 | VLDB | 0.00047045036 |
| 199 | Declarative Data Cleaning: Language, Model, and Algorithms | 2001 | VLDB | 0.00035041015 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044 |
| 2,784 | Approximate XML Joins | 2002 | SIGMOD | 8.128931e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,174 | RRXS: Redundancy reducing XML storage in relations | 2003 | VLDB | 4.3838473e-05 |
| 992 | XTRACT: A System for Extracting Document Type Descriptors from XML Documents | 2000 | SIGMOD | 0.00014799689 |
| 936 | Framework for Evaluating Clustering Algorithms in Duplicate Detection | 2009 | VLDB | 0.0001521549 |
| 2,784 | Approximate XML Joins | 2002 | SIGMOD | 8.128931e-05 |
| 3,360 | Modeling and Querying Possible Repairs in Duplicate Detection | 2009 | VLDB | 7.1742067e-05 |
| 12,545 | A Framework for Processing Complex Document-centric XML with Overlapping Structures | 2005 | SIGMOD | 4.1945683e-05 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044 |
| 6,042 | MDedup: Duplicate Detection with Matching Dependencies | 2020 | VLDB | 5.2405269e-05 |
| 7,056 | Efficient Discovery of XML Data Redundancies | 2006 | VLDB | 4.8492432e-05 |
| 5,235 | Industry-Scale Duplicate Detection | 2008 | VLDB | 5.6115647e-05 |