Approximate XML Joins
Summary: Proposes approximate XML joins to integrate heterogeneous sources via structural and content distance. Introduces cheap tree edit bounds, a join framework using multiple distance metrics, and reference sets with sampling-based discovery. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Sudipto Guha
2. H. V. Jagadish
3. Nick Koudas
4. Divesh Srivastava
5. Ting Yu
Incoming Citations (Sorted by Pagerank)
Showing 10 of 10 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,589 | DogmatiX Tracks down Duplicates in XML | 2005 | SIGMOD | 8.4847146e-05 |
| 3,199 | Similarity Evaluation on Tree-structured Data | 2005 | SIGMOD | 7.3927291e-05 |
| 3,301 | RTED: A Robust Algorithm for the Tree Edit Distance | 2012 | VLDB | 7.2515266e-05 |
| 3,758 | Keyword Search over Relational Databases: A Metadata Approach | 2011 | SIGMOD | 6.7824746e-05 |
| 3,845 | On Repairing Structural Problems In Semi-structured Data | 2013 | VLDB | 6.7073366e-05 |
| 4,406 | Approximate Matching of Hierarchical Data Using pq-Grams | 2005 | VLDB | 6.2141638e-05 |
| 6,732 | An Incrementally Maintainable Index for Approximate Lookups in Hierarchical Data | 2006 | VLDB | 4.9477058e-05 |
| 7,215 | SyncSignature: A Simple, Efficient, Parallelizable Framework for Tree Similarity Joins | 2023 | VLDB | 4.7985991e-05 |
| 10,706 | Extensible and Robust Evaluation of Similarity Queries | 2025 | VLDB | 4.1945683e-05 |
| 11,013 | X-TED: Massive Parallelization of Tree Edit Distance | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 33 | BIRCH: An Efficient Data Clustering Method for Very Large Databases | 1996 | SIGMOD | 0.00077324389 |
| 91 | M-tree: An Efficient Access Method for Similarity Search in Metric Spaces | 1997 | VLDB | 0.0005181666 |
| 125 | Approximate String Joins in a Database (Almost) for Free | 2001 | VLDB | 0.00044847972 |
| 341 | CURE: An Efficient Clustering Algorithm for Large Databases | 1998 | SIGMOD | 0.00026810548 |
| 728 | Meaningful Change Detection in Structured Data | 1997 | SIGMOD | 0.00017494982 |
| 1,390 | Change Detection in Hierarchically Structured Information | 1996 | SIGMOD | 0.00012248349 |
| 1,822 | Change-Centric Management of Versions in an XML Warehouse | 2001 | VLDB | 0.00010420264 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,652 | Information Preserving XML Schema Embedding | 2005 | VLDB | 4.9761854e-05 |
| 9,430 | Approximate Joins: Concepts and Techniques | 2005 | VLDB | 4.3441378e-05 |
| 3,120 | Holistic Twig Joins on Indexed XML Documents | 2003 | VLDB | 7.5295938e-05 |
| 9,594 | Fast Optimal Twig Joins | 2010 | VLDB | 4.3197044e-05 |
| 3,199 | Similarity Evaluation on Tree-structured Data | 2005 | SIGMOD | 7.3927291e-05 |
| 11,703 | Worst Case Optimal Joins on Relational and XML data | 2018 | SIGMOD | 4.1945683e-05 |
| 3,419 | Approximate XML Query Answers | 2004 | SIGMOD | 7.1173416e-05 |
| 1,733 | Efficient Structural Joins on Indexed XML Documents | 2002 | VLDB | 0.00010724888 |
| 6,241 | Scaling Similarity Joins over Tree-Structured Data | 2015 | VLDB | 5.1411469e-05 |
| 5,273 | Correlating XML Data Streams Using Tree-Edit Distance Embeddings | 2003 | PODS | 5.5913399e-05 |