Discovering Linkage Points over Web Data
Summary: Instance-based linkage discovery over Web data identifies linkage points across sources, bypassing fixed-schema alignment. A library of lexical analyzers, similarity measures, and search strategies enables efficient cross-source linkage discovery. (summarized by gpt-5-nano on Feb 09 2026)
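As a rough illustration of what instance-based linkage discovery means, the sketch below scores every cross-source attribute pair by the overlap of their tokenized values. It is a minimal sketch under stated assumptions, not the paper's actual method: the function names (`word_tokens`, `attribute_token_sets`, `candidate_linkage_points`), the single word-level analyzer, the Jaccard measure, the exhaustive pairwise search, the threshold, and the sample records are all hypothetical choices made for this example.

```python
import re
from itertools import product

def word_tokens(value):
    """Hypothetical lexical analyzer: lowercase, then split into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", str(value).lower()))

def attribute_token_sets(records):
    """Collect, per attribute, the set of tokens seen across all of its values."""
    tokens = {}
    for record in records:
        for attr, value in record.items():
            tokens.setdefault(attr, set()).update(word_tokens(value))
    return tokens

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def candidate_linkage_points(source_a, source_b, threshold=0.3):
    """Score every cross-source attribute pair; keep those at or above the threshold."""
    sets_a = attribute_token_sets(source_a)
    sets_b = attribute_token_sets(source_b)
    scored = [
        (attr_a, attr_b, jaccard(sets_a[attr_a], sets_b[attr_b]))
        for attr_a, attr_b in product(sets_a, sets_b)
    ]
    # Highest-scoring attribute pairs first.
    return sorted((s for s in scored if s[2] >= threshold), key=lambda s: -s[2])

# Made-up sample data: two sources with differently named attributes.
companies = [
    {"name": "IBM Research", "city": "San Jose"},
    {"name": "Acme Corp", "city": "Toronto"},
]
filings = [
    {"filer": "ibm research", "location": "California"},
    {"filer": "acme corp", "location": "Ontario"},
]
links = candidate_linkage_points(companies, filings)
```

On this sample the only pair that survives the threshold is `name` and `filer`, whose values share every token despite the differing attribute names and capitalization. Comparing all attribute pairs exhaustively is the simplest possible search strategy and would not scale to large sources.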
Authors
1. Oktie Hassanzadeh
2. Ken Q. Pu
3. Soheil Hassas Yeganeh
4. Renee J. Miller
5. Lucian Popa
6. Mauricio A. Hernandez
7. Howard Ho
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,178 | Table Union Search on Open Data | 2018 | VLDB | 0.00013468118 |
| 2,730 | Open Data Integration | 2018 | VLDB | 8.2126735e-05 |
| 3,349 | Schema Management for Document Stores | 2015 | VLDB | 7.1903648e-05 |
| 3,735 | Auto-Join: Joining Tables by Leveraging Transformations | 2017 | VLDB | 6.8061318e-05 |
| 4,850 | SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora | 2015 | VLDB | 5.8768452e-05 |
| 5,656 | HYDRA: Large-scale Social Identity Linkage via Heterogeneous Behavior Modeling | 2014 | SIGMOD | 5.3866501e-05 |
| 5,789 | Interactive Navigation of Open Data Linkages | 2017 | VLDB | 5.3269741e-05 |
| 7,784 | Authenticated Online Data Integration Services | 2015 | SIGMOD | 4.6517065e-05 |
| 9,399 | TabulaX: Leveraging Large Language Models for Multi-Class Table Transformations | 2025 | VLDB | 4.3441378e-05 |
| 10,540 | Discovering Approximate Inclusion Dependencies | 2025 | VLDB | 4.1945683e-05 |
| 11,895 | Finding Quality in Quantity: The Challenge of Discovering Valuable Sources for Integration | 2015 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 48 | Data Integration: A Theoretical Perspective | 2002 | PODS | 0.00069720859 |
| 916 | On Schema Matching with Opaque Column Names and Data Values | 2003 | SIGMOD | 0.00015379422 |
| 1,664 | On Multi-Column Foreign Key Discovery | 2010 | VLDB | 0.00010976887 |
| 2,174 | iMAP: Discovering Complex Semantic Matches between Database Schemas | 2004 | SIGMOD | 9.3672342e-05 |
| 3,328 | Multi-column Substring Matching for Database Schema Translation | 2006 | VLDB | 7.2174278e-05 |
| 3,744 | Learning Expressive Linkage Rules using Genetic Programming | 2012 | VLDB | 6.7932071e-05 |
| 12,321 | Linkage Query Writer | 2009 | VLDB | 4.1945683e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,345 | Entity Matching: How Similar Is Similar | 2011 | VLDB | 0.00012468408 |
| 322 | Record Linkage: Similarity Measures and Algorithms | 2006 | SIGMOD | 0.00027518768 |
| 672 | An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web | 2004 | SIGMOD | 0.00018355746 |
| 3,631 | On-the-Fly Entity-Aware Query Processing in the Presence of Linkage | 2010 | VLDB | 6.9014378e-05 |
| 4,342 | LinkClus: Efficient Clustering via Heterogeneous Semantic Links | 2006 | VLDB | 6.2758722e-05 |
| 4,383 | Incremental Record Linkage | 2014 | VLDB | 6.2383094e-05 |
| 11,706 | Big Data Linkage for Product Specification Pages | 2018 | SIGMOD | 4.1945683e-05 |
| 6,792 | Automatically Incorporating New Sources in Keyword Search-Based Data Integration | 2010 | SIGMOD | 4.9249098e-05 |
| 8,549 | LinkDB: A Probabilistic Linkage Database System | 2011 | SIGMOD | 4.4937074e-05 |
| 902 | Statistical Schema Matching across Web Query Interfaces | 2003 | SIGMOD | 0.00015486247 |