Database Paper Browser

Data Integration: The Teenage Years

Summary: Reviews a decade of data integration research since the Information Manifold and advocates the local-as-view (LAV) approach for heterogeneous, evolving sources. Summarizes source completeness, binding/access restrictions, and query tradeoffs; proposes scalable, precise cross-source integration. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 9501
Venue: VLDB
Year: 2006
Pagerank: 0.00015558352
Overall Rank: 893 | 93.79%
DOI: -

Incoming Non-self Citations Over Time

Incoming Citations (Sorted by Pagerank)

Showing 26 of 26 citing papers.

Rank Citing Paper Year Venue Pagerank
627 Management of Probabilistic Data: Foundations and Challenges 2007 PODS 0.00018959005
721 Data Integration with Uncertainty 2007 VLDB 0.00017570539
1,722 Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach 2007 VLDB 0.00010757784
1,833 Data Wrangling: The Challenging Journey from the Wild to the Lake 2015 CIDR 0.00010378976
2,209 Data Integration: After the Teenage Years 2017 PODS 9.2868035e-05
2,452 Data Fusion – Resolving Data Conflicts for Integration 2009 VLDB 8.7839322e-05
2,771 A Relational Approach to Incrementally Extracting and Querying Structure in Unstructured Data 2007 VLDB 8.1421432e-05
3,937 On Reconciling Data Exchange, Data Integration, and Peer Data Management 2007 PODS 6.6159574e-05
3,995 How Large Language Models Will Disrupt Data Management 2023 VLDB 6.5513237e-05
4,435 Sampling Dirty Data for Matching Attributes 2010 SIGMOD 6.1918164e-05
4,508 iTrails: Pay-as-you-go Information Integration in Dataspaces 2007 VLDB 6.1298098e-05
4,558 Managing Structured Collections of Community Data 2011 CIDR 6.0869516e-05
4,752 Normalization and Optimization of Schema Mappings 2009 VLDB 5.9481448e-05
5,548 Foundations of Uncertain-Data Integration 2010 VLDB 5.4446854e-05
5,717 Query Processing under GLAV Mappings for Relational and Graph Databases 2013 VLDB 5.3553228e-05
5,796 Finding Frequent Items in Probabilistic Data 2008 SIGMOD 5.3240234e-05
6,233 Mosaic: A Sample-Based Database System for Open World Query Processing 2020 CIDR 5.1451876e-05
6,355 User Feedback as a First Class Citizen in Information Integration Systems 2011 CIDR 5.0987661e-05
7,229 Sailing the Information Ocean with Awareness of Currents: Discovery and Application of Source Dependence 2009 CIDR 4.7950172e-05
7,941 Efficient Uncertainty Tracking for Complex Queries with Attribute-level Bounds 2021 SIGMOD 4.613363e-05
9,044 Efficient Approximation of Certain and Possible Answers for Ranking and Window Queries over Uncertain Data 2023 VLDB 4.4039656e-05
9,137 Combating Fake News: A Data Management and Mining Perspective 2019 VLDB 4.3881065e-05
10,090 Integrating Vector Databases across Embedding Models 2026 SIGMOD 4.1945683e-05
11,386 Rewriting the Infinite Chase 2022 VLDB 4.1945683e-05
11,858 RDF Graph Alignment with Bisimulation 2016 VLDB 4.1945683e-05
12,478 Randomized Algorithms for Data Reconciliation in Wide Area Aggregate Query Processing 2007 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 50 of 57 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
48 Data Integration: A Theoretical Perspective 2002 PODS 0.00069720859
101 ULDBs: Databases with Uncertainty and Lineage 2006 VLDB 0.0004955674
115 Eddies: Continuously Adaptive Query Processing 2000 SIGMOD 0.00046221215
127 Querying Heterogeneous Information Sources Using Source Descriptions 1996 VLDB 0.00044642203
138 Query Transformation for PSJ-queries 1987 VLDB 0.00042334092
149 Trio: A System for Integrated Management of Data, Accuracy, and Lineage 2005 CIDR 0.00041101118
151 Optimizing Queries across Diverse Data Sources 1997 VLDB 0.00041016476
173 Schema Mapping as Query Discovery 2000 VLDB 0.00038627829
188 Applying Model Management to Classical Meta Data Problems 2003 CIDR 0.00035968389
199 Declarative Data Cleaning: Language, Model, and Algorithms 2001 VLDB 0.00035041015
208 Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach 2001 SIGMOD 0.0003460594
220 Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans 1998 SIGMOD 0.00033194808
229 Reference Reconciliation in Complex Information Spaces 2005 SIGMOD 0.00032242633
291 Answering Queries Using Templates With Binding Patterns (Extended Abstract) 1995 PODS 0.00028831632
294 Using Schema Matching to Simplify Heterogeneous Data Translation 1998 VLDB 0.00028669519
297 Complexity of Answering Queries Using Materialized Views 1998 PODS 0.00028596715
303 Generic Schema Matching with Cupid 2001 VLDB 0.00028301477
339 Optimization of Dynamic Query Evaluation Plans 1994 SIGMOD 0.00026851113
382 COMA - A system for flexible combination of schema matching approaches 2002 VLDB 0.00024823252
394 An Adaptive Query Execution System for Data Integration 1999 SIGMOD 0.00024460855
416 Computing Queries from Derived Relations 1985 VLDB 0.0002380776
456 Cost-based Query Scrambling for Initial Delays 1998 SIGMOD 0.00022717134
474 XQuery: A Query Language for XML 2003 SIGMOD 0.00022322907
532 Answering Recursive Queries Using Views 1997 PODS 0.00020778506
578 The GMAP: A Versatile Tool for Physical Data Independence 1994 VLDB 0.00019838707
621 Schema Mappings, Data Exchange, and Metadata Management 2005 PODS 0.00019005115
822 Composing Schema Mappings: Second-Order Dependencies to the Rescue 2004 PODS 0.00016255689
839 Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues 2003 SIGMOD 0.00016079422
870 Query Optimization in the Presence of Limited Access Patterns 1999 SIGMOD 0.00015771912
873 Constraint-Based XML Query Rewriting for Data Integration 2004 SIGMOD 0.00015752865
879 Composing Mappings Among Data Sources 2003 VLDB 0.00015674595
902 Statistical Schema Matching across Web Query Interfaces 2003 SIGMOD 0.00015486247
916 On Schema Matching with Opaque Column Names and Data Values 2003 SIGMOD 0.00015379422
976 Answering Queries Using Limited External Query Processors 1996 PODS 0.0001489085
1,058 SIMS: Retrieving and Integrating Information From Multiple Sources 1993 SIGMOD 0.00014387611
1,065 Data-Driven Understanding and Refinement of Schema Mappings 2001 SIGMOD 0.00014338146
1,155 A Scalable Algorithm for Answering Queries Using Views 2000 VLDB 0.00013616518
1,245 Answering XML Queries over Heterogeneous Data Sources 2001 VLDB 0.00013080995
1,252 Principles of Dataspace Systems 2006 PODS 0.00013033186
1,600 The Architecture of PIER: an Internet-Scale Query Processor 2005 CIDR 0.00011201407
1,693 Merging Models Based on Given Correspondences 2003 VLDB 0.00010900382
1,742 Composition of Mappings Given by Embedded Dependencies 2005 PODS 0.00010708408
1,923 Reconciling while Tolerating Disagreement in Collaborative Data Sharing 2006 SIGMOD 0.00010080761
2,114 Rondo: A Programming Platform for Generic Model Management 2003 SIGMOD 9.5268855e-05
2,303 Parallel evaluation of multi-join queries 1995 SIGMOD 9.066178e-05
2,327 Obtaining Complete Answers from Incomplete Databases 1996 VLDB 9.0276061e-05
2,333 A Platform for Personal Information Management and Integration 2005 CIDR 9.0169986e-05
2,399 Query Rewriting for Semistructured Data 1999 SIGMOD 8.8973689e-05
2,536 Rewriting Queries Using Views in Description Logics 1997 PODS 8.5837937e-05
2,590 Answering Queries from Statistics and Probabilistic Views 2005 VLDB 8.483194e-05

Semantically Similar Papers

Overall Rank Paper Year Venue Pagerank
2,493 A Formal View Integration Method 1986 SIGMOD 8.6513163e-05
4,860 Relationship Merging in Schema Integration 1984 VLDB 5.8716872e-05
816 A Methodology for View Integration in Logical Database Design 1982 VLDB 0.00016353797
3,724 Toward Large Scale Integration: Building a MetaQuerier over Databases on the Web 2005 CIDR 6.8173288e-05
12,694 A Case-Based Approach to Information Integration 2000 VLDB 4.1945683e-05
1,321 Intelligent Integration of Information 1993 SIGMOD 0.0001261238
9,551 Data Integration and Data Exchange: It’s Really About Time 2013 CIDR 4.3255102e-05
12,852 Data Integration in the Large: The Challenge of Reuse 1994 VLDB 4.1945683e-05
48 Data Integration: A Theoretical Perspective 2002 PODS 0.00069720859
2,209 Data Integration: After the Teenage Years 2017 PODS 9.2868035e-05