BibFinder/StatMiner: Effectively Mining and Using Coverage and Overlap Statistics in Data Integration
Summary: StatMiner learns coverage and overlap statistics for data integration, using a hierarchical classifier and threshold-based learning to adapt the resolution of the learned statistics. The paper demonstrates the approach on BibFinder, which integrates autonomous sources with uneven coverage, and shows that the learned statistics speed up query processing. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,985 | Online Ordering of Overlapping Data Sources | 2014 | VLDB | 4.1945683e-05 |
| 12,178 | Large-Scale Copy Detection | 2011 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 36 | Fast Algorithms for Mining Association Rules | 1994 | VLDB | 0.00076161096 |
| 1,289 | Using Probabilistic Information in Data Integration | 1997 | VLDB | 0.00012804879 |
| 3,170 | Quality-driven Integration of Heterogeneous Information Systems | 1999 | VLDB | 7.4482367e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 14,020 | Data Mining Techniques | 1996 | SIGMOD | - |
| 9,818 | Structures, Semantics and Statistics | 2004 | VLDB | 4.2777808e-05 |
| 13,513 | Database Systems Research on Data Mining | 2010 | SIGMOD | - |
| 14,023 | DBMiner: Interactive Mining of Multiple-Level Knowledge in Relational Databases | 1996 | SIGMOD | - |
| 820 | GraphMiner: A Structural Pattern-Mining System for Large Disk-based Graph Databases and Its Applications | 2005 | SIGMOD | 0.00016289354 |
| 12,076 | IBminer: A Text Mining Tool for Constructing and Populating InfoBox Databases and Knowledge Bases | 2013 | VLDB | 4.1945683e-05 |
| 1,289 | Using Probabilistic Information in Data Integration | 1997 | VLDB | 0.00012804879 |
| 2,617 | Extraction and Integration of Partially Overlapping Web Sources | 2013 | VLDB | 8.4462621e-05 |
| 6,792 | Automatically Incorporating New Sources in Keyword Search-Based Data Integration | 2010 | SIGMOD | 4.9249098e-05 |
| 3,961 | BibNetMiner: Mining Bibliographic Information Networks | 2008 | SIGMOD | 6.5876809e-05 |