Effective Entity Augmentation By Querying External Data Sources
Summary: A progressive, feedback-driven method that learns per-entity keyword-query strategies for extracting relevant attributes from external sources that are accessible only through keyword search. It iteratively refines queries to handle heterogeneous representations and sparse relevant tuples, minimizing manual effort while quickly delivering accurate augmentations.
(summarized by gpt-5-mini on Feb 09 2026)
- Paper ID: 13174
- Venue: VLDB
- Year: 2023
- Pagerank: 4.4660032e-05
- Overall Rank: 8,696 | 39.51%
- DOI: 10.14778/3611479.3611535
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 276 | Efficient IR-Style Keyword Search over Relational Databases | 2003 | VLDB | 0.00029336949 |
| 398 | Big Data Integration | 2013 | VLDB | 0.00024372588 |
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 1,147 | Web-scale Data Integration: You can only afford to Pay As You Go | 2007 | CIDR | 0.00013677658 |
| 1,187 | JOSIE: Overlap Set Similarity Search for Finding Joinable Tables in Data Lakes | 2019 | SIGMOD | 0.00013443639 |
| 1,277 | The Data Civilizer System | 2017 | CIDR | 0.00012879695 |
| 1,537 | Google's Deep-Web Crawl | 2008 | VLDB | 0.00011465704 |
| 2,209 | Data Integration: After the Teenage Years | 2017 | PODS | 9.2868035e-05 |
| 3,750 | Data Acquisition for Improving Machine Learning Models | 2021 | VLDB | 6.7895763e-05 |
| 3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05 |
| 3,985 | A First Tutorial on Dataspaces | 2008 | VLDB | 6.5626153e-05 |
| 4,422 | Interactive Join Query Inference with JIM | 2014 | VLDB | 6.2008389e-05 |
| 5,032 | Actively Soliciting Feedback for Query Answers in Keyword Search-Based Data Integration | 2013 | VLDB | 5.748807e-05 |
| 8,678 | Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment | 2019 | SIGMOD | 4.4702119e-05 |
| 9,432 | Aggregate Estimation Over Dynamic Hidden Web Databases | 2014 | VLDB | 4.3431757e-05 |
| 9,433 | Exploration of Deep Web Repositories | 2011 | VLDB | 4.3431757e-05 |
Semantically Similar Papers