Deeper: A Data Enrichment System Powered by Deep Web
Summary: Deeper links a local database to a hidden (deep-web) database through API-driven enrichment, with a crawl cost that scales with the size of the local data. It introduces a task-focused hidden-web crawling strategy and demonstrates end-to-end enrichment of a publication database, outperforming naive crawlers. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Pei Wang
2. Yongjun He
3. Ryan Shea
4. Jiannan Wang
5. Eugene Wu
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,678 | Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment | 2019 | SIGMOD | 4.4702119e-05 |
| 9,273 | ActiveDeeper: A Model-based Active Data Enrichment System | 2020 | VLDB | 4.3649603e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 181 | Mining Frequent Patterns without Candidate Generation | 2000 | SIGMOD | 0.00036992674 |
| 420 | InfoGather: Entity Augmentation and Attribute Discovery By Holistic Matching with Web Tables | 2012 | SIGMOD | 0.00023719065 |
| 518 | Data Integration for the Relational Web | 2009 | VLDB | 0.00021158934 |
| 1,660 | Data Markets in the Cloud: An Opportunity for the Database Community | 2011 | VLDB | 0.00010979534 |
| 3,229 | InfoGather+: Semantic Matching and Annotation of Numeric and Time-Varying Attributes in Web Tables | 2013 | SIGMOD | 7.3393682e-05 |
| 4,838 | Finding Patterns in a Knowledge Base using Keywords to Compose Table Answers | 2014 | VLDB | 5.8887949e-05 |
| 7,890 | Mining a Search Engine’s Corpus: Efficient Yet Unbiased Sampling and Aggregate Estimation | 2011 | SIGMOD | 4.6249533e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 672 | An Interactive Clustering-based Approach to Integrating Source Query Interfaces on the Deep Web | 2004 | SIGMOD | 0.00018355746 |
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 667 | Incremental Knowledge Base Construction Using DeepDive | 2015 | VLDB | 0.00018440557 |
| 4,229 | Harnessing the Deep Web: Present and Future | 2009 | CIDR | 6.3399547e-05 |
| 12,240 | Creating and Exploring Web Form Repositories | 2010 | SIGMOD | 4.1945683e-05 |
| 1,537 | Google's Deep-Web Crawl | 2008 | VLDB | 0.00011465704 |
| 9,433 | Exploration of Deep Web Repositories | 2011 | VLDB | 4.3431757e-05 |
| 4,106 | Extracting Databases from Dark Data with DeepDive | 2016 | SIGMOD | 6.4456184e-05 |
| 8,678 | Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment | 2019 | SIGMOD | 4.4702119e-05 |
| 9,273 | ActiveDeeper: A Model-based Active Data Enrichment System | 2020 | VLDB | 4.3649603e-05 |