HDSampler: Revealing Data Behind Web Form Interfaces
Summary: HDSampler is the first practical system for sampling structured hidden web databases via web forms. It enables efficient sampling and accurate aggregate queries, demonstrated on Google Base to reveal marginal attribute distributions in minutes. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Anirban Maiti
2. Arjun Dasgupta
3. Nan Zhang
4. Gautam Das
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 13,411 | HDBTracker: Monitoring the Aggregates On Dynamic Hidden Web Databases | 2014 | VLDB | - |
Outgoing Citations (Sorted by Pagerank)
Showing 1 of 1 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,549 | Attribute Domain Discovery for Hidden Web Databases | 2011 | SIGMOD | 4.3258142e-05 |
| 1,537 | Google's Deep-Web Crawl | 2008 | VLDB | 0.00011465704 |
| 9,433 | Exploration of Deep Web Repositories | 2011 | VLDB | 4.3431757e-05 |
| 3,950 | Probe, Count, and Classify: Categorizing Hidden-Web Databases | 2001 | SIGMOD | 6.5953844e-05 |
| 12,189 | Randomized Generalization for Aggregate Suppression Over Hidden Web Databases | 2011 | VLDB | 4.1945683e-05 |
| 1,492 | Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection | 2002 | VLDB | 0.00011694396 |
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 8,684 | Unbiased Estimation of Size and Other Aggregates Over Hidden Web Databases | 2010 | SIGMOD | 4.4677591e-05 |
| 13,411 | HDBTracker: Monitoring the Aggregates On Dynamic Hidden Web Databases | 2014 | VLDB | - |
| 5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05 |