Unbiased Estimation of Size and Other Aggregates Over Hidden Web Databases
Summary: This paper addresses unbiased estimation of the size and other aggregates of a hidden web database through its restrictive query interface. It introduces query-efficient, low-variance estimators that enable approximate query processing, and supports them with theoretical guarantees and experiments. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Arjun Dasgupta
2. Xin Jin
3. Nan Zhang
4. Bradley Jewell
5. Gautam Das
Incoming Citations (Sorted by Pagerank)
Showing 13 of 13 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 145 | Quickly Generating Billion-Record Synthetic Databases | 1994 | SIGMOD | 0.0004138408 |
| 234 | Crawling the Hidden Web | 2001 | VLDB | 0.00032018108 |
| 1,096 | Minimal Probing: Supporting Expensive Predicates for Top-k Queries | 2002 | SIGMOD | 0.00014120512 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 1,492 | Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection | 2002 | VLDB | 0.00011694396 |
| 2,813 | Mining Search Engine Query Logs via Suggestion Sampling | 2008 | VLDB | 8.0773142e-05 |
| 5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05 |
| 12,301 | Privacy Preservation of Aggregates in Hidden Databases: Why and How? | 2009 | SIGMOD | 4.1945683e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 593 | Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies | 1996 | VLDB | 0.00019536993 |
| 12,567 | Online Estimation For Subset-Based SQL Queries | 2005 | VLDB | 4.1945683e-05 |
| 7,890 | Mining a Search Engine’s Corpus: Efficient Yet Unbiased Sampling and Aggregate Estimation | 2011 | SIGMOD | 4.6249533e-05 |
| 12,112 | Aggregate Suppression for Enterprise Search Engines | 2012 | SIGMOD | 4.1945683e-05 |
| 3,950 | Probe, Count, and Classify: Categorizing Hidden-Web Databases | 2001 | SIGMOD | 6.5953844e-05 |
| 5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05 |
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 12,301 | Privacy Preservation of Aggregates in Hidden Databases: Why and How? | 2009 | SIGMOD | 4.1945683e-05 |
| 12,189 | Randomized Generalization for Aggregate Suppression Over Hidden Web Databases | 2011 | VLDB | 4.1945683e-05 |
| 9,432 | Aggregate Estimation Over Dynamic Hidden Web Databases | 2014 | VLDB | 4.3431757e-05 |