Mining Search Engine Query Logs via Suggestion Sampling
Summary: Monte Carlo samplers that query public autocomplete interfaces to reveal the hidden suggestion databases behind them. Uniform and popularity-proportional sampling enable unbiased estimation of keyword popularity, query volume, and exposure to negative content; the approach is validated empirically on public logs and two services. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Ziv Bar-Yossef
2. Maxim Gurevich
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,474 | Measure-driven Keyword-Query Expansion | 2009 | VLDB | 6.1528736e-05 |
| 7,890 | Mining a Search Engine’s Corpus: Efficient Yet Unbiased Sampling and Aggregate Estimation | 2011 | SIGMOD | 4.6249533e-05 |
| 8,206 | Query Expansion Based on Clustered Results | 2011 | VLDB | 4.5586037e-05 |
| 8,684 | Unbiased Estimation of Size and Other Aggregates Over Hidden Web Databases | 2010 | SIGMOD | 4.4677591e-05 |
| 9,548 | Optimal Algorithms for Crawling a Hidden Database in the Web | 2012 | VLDB | 4.3258142e-05 |
| 9,549 | Attribute Domain Discovery for Hidden Web Databases | 2011 | SIGMOD | 4.3258142e-05 |
| 12,088 | Rank Discovery From Web Databases | 2013 | VLDB | 4.1945683e-05 |
| 12,189 | Randomized Generalization for Aggregate Suppression Over Hidden Web Databases | 2011 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 357 | Random Sampling from B+ trees | 1989 | VLDB | 2.6020098e-04 |
| 2,385 | Comparing and Aggregating Rankings with Ties | 2004 | PODS | 8.9247846e-05 |
| 5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05 |