Approximating Aggregate Queries about Web Pages via Random Walks
Summary: The paper presents a random-walk approach to approximating aggregate queries about web pages. A novel walk on a dynamically built, regular, undirected graph yields near-uniform samples of pages. These samples are used to estimate search-engine coverage, domain composition, and average page size, with strong empirical accuracy under limited resources. (summarized by gpt-5-nano on Feb 09 2026)
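To illustrate why a walk on a regular undirected graph gives near-uniform samples, here is a minimal sketch, not the paper's implementation. It assumes a small toy graph whose adjacency lists are known in advance (the summary describes a graph that is built dynamically) and makes it regular by treating missing edges as self-loops up to a hypothetical `degree_cap`. The names `GRAPH`, `PAGE_SIZE_KB`, `padded_step`, `sample_nodes`, and `distribution_after` are all illustrative assumptions.

```python
import random

# Toy undirected graph (assumption): every edge appears in both adjacency lists.
GRAPH = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["a", "c"],
}
# Hypothetical per-page attribute used for an aggregate estimate.
PAGE_SIZE_KB = {"a": 40, "b": 10, "c": 20, "d": 30}


def padded_step(graph, node, degree_cap, rng):
    """Take one step on the graph padded with self-loops to degree_cap."""
    neighbors = graph[node]
    # Each real edge is chosen with probability 1/degree_cap; the remaining
    # slots act as self-loops, so every node behaves as if it had the same degree.
    slot = rng.randrange(degree_cap)
    return neighbors[slot] if slot < len(neighbors) else node


def sample_nodes(graph, start, degree_cap, burn_in, n_samples, gap, rng):
    """Collect n_samples nodes, walking `gap` steps between consecutive samples."""
    node = start
    for _ in range(burn_in):
        node = padded_step(graph, node, degree_cap, rng)
    samples = []
    for _ in range(n_samples):
        for _ in range(gap):
            node = padded_step(graph, node, degree_cap, rng)
        samples.append(node)
    return samples


def distribution_after(graph, start, degree_cap, steps):
    """Exact node distribution after `steps` steps (no randomness involved)."""
    dist = {v: 0.0 for v in graph}
    dist[start] = 1.0
    for _ in range(steps):
        nxt = {v: 0.0 for v in graph}
        for v, p in dist.items():
            for u in graph[v]:
                nxt[u] += p / degree_cap
            # Probability mass that stays put because of the padding self-loops.
            nxt[v] += p * (degree_cap - len(graph[v])) / degree_cap
        dist = nxt
    return dist


if __name__ == "__main__":
    rng = random.Random(0)
    samples = sample_nodes(GRAPH, "a", degree_cap=3, burn_in=50,
                           n_samples=2000, gap=5, rng=rng)
    estimate = sum(PAGE_SIZE_KB[v] for v in samples) / len(samples)
    print("estimated average page size (KB):", estimate)
    print("distribution after 200 steps:", distribution_after(GRAPH, "a", 3, 200))
```

Because the padded graph is regular, connected, and has self-loops, the walk's stationary distribution is uniform over nodes, so a plain average over the sampled pages approximates the true average (25 KB in this toy data). On the real web, neighbors (especially incoming links) are not known up front, which is the part this sketch deliberately leaves out.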
Authors
1. Ziv Bar-Yossef
2. Alexander Berg
3. Steve Chien
4. Jittat Fakcharoenphol
5. Dror Weitz
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,363 | An I/O-Efficient Disk-based Graph System for Scalable Second-Order Random Walk of Large Graphs | 2022 | VLDB | 4.7523184e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,086 | Efficient Personalized PageRank Computation: A Spanning Forests Sampling Based Approach | 2022 | SIGMOD | 4.8381004e-05 |
| 4,922 | READS: A Random Walk Approach for Efficient and Accurate Dynamic SimRank | 2017 | VLDB | 5.8233726e-05 |
| 3,693 | The Web as a graph | 2000 | PODS | 6.8356209e-05 |
| 7,890 | Mining a Search Engine’s Corpus: Efficient Yet Unbiased Sampling and Aggregate Estimation | 2011 | SIGMOD | 4.6249533e-05 |
| 8,684 | Unbiased Estimation of Size and Other Aggregates Over Hidden Web Databases | 2010 | SIGMOD | 4.4677591e-05 |
| 5,730 | Walk, Not Wait: Faster Sampling Over Online Social Networks | 2015 | VLDB | 5.3506029e-05 |
| 595 | Estimating PageRank on Graph Streams | 2008 | PODS | 0.00019507721 |
| 4,527 | On the Embeddability of Random Walk Distances | 2013 | VLDB | 6.1083926e-05 |
| 1,740 | A General Framework for Estimating Graphlet Statistics via Random Walk | 2017 | VLDB | 0.0001071792 |
| 5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05 |