Database Paper Browser


Effective Use of Block-Level Sampling in Statistics Estimation

Summary: Block-level sampling is efficient but error-prone for statistics estimation. The paper proposes a two-phase adaptive algorithm for histogram construction that uses a phase-1 sample, together with a subset-extraction technique that adapts distinct-value estimators to block-level samples; experiments show gains in accuracy and speed. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 3526
Venue: SIGMOD
Year: 2004
Pagerank: 0.00010523169
Overall Rank: 1,797 | 87.51%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 16 of 16 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
424 | Tuning Database Configuration Parameters with iTuned | 2009 | VLDB | 0.00023616398
692 | Pay-as-you-go User Feedback for Dataspace Systems | 2008 | SIGMOD | 0.00018083948
2,779 | Hashed Samples: Selectivity Estimators For Set Similarity Selection Queries | 2008 | VLDB | 8.1320575e-05
3,279 | Early Accurate Results for Advanced Analytics on MapReduce | 2012 | VLDB | 7.2855494e-05
3,878 | Data Canopy: Accelerating Exploratory Statistical Analysis | 2017 | SIGMOD | 6.6731435e-05
4,435 | Sampling Dirty Data for Matching Attributes | 2010 | SIGMOD | 6.1918164e-05
5,014 | Dynamically Optimizing Queries over Large Scale Data Platforms | 2014 | SIGMOD | 5.7586174e-05
5,140 | A Random Walk Approach to Sampling Hidden Databases | 2007 | SIGMOD | 5.668209e-05
6,136 | Scalable Progressive Analytics on Big Data in the Cloud | 2013 | VLDB | 5.1928748e-05
6,170 | PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba | 2023 | SIGMOD | 5.171601e-05
6,493 | Joins on Samples: A Theoretical Guide for Practitioners | 2020 | VLDB | 5.0424713e-05
7,059 | Adaptive and Robust Query Execution for Lakehouses at Scale | 2024 | VLDB | 4.8477825e-05
7,759 | Dscaler: Synthetically Scaling A Given Relational Database | 2016 | VLDB | 4.6593145e-05
8,393 | LAQy: Efficient and Reusable Query Approximations via Lazy Sampling | 2023 | SIGMOD | 4.5280102e-05
8,835 | Learning-based Property Estimation with Polynomials | 2024 | SIGMOD | 4.4394021e-05
10,498 | PLM4NDV: Minimizing Data Access for Number of Distinct Values Estimation with Pre-trained Language Models | 2025 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 10 of 10 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers