Database Paper Browser

Optimal Histograms with Quality Guarantees

Summary: Computes bucket boundaries that minimize error for a fixed bucket count under a broad class of error metrics (including V-optimality), in time O(d^2) in the number of distinct values d. Also offers fast heuristics with provable space–accuracy tradeoffs, and per-selectivity guarantees that isolate outliers. (summarized by gpt-5-nano on Feb 09 2026)
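
To illustrate the bucket-boundary computation the summary describes, here is a minimal sketch of the textbook dynamic program for the V-optimal (sum-of-squared-error) metric. It is not taken from the paper: the function name `v_optimal_histogram`, the input format (one frequency per distinct value, in value order), and the return shape are assumptions made for this example. With prefix sums, the sketch runs in O(d^2 * B) time for d distinct values and B buckets, which is quadratic in d when the bucket count is fixed.

```python
from typing import List, Tuple


def v_optimal_histogram(freqs: List[float], num_buckets: int) -> Tuple[float, List[int]]:
    """Hypothetical helper: split freqs into num_buckets contiguous buckets
    minimizing the total within-bucket sum of squared errors (SSE).

    Returns (minimum total SSE, list of bucket start indices).
    """
    d = len(freqs)
    if not 1 <= num_buckets <= d:
        raise ValueError("num_buckets must be between 1 and len(freqs)")

    # Prefix sums of f and f^2 so that any bucket's SSE is O(1) to evaluate.
    s = [0.0] * (d + 1)
    q = [0.0] * (d + 1)
    for i, f in enumerate(freqs):
        s[i + 1] = s[i] + f
        q[i + 1] = q[i] + f * f

    def sse(i: int, j: int) -> float:
        """SSE of freqs[i:j] around its mean (requires j > i)."""
        total = s[j] - s[i]
        return (q[j] - q[i]) - total * total / (j - i)

    inf = float("inf")
    # best[b][j]: minimum SSE covering the first j values with b buckets.
    best = [[inf] * (d + 1) for _ in range(num_buckets + 1)]
    # cut[b][j]: start index of the last bucket in that optimal solution.
    cut = [[0] * (d + 1) for _ in range(num_buckets + 1)]
    best[0][0] = 0.0

    for b in range(1, num_buckets + 1):
        for j in range(b, d + 1):
            for i in range(b - 1, j):
                cost = best[b - 1][i] + sse(i, j)
                if cost < best[b][j]:
                    best[b][j] = cost
                    cut[b][j] = i

    # Walk the cut table backwards to recover the bucket start indices.
    starts: List[int] = []
    j = d
    for b in range(num_buckets, 0, -1):
        i = cut[b][j]
        starts.append(i)
        j = i
    starts.reverse()
    return best[num_buckets][d], starts
```

For example, `v_optimal_histogram([1, 1, 1, 10, 10, 10], 2)` places the single boundary between the two plateaus, so the buckets start at indices 0 and 3 and the total error is zero.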

Paper ID: 8495
Venue: VLDB
Year: 1998
Pagerank: 0.00027358981
Overall Rank: 326 | 97.74%
DOI: -

Incoming Citations (Sorted by Pagerank)

Showing 50 of 66 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
43 | Models and Issues in Data Stream Systems | 2002 | PODS | 0.00072723062
325 | The History of Histograms (abridged) | 2003 | VLDB | 0.00027378328
449 | Approximate Query Processing: Taming the TeraBytes! A Tutorial | 2001 | VLDB | 0.00022846068
502 | Worst-case Optimal Join Algorithms | 2012 | PODS | 0.00021526612
512 | STHoles: A Multidimensional Workload-Aware Histogram | 2001 | SIGMOD | 0.00021380733
629 | Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors | 2009 | VLDB | 0.00018942366
684 | Towards a Robust Query Optimizer: A Principled and Practical Approach | 2005 | SIGMOD | 0.00018179769
852 | Dynamic Multidimensional Histograms | 2002 | SIGMOD | 0.00015941524
1,120 | Global Optimization of Histograms | 2001 | SIGMOD | 0.00013856211
1,241 | Multi-dimensional Selectivity Estimation Using Compressed Histogram Information | 1999 | SIGMOD | 0.00013097578
1,379 | Substring Selectivity Estimation | 1999 | PODS | 0.00012286879
1,695 | Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-Size Estimation | 1999 | VLDB | 0.00010882793
1,737 | QuickSel: Quick Selectivity Learning with Mixture Models | 2020 | SIGMOD | 0.00010720294
1,935 | A Data- and Workload-Aware Algorithm for Range Queries Under Differential Privacy | 2014 | VLDB | 0.00010032967
2,171 | Selectivity Estimation For Boolean Queries | 2000 | PODS | 9.3807165e-05
2,364 | Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries | 2020 | SIGMOD | 8.9554751e-05
2,454 | Efficient Computation of Reverse Skyline Queries | 2007 | VLDB | 8.778281e-05
2,465 | Principled Evaluation of Differentially Private Algorithms using DPBench | 2016 | SIGMOD | 8.7518123e-05
2,629 | Online Outlier Detection in Sensor Data Using Non-Parametric Models | 2006 | VLDB | 8.4160309e-05
2,748 | REHIST: Relative Error Histogram Construction Algorithms | 2004 | VLDB | 8.1785955e-05
2,878 | Sampling Time-Based Sliding Windows in Bounded Space | 2008 | SIGMOD | 7.9706235e-05
2,914 | DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees | 2019 | VLDB | 7.9118579e-05
3,310 | Optimal and Approximate Computation of Summary Statistics for Range Aggregates | 2001 | PODS | 7.2408955e-05
3,313 | Quality and Efficiency in Kernel Density Estimates for Large Data | 2013 | SIGMOD | 7.2381634e-05
3,619 | Fast Algorithms For Hierarchical Range Histogram Construction | 2002 | PODS | 6.9084829e-05
3,651 | Conditional Selectivity for Statistics on Query Expressions | 2004 | SIGMOD | 6.8768678e-05
3,665 | Ad-hoc Top-k Query Answering for Data Streams | 2007 | VLDB | 6.8633354e-05
3,691 | Kernel-Based Skyline Cardinality Estimation | 2009 | SIGMOD | 6.8383587e-05
3,719 | Space efficiency in Synopsis construction algorithms | 2005 | VLDB | 6.8204683e-05
4,017 | Optimal Histograms for Hierarchical Range Queries (Extended Abstract) | 2000 | PODS | 6.524501e-05
4,359 | Astrid: Accurate Selectivity Estimation for String Predicates using Deep Learning | 2021 | VLDB | 6.2569955e-05
4,438 | Selectivity Estimation for Fuzzy String Predicates in Large Data Sets | 2005 | VLDB | 6.1898903e-05
4,681 | Adaptive Sampling for Rapidly Matching Histograms | 2018 | VLDB | 6.0034918e-05
4,831 | DigitHist: a Histogram-Based Data Summary with Tight Error Bounds | 2017 | VLDB | 5.8924198e-05
5,065 | Hierarchical Subspace Sampling: A Unified Framework for High Dimensional Data Reduction, Selectivity Estimation and Nearest Neighbor Search | 2002 | SIGMOD | 5.7247716e-05
5,082 | A Comparison of Selectivity Estimators for Range Queries on Metric Attributes | 1999 | SIGMOD | 5.711623e-05
5,451 | NashDB: An End-to-End Economic Method for Elastic Database Fragmentation, Replication, and Provisioning | 2018 | SIGMOD | 5.5002949e-05
5,632 | Bloom Histogram: Path Selectivity Estimation for XML Data with Updates | 2004 | VLDB | 5.4014372e-05
5,879 | Fast and Near–Optimal Algorithms for Approximating Distributions by Histograms | 2015 | PODS | 5.2908101e-05
5,903 | Building Wavelet Histograms on Large Data in MapReduce | 2012 | VLDB | 5.2791351e-05
6,154 | A Forward Scan based Plane Sweep Algorithm for Parallel Interval Joins | 2017 | VLDB | 5.1815134e-05
6,491 | Robust Estimation With Sampling and Approximate Pre-Aggregation | 2003 | VLDB | 5.0429323e-05
6,637 | Approximating and Testing k-Histogram Distributions in Sub-linear Time | 2012 | PODS | 4.9816401e-05
6,677 | Categorical Skylines for Streaming Data | 2008 | SIGMOD | 4.9657435e-05
6,694 | Optimal Splitters for Temporal and Multi-version Databases | 2013 | SIGMOD | 4.9586454e-05
6,740 | Combining Aggregation and Sampling (Nearly) Optimally for Approximate Query Processing | 2021 | SIGMOD | 4.944395e-05
7,313 | Pythia: Data Dependent Differentially Private Algorithm Selection | 2017 | SIGMOD | 4.7651627e-05
7,358 | Weighted Distinct Sampling: Cardinality Estimation for SPJ Queries | 2021 | SIGMOD | 4.7529363e-05
7,459 | Compact Histograms for Hierarchical Identifiers | 2006 | VLDB | 4.7243492e-05
7,581 | Synopses for Query Optimization: A Space-Complexity Perspective | 2004 | PODS | 4.7057641e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers