Database Paper Browser


The History of Histograms (abridged)

Summary: The paper compresses the history of histograms across science and industry into a fixed-space abridgment, preserving key events, results, and ideas with an emphasis on milestones. Experiments show that the compressed history stays within a small semantic distance of the full history. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 8960
Venue: VLDB
Year: 2003
Pagerank: 0.00027378328
Overall Rank: 325 | 97.75%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 50 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 0.00059038975
324 | Order Preserving Encryption for Numeric Data | 2004 | SIGMOD | 0.00027444645
502 | Worst-case Optimal Join Algorithms | 2012 | PODS | 0.00021526612
629 | Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors | 2009 | VLDB | 0.00018942366
684 | Towards a Robust Query Optimizer: A Principled and Practical Approach | 2005 | SIGMOD | 0.00018179769
692 | Pay-as-you-go User Feedback for Dataspace Systems | 2008 | SIGMOD | 0.00018083948
727 | On Synopses for Distinct-Value Estimation Under Multiset Operations | 2007 | SIGMOD | 0.00017508726
806 | An End-to-End Learning-based Cost Estimator | 2020 | VLDB | 0.00016434274
1,038 | Weighted Hypertree Decompositions and Optimal Query Plans | 2004 | PODS | 0.00014492414
1,471 | Adaptive Range Filters for Cold Data: Avoiding Trips to Siberia | 2013 | VLDB | 0.00011830111
1,547 | Lightweight Graphical Models for Selectivity Estimation Without Independence Assumptions | 2011 | VLDB | 0.00011442359
1,758 | Sampling-Based Query Re-Optimization | 2016 | SIGMOD | 0.00010655546
1,808 | Top-k Query Evaluation with Probabilistic Guarantees | 2004 | VLDB | 0.00010486213
2,009 | IO-Top-k: Index-access Optimized Top-k Query Processing | 2006 | VLDB | 9.7977564e-05
2,165 | Self-Tuning, GPU-Accelerated Kernel Density Models for Multidimensional Selectivity Estimation | 2015 | SIGMOD | 9.389622e-05
2,277 | Generating Targeted Queries for Database Testing | 2008 | SIGMOD | 9.1241198e-05
2,364 | Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries | 2020 | SIGMOD | 8.9554751e-05
2,444 | Brighthouse: An Analytic Data Warehouse for Ad-hoc Queries | 2008 | VLDB | 8.8076551e-05
2,549 | GORDIAN: Efficient and Scalable Discovery of Composite Keys | 2006 | VLDB | 8.5641554e-05
3,408 | Query Optimizers: Time to Rethink the Contract? | 2009 | SIGMOD | 7.1288167e-05
3,449 | Learned Cardinality Estimation: A Design Space Exploration and A Comparative Evaluation | 2022 | VLDB | 7.0824319e-05
3,606 | EVA: A Symbolic Approach to Accelerating Exploratory Video Analytics with Materialized Views | 2022 | SIGMOD | 6.9260354e-05
3,824 | Correlation Sketches for Approximate Join-Correlation Queries | 2021 | SIGMOD | 6.7260705e-05
3,952 | Exact Cardinality Query Optimization for Optimizer Testing | 2009 | VLDB | 6.5939652e-05
3,990 | FactorJoin: A New Cardinality Estimation Framework for Join Queries | 2023 | SIGMOD | 6.5581983e-05
4,359 | Astrid: Accurate Selectivity Estimation for String Predicates using Deep Learning | 2021 | VLDB | 6.2569955e-05
4,831 | DigitHist: a Histogram-Based Data Summary with Tight Error Bounds | 2017 | VLDB | 5.8924198e-05
5,632 | Bloom Histogram: Path Selectivity Estimation for XML Data with Updates | 2004 | VLDB | 5.4014372e-05
5,905 | Exploiting Ordered Dictionaries to Efficiently Construct Histograms with Q-Error Guarantees in SAP HANA | 2014 | SIGMOD | 5.2788785e-05
5,977 | Understanding Cardinality Estimation using Entropy Maximization | 2010 | PODS | 5.2455909e-05
6,368 | Pre-training Summarization Models of Structured Datasets for Cardinality Estimation | 2022 | VLDB | 5.0937722e-05
6,637 | Approximating and Testing k-Histogram Distributions in Sub-linear Time | 2012 | PODS | 4.9816401e-05
6,696 | Approximate Summaries for Why and Why-not Provenance | 2020 | VLDB | 4.9581958e-05
7,186 | LPLM: A Neural Language Model for Cardinality Estimation of LIKE-Queries | 2024 | SIGMOD | 4.8063731e-05
8,090 | Probabilistic Histograms for Probabilistic Data | 2009 | VLDB | 4.5888589e-05
8,174 | NOAH: Interactive Spreadsheet Exploration with Dynamic Hierarchical Overviews | 2021 | VLDB | 4.568186e-05
8,384 | Consistent and Flexible Selectivity Estimation for High-Dimensional Data | 2021 | SIGMOD | 4.5304673e-05
8,443 | Histograms as a Side Effect of Data Movement for Big Data | 2014 | SIGMOD | 4.5119257e-05
9,061 | Optimality and Scalability in Lattice Histogram Construction | 2009 | VLDB | 4.4039656e-05
9,237 | Determining Exact Quantiles with Randomized Summaries | 2024 | SIGMOD | 4.3690661e-05
9,380 | Small Selectivities Matter: Lifting the Burden of Empty Samples | 2021 | SIGMOD | 4.3461329e-05
9,869 | Turbo-Charging SPJ Query Plans with Learned Physical Join Operator Selections | 2022 | VLDB | 4.2675361e-05
10,590 | ACE: A Cardinality Estimator for Set-Valued Queries | 2025 | VLDB | 4.1945683e-05
10,639 | Cardinality Estimation for Having-Clauses | 2025 | VLDB | 4.1945683e-05
10,833 | Cardinality Estimation for Similarity Search on High-Dimensional Data Objects: The Impact of Reference Objects | 2025 | VLDB | 4.1945683e-05
10,927 | Computing A Well-Representative Summary of Conjunctive Query Results | 2024 | PODS | 4.1945683e-05
11,084 | Presto’s History-based Query Optimizer | 2024 | VLDB | 4.1945683e-05
11,821 | Are Few Bins Enough: Testing Histogram Distributions | 2016 | PODS | 4.1945683e-05
12,060 | Statistics Collection in Oracle Spatial and Graph: Fast Histogram Construction for Complex Geometry Objects | 2013 | VLDB | 4.1945683e-05
12,308 | Filtered Statistics | 2009 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 50 of 55 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
14 | Online Aggregation | 1997 | SIGMOD | 0.0010801504
28 | Accurate Estimation Of The Number Of Tuples Satisfying A Condition | 1984 | SIGMOD | 0.00080435857
64 | Improved Histograms for Selectivity Estimation of Range Predicates | 1996 | SIGMOD | 0.00063612837
99 | On the Propagation of Errors in the Size of Join Results | 1991 | SIGMOD | 0.00050022914
116 | Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries | 1988 | SIGMOD | 0.00046148737
141 | Selectivity Estimation Without the Attribute Value Independence Assumption | 1997 | VLDB | 0.00041786333
166 | Approximate Frequency Counts over Data Streams | 2002 | VLDB | 0.00039361552
182 | LEO - DB2's LEarning Optimizer | 2001 | VLDB | 0.00036962631
211 | Join Synopses for Approximate Query Answering | 1999 | SIGMOD | 0.00033981214
217 | Ripple Joins for Online Aggregation | 1999 | SIGMOD | 0.00033536712
222 | Wavelet-Based Histograms for Selectivity Estimation | 1998 | SIGMOD | 0.00032828302
252 | Adaptive Selectivity Estimation Using Query Feedback | 1994 | SIGMOD | 0.00030632263
269 | Fast Incremental Maintenance of Approximate Histograms | 1997 | VLDB | 0.00029656549
273 | Approximate Computation of Multidimensional Aggregates of Sparse Data Using Wavelets | 1999 | SIGMOD | 0.00029390945
275 | Approximate Medians and other Quantiles in One Pass and with Limited Memory | 1998 | SIGMOD | 0.00029364901
326 | Optimal Histograms with Quality Guarantees | 1998 | VLDB | 0.00027358981
327 | Balancing Histogram Optimality and Practicality for Query Result Size Estimation | 1995 | SIGMOD | 0.00027308479
330 | The Query By Image Content (QBIC) System | 1995 | SIGMOD | 0.00027229588
344 | Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries | 2001 | VLDB | 0.00026702512
361 | Histogram-Based Approximation of Set-Valued Query Answers | 1999 | VLDB | 0.00025775749
372 | Selectivity Estimation using Probabilistic Models | 2001 | SIGMOD | 0.00025354779
405 | Approximate Query Processing Using Wavelets | 2000 | VLDB | 0.00024057494
443 | Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets | 1999 | SIGMOD | 0.00022996573
512 | STHoles: A Multidimensional Workload-Aware Histogram | 2001 | SIGMOD | 0.00021380733
526 | A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data | 1997 | VLDB | 0.00021044221
529 | Self-tuning Histograms: Building Histograms Without Looking at Data | 1999 | SIGMOD | 0.00020828852
530 | Random Sampling for Histogram Construction: How much is enough? | 1998 | SIGMOD | 0.00020803682
790 | Exploiting Statistics on Query Expressions for Optimization | 2002 | SIGMOD | 0.0001663283
808 | Universality of Serial Histograms | 1993 | VLDB | 0.00016432772
842 | Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data | 2001 | SIGMOD | 0.00016031973
852 | Dynamic Multidimensional Histograms | 2002 | SIGMOD | 0.00015941524
956 | How to Summarize the Universe: Dynamic Maintenance of Quantiles | 2002 | VLDB | 0.00015066967
996 | Approximating Multi-Dimensional Aggregate Range Queries Over Real Attributes | 2000 | SIGMOD | 0.00014741524
1,020 | An Instant and Accurate Size Estimation Method for Joins and Selection in a Retrieval-Intensive Environment | 1993 | SIGMOD | 0.00014624893
1,046 | Estimating the Selectivity of XML Path Expressions for Internet Scale Applications | 2001 | VLDB | 0.00014462307
1,120 | Global Optimization of Histograms | 2001 | SIGMOD | 0.00013856211
1,127 | Dynamic Maintenance of Wavelet-Based Histograms | 2000 | VLDB | 0.00013819179
1,146 | Estimating Alphanumeric Selectivity in the Presence of Wildcards | 1996 | SIGMOD | 0.00013679782
1,241 | Multi-dimensional Selectivity Estimation Using Compressed Histogram Information | 1999 | SIGMOD | 0.00013097578
1,379 | Substring Selectivity Estimation | 1999 | PODS | 0.00012286879
1,400 | Wavelet Synopses with Error Guarantees | 2002 | SIGMOD | 0.00012191684
1,695 | Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-Size Estimation | 1999 | VLDB | 0.00010882793
2,010 | StatiX: Making XML Count | 2002 | SIGMOD | 9.7970026e-05
2,053 | Selectivity Estimation in Spatial Databases | 1999 | SIGMOD | 9.6728745e-05
2,202 | A Scalable Hash Ripple Join Algorithm | 2002 | SIGMOD | 9.2987417e-05
2,316 | Statistical Synopses for Graph-Structured XML Databases | 2002 | SIGMOD | 9.0419716e-05
2,556 | Probabilistic Optimization of Top N Queries | 1999 | VLDB | 8.5465733e-05
2,974 | Estimating the Selectivity of Spatial Queries Using the 'Correlation' Fractal Dimension | 1995 | VLDB | 7.789769e-05
3,035 | Multi-Dimensional Substring Selectivity Estimation | 1999 | VLDB | 7.6748073e-05
3,113 | Structure and Value Synopses for XML Data Graphs | 2002 | VLDB | 7.5469926e-05

Semantically Similar Papers