Database Paper Browser

Back to papers

Space-Efficient Online Computation of Quantile Summaries

Summary: Online algorithm for epsilon-approximate quantile summaries on streaming data, with worst-case space O((1/epsilon) log(epsilon N)). Improves upon O((1/epsilon) log^2(epsilon N); requires no a priori knowledge of N; experiments show space tighter than both worst-case and prior methods. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID
3257
Venue
SIGMOD
Year
2001
Pagerank
0.00044744986
Overall Rank
126 | 99.13%
DOI
-

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 50 of 82 citing papers.

Rank Citing Paper Year Venue Pagerank
43 Models and Issues in Data Stream Systems 2002 PODS 0.00072723062
166 Approximate Frequency Counts over Data Streams 2002 VLDB 0.00039361552
344 Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries 2001 VLDB 0.00026702512
402 Mergeable Summaries 2012 PODS 0.00024196343
449 Approximate Query Processing: Taming the TeraBytes! A Tutorial 2001 VLDB 0.00022846068
595 Estimating PageRank on Graph Streams 2008 PODS 0.00019507721
696 BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics 2020 VLDB 0.00018048935
785 StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time 2002 VLDB 0.00016664156
835 Finding Frequent Items in Data Streams 2008 VLDB 0.00016109621
848 Approximate Counts and Quantiles over Sliding Windows 2004 PODS 0.0001597308
852 Dynamic Multidimensional Histograms 2002 SIGMOD 0.00015941524
956 How to Summarize the Universe: Dynamic Maintenance of Quantiles 2002 VLDB 0.00015066967
1,064 Processing Complex Aggregate Queries over Data Streams 2002 SIGMOD 0.00014356481
1,355 SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions 2009 VLDB 0.00012404572
1,392 Sketching Streams Through the Net: Distributed Approximate Query Tracking 2005 VLDB 0.00012229045
1,482 Automating Large-Scale Data Quality Verification 2018 VLDB 0.00011725533
1,664 On Multi-Column Foreign Key Discovery 2010 VLDB 0.00010976887
1,717 Approximate Join Processing Over Data Streams 2003 SIGMOD 0.00010793312
1,895 VF2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning 2021 SIGMOD 0.00010180896
2,080 Optimal Sampling From Distributed Streams 2010 PODS 9.5899129e-05
2,448 Multi-Dimensional Regression Analysis of Time-Series Data Streams 2002 VLDB 8.8032353e-05
2,643 Camel: Managing Data for Efficient Stream Learning 2022 SIGMOD 8.384956e-05
2,914 DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees 2019 VLDB 7.9118579e-05
2,931 Holistic Aggregates in a Networked World: Distributed Tracking of Approximate Quantiles 2005 SIGMOD 7.8697258e-05
2,953 Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries 2018 VLDB 7.8267643e-05
2,955 Space- and Time-Efficient Deterministic Algorithms for Biased Quantiles over Data Streams 2006 PODS 7.8239173e-05
3,041 Sketching Probabilistic Data Streams 2007 SIGMOD 7.6697078e-05
3,313 Quality and Efficiency in Kernel Density Estimates for Large Data 2013 SIGMOD 7.2381634e-05
3,319 Sketching Linear Classifiers over Data Streams 2018 SIGMOD 7.226439e-05
3,385 Estimating Statistical Aggregates on Probabilistic Data Streams 2007 PODS 7.1580391e-05
3,486 Holistic UDAFs at Streaming Speeds 2004 SIGMOD 7.0502199e-05
3,798 Plato: Approximate Analytics over Compressed Time Series with Tight Deterministic Error Guarantees 2020 VLDB 6.7592302e-05
3,808 SketchML: Accelerating Distributed Machine Learning with Data Sketches 2018 SIGMOD 6.7455428e-05
3,860 Fast Data Stream Algorithms using Associative Memories 2007 SIGMOD 6.6902516e-05
3,991 Beyond Simple Aggregates: Indexing for Summary Queries 2011 PODS 6.5553055e-05
4,031 Approximate Quantiles and the Order of the Stream 2006 PODS 6.5121141e-05
4,076 Quantiles over Data Streams: An Experimental Study 2013 SIGMOD 6.4680854e-05
4,172 The Adversarial Robustness of Sampling 2020 PODS 6.3879072e-05
4,190 Randomized Algorithms for Tracking Distributed Count, Frequencies, and Ranks 2012 PODS 6.3739017e-05
4,249 Optimal Tracking of Distributed Heavy Hitters and Quantiles 2009 PODS 6.3245666e-05
4,966 Relative Error Streaming Quantiles 2021 PODS 5.7959749e-05
4,975 An Experimental Evaluation of Large Scale GBDT Systems 2019 VLDB 5.79026e-05
5,062 Randomized Multi-pass Streaming Skyline Algorithms 2009 VLDB 5.7268277e-05
5,117 Sampling Algorithms in a Stream Operator 2005 SIGMOD 5.6825418e-05
5,457 Fast and Approximate Stream Mining of Quantiles and Frequencies Using Graphics Processors 2005 SIGMOD 5.4970777e-05
5,535 Lightweight Cardinality Estimation in LSM-based Systems 2018 SIGMOD 5.4539235e-05
5,627 KLL± Approximate Quantile Sketches over Dynamic Datasets 2021 VLDB 5.403782e-05
5,956 A Tight Lower Bound for Comparison-Based Quantile Summaries 2020 PODS 5.2566971e-05
6,311 VergeDB: A Database for IoT Analytics on Edge Devices 2021 CIDR 5.1161316e-05
6,469 Materialization and Reuse Optimizations for Production Data Science Pipelines 2022 SIGMOD 5.0519488e-05
Previous Page 1 / 2 Next

Outgoing Citations (Sorted by Pagerank)

Showing 6 of 6 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Previous Page 1 / 1 Next

Semantically Similar Papers