Database Paper Browser

Building Wavelet Histograms on Large Data in MapReduce

Summary: Proposes exact and approximate algorithms for building wavelet histograms in MapReduce that reduce communication and running time compared with naive approaches. The algorithms are implemented in Hadoop and evaluated on a 16-node cluster with real and synthetic data, where they show large improvements. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10348
Venue: VLDB
Year: 2012
Pagerank: 5.2791351e-05
Overall Rank: 5,903 | 58.94%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 7 of 7 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 22 of 22 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614
7 | Optimal Aggregation Algorithms for Middleware [Extended Abstract] | 2001 | PODS | 0.0015496097
22 | SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets | 2008 | VLDB | 0.0008456613
42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298
64 | Improved Histograms for Selectivity Estimation of Range Predicates | 1996 | SIGMOD | 0.00063612837
70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166
157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359
168 | MAD Skills: New Analysis Practices for Big Data | 2009 | VLDB | 0.00038946305
222 | Wavelet-Based Histograms for Selectivity Estimation | 1998 | SIGMOD | 0.00032828302
326 | Optimal Histograms with Quality Guarantees | 1998 | VLDB | 0.00027358981
344 | Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries | 2001 | VLDB | 0.00026702512
405 | Approximate Query Processing Using Wavelets | 2000 | VLDB | 0.00024057494
447 | Efficient Parallel Set-Similarity Joins Using MapReduce | 2010 | SIGMOD | 0.00022900171
780 | Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience | 2009 | VLDB | 0.00016775082
794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103
835 | Finding Frequent Items in Data Streams | 2008 | VLDB | 0.00016109621
1,127 | Dynamic Maintenance of Wavelet-Based Histograms | 2000 | VLDB | 0.00013819179
1,400 | Wavelet Synopses with Error Guarantees | 2002 | SIGMOD | 0.00012191684
1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319
2,736 | Online Aggregation and Continuous Query support in MapReduce | 2010 | SIGMOD | 8.2043187e-05
2,989 | KLEE: A Framework for Distributed Top-k Query Algorithms | 2005 | VLDB | 7.7733083e-05
6,431 | Finding Global Icebergs over Distributed Data Sets | 2006 | PODS | 5.0654592e-05

Semantically Similar Papers