Database Paper Browser

Back to papers

Tracking Join and Self-Join Sizes in Limited Storage

Summary: Approximate self-join sizes under insertions/deletions in tiny space; tug-of-war sketches outperform sample-and-count and plain sampling for skew detection and self-join estimation. Proves lower bounds showing sampling-based compact join signatures are essentially optimal absent assumptions, but proposes tug-of-war-based join signatures whose error scales with relations' self-join sizes and can significantly beat sampling. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID
1157
Venue
PODS
Year
1999
Pagerank
0.00020376603
Overall Rank
549 | 96.19%
DOI
-

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 41 of 41 citing papers.

Rank Citing Paper Year Venue Pagerank
43 Models and Issues in Data Stream Systems 2002 PODS 0.00072723062
344 Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries 2001 VLDB 0.00026702512
449 Approximate Query Processing: Taming the TeraBytes! A Tutorial 2001 VLDB 0.00022846068
475 Mining Database Structure; Or, How to Build a Data Quality Browser 2002 SIGMOD 0.00022303253
502 Worst-case Optimal Join Algorithms 2012 PODS 0.00021526612
865 What’s Hot and What’s Not: Tracking Most Frequent Items Dynamically 2003 PODS 0.00015808172
1,064 Processing Complex Aggregate Queries over Data Streams 2002 SIGMOD 0.00014356481
1,193 Join Size Estimation Subject to Filter Conditions 2015 VLDB 0.00013414989
1,369 Random Sampling over Joins Revisited 2018 SIGMOD 0.00012339777
1,392 Sketching Streams Through the Net: Distributed Approximate Query Tracking 2005 VLDB 0.00012229045
1,472 Space Efficient Mining of Multigraph Streams 2005 PODS 0.00011828662
1,758 Sampling-Based Query Re-Optimization 2016 SIGMOD 0.00010655546
2,254 Two-Level Sampling for Join Size Estimation 2017 SIGMOD 9.1897043e-05
2,931 Holistic Aggregates in a Networked World: Distributed Tracking of Approximate Quantiles 2005 SIGMOD 7.8697258e-05
2,969 Estimating Join Selectivities using Bandwidth-Optimized Kernel Density Models 2017 VLDB 7.7974762e-05
3,041 Sketching Probabilistic Data Streams 2007 SIGMOD 7.6697078e-05
3,102 Processing Set Expressions over Continuous Update Streams 2003 SIGMOD 7.5586568e-05
3,486 Holistic UDAFs at Streaming Speeds 2004 SIGMOD 7.0502199e-05
3,543 Approximation Techniques for Spatial Data 2004 SIGMOD 6.9917053e-05
3,614 Persistent Data Sketching 2015 SIGMOD 6.9147318e-05
4,133 Memory-Limited Execution of Windowed Stream Joins 2004 VLDB 6.4196026e-05
5,220 Similarity Join Size Estimation using Locality Sensitive Hashing 2011 VLDB 5.6216111e-05
5,880 COMPASS: Online Sketch-based Query Optimization for In-Memory Databases 2021 SIGMOD 5.2898074e-05
5,977 Understanding Cardinality Estimation using Entropy Maximization 2010 PODS 5.2455909e-05
7,150 Histograms Revisited: When are histograms the best approximation method for aggregates over joins? 2005 PODS 4.8163484e-05
7,164 SKT: A One-Pass Multi-Sketch Data Analytics Accelerator 2021 VLDB 4.8131514e-05
7,334 Streaming in a Connected World: Querying and Tracking Distributed Data Streams 2007 SIGMOD 4.7604215e-05
7,581 Synopses for Query Optimization: A Space-Complexity Perspective 2004 PODS 4.7057641e-05
7,699 Sketch-based Geometric Monitoring of Distributed Stream Queries 2013 VLDB 4.6746076e-05
7,728 Consistent Histograms In The Presence of Distinct Value Counts 2009 VLDB 4.666214e-05
7,827 Containment Join Size Estimation: Models and Methods 2003 SIGMOD 4.6411831e-05
8,697 Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries 2024 SIGMOD 4.4657888e-05
8,717 Scotch: Generating FPGA-Accelerators for Sketching at Line Rate 2021 VLDB 4.4614498e-05
9,041 TreeSensing: Linearly Compressing Sketches with Flexibility 2023 SIGMOD 4.4039656e-05
9,628 Approximate Sketches 2024 SIGMOD 4.3143499e-05
10,619 Data-Agnostic Cardinality Learning from Imperfect Workloads 2025 VLDB 4.1945683e-05
10,983 A Universal Sketch for Estimating Heavy Hitters and Per-Element Frequency Moments in Data Streams with Bounded Deletions 2024 SIGMOD 4.1945683e-05
11,025 Sampling Methods for Inner Product Sketching 2024 VLDB 4.1945683e-05
11,168 Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation 2023 PODS 4.1945683e-05
11,957 Monitoring Distributed Streams using Convex Decompositions 2015 VLDB 4.1945683e-05
12,531 Join-Distinct Aggregate Estimation over Update Streams 2005 PODS 4.1945683e-05
Previous Page 1 / 1 Next

Outgoing Citations (Sorted by Pagerank)

Showing 13 of 13 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Previous Page 1 / 1 Next

Semantically Similar Papers