Join-Distinct Aggregate Estimation over Update Streams
Summary: Presents the first space-efficient algorithms for estimating Join-Distinct aggregates (the number of distinct values in a projection over a join) on general update streams, where tuples may be both inserted and deleted. The paper introduces JD sketches, a new class of hash-based synopses that are built separately for each stream and can then be combined. The resulting probabilistic estimators give low-error, high-confidence estimates with small per-update time and space, backed by near-optimal lower bounds and empirical validation. (summarized by gpt-5-mini on Feb 09 2026)
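To make the estimated quantity concrete, the following is a minimal sketch of an exact, linear-space baseline for Join-Distinct over two update streams. It is not the JD sketch from the paper and uses none of its techniques; it only illustrates the target value that such a synopsis would approximate. The relation schemas R(A, B) and S(B, C), the `(tuple, delta)` update format, and the function names `apply_updates` and `join_distinct` are assumptions chosen for illustration.

```python
from collections import Counter


def apply_updates(updates):
    """Fold a stream of (tuple, delta) updates into the set of live tuples.

    delta is +1 for an insertion and -1 for a deletion (assumed format).
    A tuple is live when its net multiplicity is positive.
    """
    counts = Counter()
    for tup, delta in updates:
        counts[tup] += delta
    return {tup for tup, count in counts.items() if count > 0}


def join_distinct(r_updates, s_updates):
    """Exact count of distinct (A, C) pairs in R(A, B) joined with S(B, C) on B.

    Stores every live tuple, so space grows linearly with the input.
    This is the cost a small-space synopsis is meant to avoid.
    """
    r = apply_updates(r_updates)
    s = apply_updates(s_updates)

    # Index S by the join attribute B.
    c_values_by_b = {}
    for b, c in s:
        c_values_by_b.setdefault(b, set()).add(c)

    # Project the join onto (A, C) and drop duplicates.
    pairs = {(a, c) for a, b in r for c in c_values_by_b.get(b, ())}
    return len(pairs)


# Hypothetical example streams: inserts (+1) and deletes (-1).
r_stream = [((1, "x"), +1), ((2, "x"), +1), ((3, "y"), +1), ((2, "x"), -1)]
s_stream = [(("x", 10), +1), (("x", 10), +1), (("y", 10), +1),
            (("y", 20), +1), (("y", 20), -1)]

result = join_distinct(r_stream, s_stream)
print(result)
```

In this example the deletions remove (2, "x") from R and ("y", 20) from S, so the join projects onto the pairs (1, 10) and (3, 10). Deletions are what make the streaming version hard: a synopsis cannot simply remember which values it has seen, because a later delete may retract them.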
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Sumit Ganguly
2. Minos Garofalakis
3. Amit Kumar
4. Rajeev Rastogi
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 59 | Sampling-Based Estimation of the Number of Distinct Values of an Attribute | 1995 | VLDB | 0.00064501896 |
| 308 | Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports | 2001 | VLDB | 0.00028142852 |
| 344 | Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries | 2001 | VLDB | 0.00026702512 |
| 378 | Towards Estimation Error Guarantees for Distinct Values | 2000 | PODS | 0.0002497492 |
| 549 | Tracking Join and Self-Join Sizes in Limited Storage | 1999 | PODS | 0.00020376603 |
| 956 | How to Summarize the Universe: Dynamic Maintenance of Quantiles | 2002 | VLDB | 0.00015066967 |
| 1,064 | Processing Complex Aggregate Queries over Data Streams | 2002 | SIGMOD | 0.00014356481 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 550 | Hash-Partitioned Join Method Using Dynamic Destaging Strategy | 1988 | VLDB | 0.00020359891 |
| 3,041 | Sketching Probabilistic Data Streams | 2007 | SIGMOD | 0.000076697078 |
| 6,853 | On Joining and Caching Stochastic Streams | 2005 | SIGMOD | 0.000049070864 |
| 1,717 | Approximate Join Processing Over Data Streams | 2003 | SIGMOD | 0.00010793312 |
| 7,547 | Sketching Unaggregated Data Streams for Subpopulation-Size Queries | 2007 | PODS | 0.000047144329 |
| 4,133 | Memory-Limited Execution of Windowed Stream Joins | 2004 | VLDB | 0.000064196026 |
| 8,697 | Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries | 2024 | SIGMOD | 0.000044657888 |
| 11,833 | Streaming Algorithms for Robust Distinct Elements | 2016 | SIGMOD | 0.000041945683 |
| 1,064 | Processing Complex Aggregate Queries over Data Streams | 2002 | SIGMOD | 0.00014356481 |
| 3,102 | Processing Set Expressions over Continuous Update Streams | 2003 | SIGMOD | 0.000075586568 |