Enabling Efficient and General Subpopulation Analytics in Multidimensional Data Streams
Summary: Hydra enables real-time, general subpopulation analytics on multidimensional data streams, using a 'sketch of sketches' and universal sketching to bound errors across combinatorially many subpopulations. Its Spark plugin implementation minimizes overhead and memory use, delivering interactive estimates with order-of-magnitude gains over Spark and Druid. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Antonis Manousis
2. Zhuo Cheng
3. Ran Ben Basat
4. Zaoxing Liu
5. Vyas Sekar
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,038 | OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 2024 | VLDB | 4.4039656e-05 |
| 10,315 | CounterSnake: A lossless and generalized compression framework for diverse sketches | 2026 | VLDB | 4.1945683e-05 |
| 10,608 | Approximation-First Timeseries Query At Scale | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 40 of 40 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,051 | Building Advanced SQL Analytics From Low-Level Plan Operators | 2021 | SIGMOD | 4.5969549e-05 |
| 10,208 | Scalable Clustering Over High Dimensional Vector Streams | 2026 | SIGMOD | 4.1945683e-05 |
| 7,895 | HYDRA: A Dynamic Big Data Regenerator | 2018 | VLDB | 4.623701e-05 |
| 1,548 | Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark | 2018 | SIGMOD | 0.00011431383 |
| 662 | A Framework for Clustering Evolving Data Streams | 2003 | VLDB | 0.00018475968 |
| 9,068 | A Framework for Projected Clustering of High Dimensional Data Streams | 2004 | VLDB | 4.4034035e-05 |
| 2,953 | Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries | 2018 | VLDB | 7.8267643e-05 |
| 9,264 | Model-Parallel Model Selection for Deep Learning Systems | 2021 | SIGMOD | 4.3675421e-05 |
| 9,504 | Supporting Scalable Analytics with Latency Constraints | 2015 | VLDB | 4.3341665e-05 |
| 9,038 | OmniSketch: Efficient Multi-Dimensional High-Velocity Stream Analytics with Arbitrary Predicates | 2024 | VLDB | 4.4039656e-05 |