Database Paper Browser


MacroBase: Prioritizing Attention in Fast Data

Summary: MacroBase prioritizes end-user attention in high-volume fast data streams, acting as a search engine for fast data. It classifies data accurately and at scale and explains the results over groups of data points. A reservoir sampler and a heavy-hitters sketch let it process up to 2M events/sec per core, and it has been deployed in telematics. (summarized by gpt-5-nano on Feb 09 2026)
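
The summary names two streaming building blocks, a reservoir sampler and a heavy-hitters sketch. The sketch below illustrates the general ideas only, using the textbook versions (Algorithm R for uniform reservoir sampling and the Misra-Gries summary for heavy hitters). These are assumptions chosen for illustration, not the specialized variants the paper introduces, and the function names `reservoir_sample` and `misra_gries` are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Algorithm R: keep a uniform random sample of k items from a stream
    whose length is not known in advance (textbook version)."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def misra_gries(stream, k):
    """Misra-Gries summary with at most k - 1 counters.

    Any item occurring more than n / k times in a stream of length n is
    guaranteed to remain in the returned dict; its stored count
    underestimates the true count by at most n / k."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stream (made up for this example): two frequent keys plus noise.
    events = ["a"] * 60 + ["b"] * 30 + [f"noise{i}" for i in range(10)]
    rng.shuffle(events)
    print(reservoir_sample(events, 5, rng))
    print(misra_gries(events, 4))
```

Both structures use memory that is fixed in advance (k slots or k - 1 counters) regardless of stream length, which is what makes this family of techniques suitable for high-volume streams.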

Paper ID: 5294
Venue: SIGMOD
Year: 2017
Pagerank: 9.4887794e-05
Overall Rank: 2,126 | 85.22%
DOI: 10.1145/3035918.3035928

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 31 of 31 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
316 | NoScope: Optimizing Neural Network Queries over Video at Scale | 2017 | VLDB | 0.00027988668
1,532 | Data Management in Machine Learning: Challenges, Techniques, and Systems | 2017 | SIGMOD | 0.00011472681
1,634 | Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series | 2021 | VLDB | 0.00011058945
2,154 | DIFF: A Relational Interface for Large-Scale Data Explanation | 2019 | VLDB | 9.4208667e-05
2,953 | Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries | 2018 | VLDB | 7.8267643e-05
3,319 | Sketching Linear Classifiers over Data Streams | 2018 | SIGMOD | 7.226439e-05
4,420 | ASAP: Prioritizing Attention via Time Series Smoothing | 2017 | VLDB | 6.2011459e-05
4,453 | MorphoSys: Automatic Physical Design Metamorphosis for Distributed Database Systems | 2020 | VLDB | 6.1723632e-05
4,456 | AutoOD: Automatic Outlier Detection | 2023 | SIGMOD | 6.1704203e-05
4,554 | A Demonstration of AutoOD: A Self-Tuning Anomaly Detection System | 2022 | VLDB | 6.0911296e-05
4,584 | Scalable Kernel Density Classification via Threshold-Based Pruning | 2017 | SIGMOD | 6.0668364e-05
4,607 | Data Integration and Machine Learning: A Natural Synergy | 2018 | SIGMOD | 6.0538827e-05
4,658 | ExplainIt! - A Declarative Root-cause Analysis Engine for Time Series Data | 2019 | SIGMOD | 6.0183783e-05
5,313 | XInsight: eXplainable Data Analysis Through The Lens of Causality | 2023 | SIGMOD | 5.573009e-05
6,779 | Explaining Inference Queries with Bayesian Optimization | 2021 | VLDB | 4.9280116e-05
6,944 | DataPrism: Exposing Disconnect between Data and Systems | 2022 | SIGMOD | 4.8912787e-05
7,243 | Data Integration and Machine Learning: A Natural Synergy | 2018 | VLDB | 4.7913666e-05
7,500 | Sentinel: Understanding Data Systems | 2020 | SIGMOD | 4.7180617e-05
7,534 | Enabling Efficient and General Subpopulation Analytics in Multidimensional Data Streams | 2022 | VLDB | 4.7180004e-05
8,341 | BugDoc: Algorithms to Debug Computational Processes | 2020 | SIGMOD | 4.5433282e-05
8,633 | Demonstration: MacroBase, A Fast Data Analysis Engine | 2017 | SIGMOD | 4.4802036e-05
8,829 | A Distributed System for Large-scale n-gram Language Models at Tencent | 2019 | VLDB | 4.4406886e-05
9,087 | A Demonstration of the Exathlon Benchmarking Platform for Explainable Anomaly Detection | 2021 | VLDB | 4.3993112e-05
9,227 | Panakos: Chasing the Tails for Multidimensional Data Streams | 2023 | VLDB | 4.3692732e-05
9,533 | TSExplain: Surfacing Evolving Explanations for Time Series | 2021 | SIGMOD | 4.3269636e-05
9,709 | Outlier Summarization via Human Interpretable Rules | 2024 | VLDB | 4.299267e-05
9,849 | Reptile: Aggregation-level Explanations for Hierarchical Data | 2022 | SIGMOD | 4.2721228e-05
10,365 | Agree to Disagree: Robust Anomaly Detection with Noisy Labels | 2025 | SIGMOD | 4.1945683e-05
10,875 | SDEcho: Efficient Explanation of Aggregated Sequence Difference | 2025 | VLDB | 4.1945683e-05
11,629 | Leveraging Organizational Resources to Adapt Models to New Data Modalities | 2020 | VLDB | 4.1945683e-05
11,756 | Prioritizing Attention in Fast Data: Principles and Promise | 2017 | CIDR | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 21 of 21 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
142 | TelegraphCQ: Continuous Dataflow Processing for an Uncertain World | 2003 | CIDR | 0.00041725802
181 | Mining Frequent Patterns without Candidate Generation | 2000 | SIGMOD | 0.00036992674
191 | The Design of the Borealis Stream Processing Engine | 2005 | CIDR | 0.00035738595
210 | Gorilla: A Fast, Scalable, In-Memory Time Series Database | 2015 | VLDB | 0.0003404384
214 | Scorpion: Explaining Away Outliers in Aggregate Queries | 2013 | VLDB | 0.0003363692
321 | MCDB: A Monte Carlo Approach to Managing Uncertain Data | 2008 | SIGMOD | 0.00027527389
323 | Gigascope: A Stream Database for Network Applications | 2003 | SIGMOD | 0.00027492196
658 | Towards a Unified Architecture for in-RDBMS Analytics | 2012 | SIGMOD | 0.00018506577
704 | Building Efficient Query Engines in a High-Level Language | 2014 | VLDB | 0.00017900583
942 | A Formal Approach to Finding Explanations for Database Queries | 2014 | SIGMOD | 0.00015155714
1,022 | DBSherlock: A Performance Diagnostic Tool for Transactional Databases | 2016 | SIGMOD | 0.00014614917
1,098 | Trill: A High-Performance Incremental Query Processor for Diverse Analytics | 2015 | VLDB | 0.00014114442
1,099 | Interpretable and Informative Explanations of Outcomes | 2015 | VLDB | 0.00014096312
2,402 | Causality and Explanations in Databases | 2014 | VLDB | 8.8928361e-05
2,629 | Online Outlier Detection in Sensor Data Using Non-Parametric Models | 2006 | VLDB | 8.4160309e-05
3,105 | Data X-Ray: A Diagnostic Tool for Data Errors | 2015 | SIGMOD | 7.5568954e-05
3,171 | Interactive Outlier Exploration in Big Data Streams | 2014 | VLDB | 7.4447236e-05
4,350 | On Biased Reservoir Sampling in the Presence of Stream Evolution | 2006 | VLDB | 6.2645054e-05
5,755 | A Framework for Clustering Uncertain Data | 2015 | VLDB | 5.3402052e-05
8,732 | Relative Risk and Odds Ratio: A Data Mining Perspective | 2005 | PODS | 4.4576449e-05
11,756 | Prioritizing Attention in Fast Data: Principles and Promise | 2017 | CIDR | 4.1945683e-05

Semantically Similar Papers