Probabilistic Database Summarization for Interactive Data Exploration
Summary: Probabilistic summarization based on the Principle of Maximum Entropy yields a compact, queryable representation of a dataset. The paper develops the theory and presents three optimizations that reduce preprocessing time and improve query accuracy. For linear queries, the error is comparable to that of sampling, and the summary discriminates rare values better. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Non-self Citations Over Time
Authors
1. Laurel Orr
2. Magdalena Balazinska
3. Dan Suciu
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,375 | Sample Debiasing in the Themis Open World Database System | 2020 | SIGMOD | 6.2427076e-05 |
| 9,262 | SubTab: Data Exploration with Informative Sub-Tables | 2022 | SIGMOD | 4.368964e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 10 of 10 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 14 | Online Aggregation | 1997 | SIGMOD | 0.0010801504 |
| 742 | Optimizing Linear Counting Queries Under Differential Privacy | 2010 | PODS | 0.00017360873 |
| 842 | Independence is Good: Dependency-Based Histogram Synopses for High-Dimensional Data | 2001 | SIGMOD | 0.00016031973 |
| 1,260 | Dynamic Sample Selection for Approximate Query Processing | 2003 | SIGMOD | 0.00012993347 |
| 1,323 | Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters | 2016 | SIGMOD | 0.00012601997 |
| 1,425 | Scalable Approximate Query Processing With The DBO Engine | 2007 | SIGMOD | 0.00012051353 |
| 2,356 | Consistently Estimating the Selectivity of Conjuncts of Predicates | 2005 | VLDB | 8.9620762e-05 |
| 2,580 | Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee | 2016 | SIGMOD | 8.5058814e-05 |
| 2,808 | A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries | 2001 | SIGMOD | 8.0870741e-05 |
| 5,977 | Understanding Cardinality Estimation using Entropy Maximization | 2010 | PODS | 5.2455909e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 4,983 | Querying Probabilistic Information Extraction | 2010 | VLDB | 5.7870787e-05 |
| 8,090 | Probabilistic Histograms for Probabilistic Data | 2009 | VLDB | 4.5888589e-05 |
| 4,521 | A Temporal-Probabilistic Database Model for Information Extraction | 2013 | VLDB | 6.1168322e-05 |
| 1,992 | Probabilistic Ranking of Database Query Results | 2004 | VLDB | 9.8462684e-05 |
| 7,914 | Efficient Approximate Algorithms for Empirical Entropy and Mutual Information | 2021 | SIGMOD | 4.6179608e-05 |
| 760 | Creating Probabilistic Databases from Information Extraction Models | 2006 | VLDB | 0.00017053935 |
| 4,030 | Revisiting Reuse for Approximate Query Processing | 2017 | VLDB | 6.5129665e-05 |
| 4,758 | Optimization for Active Learning-based Interactive Database Exploration | 2019 | VLDB | 5.9422515e-05 |
| 4,614 | Interactive Summarization and Exploration of Top Aggregate Query Answers | 2018 | VLDB | 6.0467204e-05 |
| 74 | Efficient Query Evaluation on Probabilistic Databases | 2004 | VLDB | 0.00057857292 |