Sample Debiasing in the Themis Open World Database System
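Summary: Themis is presented as the first open-world database system that rebalances biased samples so that query results approximate those over the whole population. It combines sample reweighting with Bayesian network modeling, using a priori aggregate information about the population, to outperform standard approximate query processing (AQP) and other baselines while keeping query latency low and remaining robust to gaps between what the sample and the population cover.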
(summarized by gpt-5-nano on Feb 09 2026)
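To make the idea of sample reweighting concrete, here is a minimal sketch of post-stratification, one standard way to reweight a biased sample using known population aggregates. It is an illustration only, not Themis's actual algorithm: the function names, the toy rows, and the population counts are all hypothetical, and the Bayesian network component mentioned in the summary is not modeled.

```python
from collections import Counter

def poststratify_weights(sample, population_counts, key):
    """Return one weight per sample row.

    Each row gets (population count of its group) / (sample count of its group),
    so the weights in every group sum to that group's known population size.
    """
    sample_counts = Counter(row[key] for row in sample)
    return [population_counts[row[key]] / sample_counts[row[key]] for row in sample]

def weighted_count(sample, weights, predicate):
    """Estimate a population COUNT(*) for rows matching the predicate."""
    return sum(w for row, w in zip(sample, weights) if predicate(row))

# Hypothetical biased sample: urban rows are over-represented (3 of 4 rows).
sample = [
    {"region": "urban", "owns_car": True},
    {"region": "urban", "owns_car": False},
    {"region": "urban", "owns_car": False},
    {"region": "rural", "owns_car": True},
]

# Assumed a priori aggregate: known population size per region.
population_counts = {"urban": 600, "rural": 400}

weights = poststratify_weights(sample, population_counts, "region")
estimate = weighted_count(sample, weights, lambda r: r["owns_car"])
print(estimate)
```

In this toy case, scaling the raw sample fraction (2 of 4 rows) to a population of 1,000 would suggest 500 car owners, while the reweighted estimate gives the single rural row more influence because rural rows are under-sampled relative to the assumed population counts.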
- Paper ID: 5819
- Venue: SIGMOD
- Year: 2020
- Pagerank: 6.2427076e-05
- Overall Rank: 4,375 | 69.57%
- DOI: 10.1145/3318464.3380606
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 17 of 17 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 48 | Data Integration: A Theoretical Perspective | 2002 | PODS | 0.00069720859 |
| 71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 0.00059038975 |
| 372 | Selectivity Estimation using Probabilistic Models | 2001 | SIGMOD | 0.00025354779 |
| 469 | MauveDB: Supporting Model-based User Views in Database Systems | 2006 | SIGMOD | 0.00022406923 |
| 1,574 | Approximate Query Processing: No Silver Bullet | 2017 | SIGMOD | 0.00011287495 |
| 1,981 | Improved Selectivity Estimation by Combining Knowledge from Sampling and Synopses | 2018 | VLDB | 9.8687545e-05 |
| 2,129 | IDEBench: A Benchmark for Interactive Data Exploration | 2020 | SIGMOD | 9.480002e-05 |
| 2,184 | A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data | 2014 | SIGMOD | 9.3429789e-05 |
| 2,365 | The Analytical Bootstrap: a New Method for Fast Error Estimation in Approximate Query Processing | 2014 | SIGMOD | 8.9551432e-05 |
| 2,386 | Leveraging Aggregate Constraints For Deduplication | 2007 | SIGMOD | 8.9231895e-05 |
| 2,580 | Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee | 2016 | SIGMOD | 8.5058814e-05 |
| 2,588 | Database Learning: Toward a Database that Becomes Smarter Every Time | 2017 | SIGMOD | 8.4909562e-05 |
| 3,944 | AQP++: Connecting Approximate Query Processing With Aggregate Precomputation for Interactive Analytics | 2018 | SIGMOD | 6.6078243e-05 |
| 4,030 | Revisiting Reuse for Approximate Query Processing | 2017 | VLDB | 6.5129665e-05 |
| 5,872 | Differential Privacy and the US Census | 2019 | PODS | 5.2937278e-05 |
| 7,872 | Probabilistic Database Summarization for Interactive Data Exploration | 2017 | VLDB | 4.6307184e-05 |
| 8,102 | NetCube: A Scalable Tool for Fast Data Mining and Compression | 2001 | VLDB | 4.5852446e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,572 | Pushing the Boundaries of Crowd-enabled Databases with Query-driven Schema Expansion | 2012 | VLDB | 4.7075553e-05 |
| 4,521 | A Temporal-Probabilistic Database Model for Information Extraction | 2013 | VLDB | 6.1168322e-05 |
| 5,214 | ThalamusDB: Approximate Query Processing on Multi-Modal Data | 2024 | SIGMOD | 5.624434e-05 |
| 12,025 | A Social Network Database that Learns How to Answer Queries | 2013 | CIDR | 4.1945683e-05 |
| 2,118 | Using Probabilistic Models for Data Management in Acquisitional Environments | 2005 | CIDR | 9.5100739e-05 |
| 3,081 | Knowledge Expansion over Probabilistic Knowledge Bases | 2014 | SIGMOD | 7.6031501e-05 |
| 4,350 | On Biased Reservoir Sampling in the Presence of Stream Evolution | 2006 | VLDB | 6.2645054e-05 |
| 9,204 | Themis: A GPU-accelerated Relational Query Execution Engine | 2025 | VLDB | 4.3737475e-05 |
| 8,337 | THEMIS: Fairness in Federated Stream Processing under Overload | 2016 | SIGMOD | 4.5434623e-05 |
| 6,233 | Mosaic: A Sample-Based Database System for Open World Query Processing | 2020 | CIDR | 5.1451876e-05 |