White-box Compression: Learning and Exploiting Compact Table Representations
Summary: White-box compression encodes logical columns as functions over stored physical columns with per-block headers, enabling DBMS-aware optimizations and execution (e.g., predicate push-down). A recursive pattern-driven learner discovers these functions, yielding large compression gains on the Public BI benchmark. (summarized by gpt-5-mini on Feb 09 2026)
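To make the summary's idea concrete, here is a minimal sketch in Python of one logical column expressed as a function over a stored physical column plus a per-block header, with an equality predicate pushed down to the physical column. Everything here is an illustrative assumption rather than the paper's implementation: the class `PrefixedIntColumn`, the helper `compress`, and the "constant prefix + zero-padded integer" pattern are hypothetical, the pattern is supplied by hand rather than discovered by a learner, and values that do not fit the pattern are rejected instead of being stored separately.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrefixedIntColumn:
    """Hypothetical block: logical strings stored as a header plus integers."""
    prefix: str          # header field: constant prefix shared by every value
    width: int           # header field: number of zero-padded digits
    numbers: List[int]   # the stored physical column

    def decode(self, i: int) -> str:
        # Logical value = prefix + zero-padded physical integer.
        return f"{self.prefix}{self.numbers[i]:0{self.width}d}"

    def rewrite_equals(self, literal: str) -> Optional[int]:
        # Translate `logical == literal` into `physical == n`.
        # Returns None when no value in this block can match the literal.
        if not literal.startswith(self.prefix):
            return None
        digits = literal[len(self.prefix):]
        if len(digits) != self.width or not (digits.isascii() and digits.isdigit()):
            return None
        return int(digits)

    def select_equals(self, literal: str) -> List[int]:
        # Predicate push-down: compare integers, never rebuild the strings.
        target = self.rewrite_equals(literal)
        if target is None:
            return []
        return [i for i, n in enumerate(self.numbers) if n == target]


def compress(values: List[str], prefix: str, width: int) -> PrefixedIntColumn:
    """Split each value into the shared prefix and an integer (assumed pattern)."""
    numbers = []
    for v in values:
        digits = v[len(prefix):]
        fits = (v.startswith(prefix) and len(digits) == width
                and digits.isascii() and digits.isdigit())
        if not fits:
            # Simplification of this sketch: non-conforming values are rejected.
            raise ValueError(f"value does not fit the pattern: {v!r}")
        numbers.append(int(digits))
    return PrefixedIntColumn(prefix, width, numbers)


block = compress(["ORD-000123", "ORD-000007", "ORD-000123"], "ORD-", 6)
matches = block.select_equals("ORD-000123")
```

Because the mapping from physical to logical values is explicit, the equality test runs on the integer column directly, and a literal with a different prefix is ruled out from the header alone without scanning any rows.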
Authors
1. Bogdan Ghiță
2. Diego Tomé
3. Peter Boncz
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331 |
| 241 | DB2 with BLU Acceleration: So Much More than Just a Column Store | 2013 | VLDB | 0.00031420034 |
| 659 | The Making of TPC-DS | 2006 | VLDB | 0.00018500853 |
| 801 | SageDB: A Learned Database System | 2019 | CIDR | 0.00016505496 |
| 1,263 | Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation | 2016 | SIGMOD | 0.00012982857 |
| 2,134 | How to Wring a Table Dry: Entropy Compression of Relations and Querying of Compressed Relations | 2006 | VLDB | 9.4741038e-05 |
| 2,157 | The Data Calculator*: Data Structure Design and Cost Synthesis from First Principles and Learned Cost Models | 2018 | SIGMOD | 9.416022e-05 |
| 5,113 | Columnstore and B+ tree – Are Hybrid Physical Designs Important? | 2018 | SIGMOD | 5.687445e-05 |
| 5,670 | Joins on Encoded and Partitioned Data | 2014 | VLDB | 5.3804618e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,408 | Experimental Analysis of Large-scale Learnable Vector Storage Compression | 2024 | VLDB | 4.3441378e-05 |
| 3,745 | DeepSqueeze: Deep Semantic Compression for Tabular Data | 2020 | SIGMOD | 6.7926132e-05 |
| 8,364 | Query Log Compression for Workload Analytics | 2019 | VLDB | 4.5357797e-05 |
| 8,487 | Adaptive Compression for Fast Scans on String Columns | 2021 | SIGMOD | 4.4999394e-05 |
| 9,595 | High-Ratio Compression for Machine-Generated Data | 2023 | SIGMOD | 4.3194469e-05 |
| 6,157 | Compression Aware Physical Database Design | 2011 | VLDB | 5.1801143e-05 |
| 7,429 | CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases | 2022 | SIGMOD | 4.7320139e-05 |
| 4,468 | Comprehensive and Efficient Workload Compression | 2021 | VLDB | 6.1584035e-05 |
| 7,171 | Leveraging Compression in the Tableau Data Engine | 2014 | SIGMOD | 4.8117476e-05 |
| 1,100 | Query Optimization In Compressed Database Systems | 2001 | SIGMOD | 0.00014072277 |