Database Paper Browser

Good to the Last Bit: Data-Driven Encoding with CodecDB

Summary: CodecDB is an encoding-aware columnar database that tightly couples data-driven encoding selection with query operators that work directly on encoded data. It attains ~90% accuracy in encoding selection, up to 40% better compression, and speedups of ~10x on TPC-H and ~3x on SSB. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6174
Venue: SIGMOD
Year: 2021
Pagerank: 5.0941072e-05
Overall Rank: 6,367 | 55.71%
DOI: 10.1145/3448016.3457283

Incoming Citations (Sorted by Pagerank)

Showing 22 of 22 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,381 | TSB-UAD: An End-to-End Benchmark Suite for Univariate Time-Series Anomaly Detection | 2022 | VLDB | 8.9327638e-05
3,416 | LeCo: Lightweight Compression via Learning Serial Correlations | 2024 | SIGMOD | 7.1196234e-05
3,943 | Volume Under the Surface: A New Accuracy Evaluation Measure for Time-Series Anomaly Detection | 2022 | VLDB | 6.6099833e-05
4,079 | Choose Wisely: An Extensive Evaluation of Model Selection for Anomaly Detection in Time Series | 2023 | VLDB | 6.4663636e-05
4,514 | An Empirical Evaluation of Columnar Storage Formats | 2024 | VLDB | 6.1204636e-05
5,562 | A Deep Dive into Common Open Formats for Analytical DBMSs | 2023 | VLDB | 5.4331334e-05
8,578 | Robust and Budget-Constrained Encoding Configurations for In-Memory Database Systems | 2022 | VLDB | 4.4923477e-05
9,294 | Theseus: Navigating the Labyrinth of Time-Series Anomaly Detection | 2022 | VLDB | 4.3608061e-05
9,329 | Odyssey: An Engine Enabling The Time-Series Clustering Journey | 2023 | VLDB | 4.3556432e-05
9,599 | SPARTAN: Data-Adaptive Symbolic Time-Series Approximation | 2025 | SIGMOD | 4.3177432e-05
9,645 | The FastLanes File Format | 2025 | VLDB | 4.3109001e-05
9,906 | Rethinking the Encoding of Integers for Scans on Skewed Data | 2023 | SIGMOD | 4.2578595e-05
10,281 | GPU Acceleration of SQL Analytics on Compressed Data | 2026 | VLDB | 4.1945683e-05
10,466 | A Structured Study of Multivariate Time-Series Distance Measures | 2025 | SIGMOD | 4.1945683e-05
10,524 | Understanding the Black Box: A Deep Empirical Dive into Shapley Value Approximations for Tabular Data | 2025 | SIGMOD | 4.1945683e-05
10,674 | Improving Time Series Data Compression in Apache IoTDB | 2025 | VLDB | 4.1945683e-05
10,738 | TSB-AutoAD: Towards Automated Solutions for Time-Series Anomaly Detection | 2025 | VLDB | 4.1945683e-05
10,739 | Time-Series Clustering: A Comprehensive Study of Data Mining, Machine Learning, and Deep Learning Methods | 2025 | VLDB | 4.1945683e-05
10,741 | Beyond Compression: A Comprehensive Evaluation of Lossless Floating-Point Compression | 2025 | VLDB | 4.1945683e-05
11,094 | Time-Series Anomaly Detection: Overview and New Trends | 2024 | VLDB | 4.1945683e-05
11,224 | Homomorphic Compression: Making Text Processing on Compression Unlimited | 2023 | SIGMOD | 4.1945683e-05
11,235 | Accelerating Similarity Search for Elastic Measures: A Study and New Generalization of Lower Bounding Distances | 2023 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 18 of 18 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614
21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497
131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331
305 | SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units | 2009 | VLDB | 0.00028248614
476 | Impala: A Modern, Open-Source SQL Engine for Hadoop | 2015 | CIDR | 0.00022226941
898 | Data Compression Support in Databases | 1994 | VLDB | 0.00015525779
1,100 | Query Optimization In Compressed Database Systems | 2001 | SIGMOD | 0.00014072277
1,263 | Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation | 2016 | SIGMOD | 0.00012982857
1,270 | BitWeaving: Fast Scans for Main Memory Data Processing | 2013 | SIGMOD | 0.00012926086
2,693 | An Architecture for Recycling Intermediates in a Column-store | 2009 | SIGMOD | 8.2883398e-05
2,856 | Efficient Index Compression in DB2 LUW | 2009 | VLDB | 8.0056412e-05
4,602 | Accelerating Raw Data Analysis with the ACCORDA Software and Hardware Architecture | 2019 | VLDB | 6.0567387e-05
5,236 | Online Deduplication for Databases | 2017 | SIGMOD | 5.611324e-05
5,835 | Order-Preserving Key Compression for In-Memory Search Trees | 2020 | SIGMOD | 5.30905e-05
6,157 | Compression Aware Physical Database Design | 2011 | VLDB | 5.1801143e-05
6,311 | VergeDB: A Database for IoT Analytics on Edge Devices | 2021 | CIDR | 5.1161316e-05
7,335 | MorphStore: Analytical Query Engine with a Holistic Compression-Enabled Processing Model | 2020 | VLDB | 4.7603723e-05
8,088 | PIDS: Attribute Decomposition for Improved Compression and Query Performance in Columnar Storage | 2020 | VLDB | 4.5897316e-05

Semantically Similar Papers