Database Paper Browser

BtrBlocks: Efficient Columnar Compression for Data Lakes

Summary: BtrBlocks is an open columnar format optimized for data lakes on object stores, addressing remote-scan inefficiencies in Parquet-based pipelines. Lightweight encodings yield fast decompression and strong compression, lowering CPU-bound query latency for cloud analytics. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6621
Venue: SIGMOD
Year: 2023
Pagerank: 6.8854928e-05
Overall Rank: 3,644 | 74.66%
DOI: 10.1145/3589263

Incoming Citations (Sorted by Pagerank)

Showing 24 of 24 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
4,507 | ALP: Adaptive Lossless floating-Point Compression | 2023 | SIGMOD | 6.131017e-05
4,514 | An Empirical Evaluation of Columnar Storage Formats | 2024 | VLDB | 6.1204636e-05
7,469 | Bullion: A Column Store for Machine Learning | 2025 | CIDR | 4.7204398e-05
7,876 | Two Birds With One Stone: Designing a Hybrid Cloud Storage Engine for HTAP | 2024 | VLDB | 4.6298182e-05
8,698 | Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask | 2024 | VLDB | 4.4657846e-05
9,201 | F3: The Open-Source Data File Format for the Future | 2026 | SIGMOD | 4.3743539e-05
9,645 | The FastLanes File Format | 2025 | VLDB | 4.3109001e-05
9,701 | Towards Functional Decomposition of Storage Formats | 2025 | CIDR | 4.3008468e-05
9,901 | AnyBlox: A Framework for Self-Decoding Datasets | 2025 | VLDB | 4.258022e-05
9,975 | Cloudspecs: Cloud Hardware Evolution Through the Looking Glass | 2026 | CIDR | 4.1945683e-05
9,980 | Declarative Memory Services | 2026 | CIDR | 4.1945683e-05
10,193 | Predictive Translation: High-Performance Buffer Management Without the Trade-Offs | 2026 | SIGMOD | 4.1945683e-05
10,220 | FlatStor: An Efficient Embedded-Index Based Columnar Data Layout for Multimodal Data Workloads | 2026 | VLDB | 4.1945683e-05
10,248 | Active Data Lakes: Regaining Physical Data Independence Without Losing Interoperability | 2026 | VLDB | 4.1945683e-05
10,281 | GPU Acceleration of SQL Analytics on Compressed Data | 2026 | VLDB | 4.1945683e-05
10,291 | Morphing-based Compression for Data-centric ML Pipelines | 2026 | VLDB | 4.1945683e-05
10,321 | DeXOR: Enabling xor in Decimal Space for Streaming Lossless Compression of Floating-point Data | 2026 | VLDB | 4.1945683e-05
10,415 | SAP HANA Cloud: Data Management for Modern Enterprise Applications | 2025 | SIGMOD | 4.1945683e-05
10,494 | Nested Parquet Is Flat, Why Not Use It? How To Scan Nested Data With On-the-Fly Key Generation and Joins | 2025 | SIGMOD | 4.1945683e-05
10,741 | Beyond Compression: A Comprehensive Evaluation of Lossless Floating-Point Compression | 2025 | VLDB | 4.1945683e-05
10,767 | The HANA Native Query Engine for Lakehouse Systems | 2025 | VLDB | 4.1945683e-05
10,854 | LiquidCache: Efficient Pushdown Caching for Cloud-Native Data Analytics | 2025 | VLDB | 4.1945683e-05
10,989 | High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance | 2024 | SIGMOD | 4.1945683e-05
11,036 | Blitzcrank: Fast Semantic Compression for In-memory Online Transaction Processing | 2024 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 26 of 26 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
35 | MonetDB/X100: Hyper-Pipelining Query Execution | 2005 | CIDR | 0.00076197749
60 | Efficiently Compiling Efficient Query Plans for Modern Hardware | 2011 | VLDB | 0.00064439773
71 | How Good Are Query Optimizers, Really? | 2016 | VLDB | 0.00059038975
109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983
131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331
167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521
210 | Gorilla: A Fast, Scalable, In-Memory Time Series Database | 2015 | VLDB | 0.0003404384
241 | DB2 with BLU Acceleration: So Much More than Just a Column Store | 2013 | VLDB | 0.00031420034
305 | SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units | 2009 | VLDB | 0.00028248614
426 | Amazon Redshift and the Case for Simpler Data Warehouses | 2015 | SIGMOD | 0.00023594359
1,223 | Enhancements to SQL Server Column Stores | 2013 | SIGMOD | 0.00013207641
1,263 | Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation | 2016 | SIGMOD | 0.00012982857
1,270 | BitWeaving: Fast Scans for Main Memory Data Processing | 2013 | SIGMOD | 0.00012926086
1,284 | Amazon Redshift Re-invented | 2022 | SIGMOD | 0.00012837822
1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941
1,618 | Row-wise Parallel Predicate Evaluation | 2008 | VLDB | 0.00011114015
2,062 | Dremel: A Decade of Interactive SQL Analysis at Web Scale | 2020 | VLDB | 9.6481955e-05
2,064 | Chimp: Efficient Lossless Floating Point Compression for Time Series Databases | 2022 | VLDB | 9.6418929e-05
2,258 | SQL Server Column Store Indexes | 2011 | SIGMOD | 9.1678883e-05
2,390 | ByteSlice: Pushing the Envelop of Main Memory Data Processing with a New Storage Layout | 2015 | SIGMOD | 8.9084657e-05
2,473 | Photon: A Fast Query Engine for Lakehouse Systems | 2022 | SIGMOD | 8.7237281e-05
2,545 | POLARIS: The Distributed SQL Engine in Azure Synapse | 2020 | VLDB | 8.5725413e-05
2,568 | Towards Cost-Optimal Query Processing in the Cloud | 2021 | VLDB | 8.5239227e-05
2,862 | An Experimental Study of Bitmap Compression vs. Inverted List Compression | 2017 | SIGMOD | 7.9898539e-05
3,787 | White-box Compression: Learning and Exploiting Compact Table Representations | 2020 | CIDR | 6.7674374e-05
4,717 | Cloud Analytics Benchmark | 2023 | VLDB | 5.9751539e-05

Semantically Similar Papers