Database Paper Browser

A Deep Dive into Common Open Formats for Analytical DBMSs

Summary: Systematic evaluation of Arrow, Parquet, and ORC against OLAP DBMS requirements, showing how layout, vectorization, compression, metadata, and mmap trade-offs affect query efficiency and integration. Identifies co‑design opportunities for unified in‑memory/on‑disk representation and practical guidance for implementers. (summarized by gpt-5-mini on Feb 09 2026)

Paper ID: 13144
Venue: VLDB
Year: 2023
Pagerank: 5.4331334e-05
Overall Rank: 5,562 | 61.31%
DOI: 10.14778/3611479.3611507

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 11 of 11 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 26 of 26 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983
131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331
426 | Amazon Redshift and the Case for Simpler Data Warehouses | 2015 | SIGMOD | 0.00023594359
497 | Column-Stores vs. Row-Stores: How Different Are They Really? | 2008 | SIGMOD | 0.00021716559
659 | The Making of TPC-DS | 2006 | VLDB | 0.00018500853
1,270 | BitWeaving: Fast Scans for Main Memory Data Processing | 2013 | SIGMOD | 0.00012926086
1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941
1,611 | Qd-tree: Learning Data Layouts for Big Data Analytics | 2020 | SIGMOD | 0.00011147324
2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05
2,258 | SQL Server Column Store Indexes | 2011 | SIGMOD | 9.1678883e-05
2,528 | Velox: Meta’s Unified Execution Engine | 2022 | VLDB | 8.59454e-05
2,613 | Decomposed Bounded Floats for Fast Compression and Queries | 2021 | VLDB | 8.4503824e-05
3,038 | Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics | 2017 | SIGMOD | 7.6717218e-05
3,608 | Column Sketches: A Scan Accelerator for Rapid and Robust Predicate Evaluation | 2018 | SIGMOD | 6.924272e-05
4,514 | An Empirical Evaluation of Columnar Storage Formats | 2024 | VLDB | 6.1204636e-05
4,667 | FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS | 2021 | VLDB | 6.0116919e-05
5,123 | Accelerating Generalized Linear Models with MLWeaving: A One-Size-Fits-All System for Any-Precision Learning | 2019 | VLDB | 5.6796998e-05
5,318 | Analyzing and Comparing Lakehouse Storage Systems | 2023 | CIDR | 5.5715872e-05
5,898 | Column Partition and Permutation for Run Length Encoding in Columnar Databases | 2020 | SIGMOD | 5.2839046e-05
6,279 | Self-Organizing Data Containers | 2022 | CIDR | 5.1295282e-05
6,311 | VergeDB: A Database for IoT Analytics on Edge Devices | 2021 | CIDR | 5.1161316e-05
6,367 | Good to the Last Bit: Data-Driven Encoding with CodecDB | 2021 | SIGMOD | 5.0941072e-05
6,666 | Mainlining Databases: Supporting Fast Transactional Workloads on Universal Columnar Data File Formats | 2021 | VLDB | 4.9691571e-05
7,128 | Jigsaw: A Data Storage and Query Processing Engine for Irregular Table Partitioning | 2021 | SIGMOD | 4.8230171e-05
7,429 | CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases | 2022 | SIGMOD | 4.7320139e-05
8,088 | PIDS: Attribute Decomposition for Improved Compression and Query Performance in Columnar Storage | 2020 | VLDB | 4.5897316e-05

Semantically Similar Papers