Database Paper Browser

Instance-Optimized Data Layouts for Cloud Analytics Workloads

Summary: MTO is an instance-optimized data-layout framework that jointly optimizes the blocking of all tables in a multi-table cloud workload (star or snowflake schema) to maximize block skipping. By exploiting sideways information passed through joins, it outperforms single-table layouts, with up to 93% fewer blocks accessed and 75% faster end-to-end queries on a commercial service. (summarized by gpt-5-nano on Feb 09 2026)
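To make the idea of join-aware block skipping in the summary concrete, here is a toy sketch in Python. It is not MTO's algorithm. It only contrasts a fact table blocked by its own key with one blocked by an attribute of the joined dimension table. The tables, the column names (`order_id`, `cust_id`, `region`), the block size, and the per-block metadata (a set of `cust_id` values) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

Row = Tuple[int, int]  # (order_id, cust_id): a hypothetical fact-table row

# Hypothetical dimension table: cust_id -> region.
region_of: Dict[int, str] = {c: ("EU" if c % 2 == 0 else "US") for c in range(8)}

# Hypothetical fact table: 16 orders spread round-robin over 8 customers.
orders: List[Row] = [(i, i % 8) for i in range(16)]


def make_blocks(rows: List[Row], block_size: int) -> List[List[Row]]:
    """Split rows into fixed-size blocks in the given order."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def blocks_accessed(blocks: List[List[Row]], wanted_cust_ids: Set[int]) -> int:
    """Count blocks that cannot be skipped.

    Each block's metadata is assumed to be the set of cust_id values it holds;
    a block is skipped when that set is disjoint from the customers that
    survive the dimension-table filter.
    """
    accessed = 0
    for block in blocks:
        block_cust_ids = {cust_id for _, cust_id in block}
        if block_cust_ids & wanted_cust_ids:
            accessed += 1
    return accessed


# Query: orders JOIN customers WHERE customers.region = 'EU'.
eu_customers = {c for c, r in region_of.items() if r == "EU"}

# Single-table layout: fact rows blocked in order_id order.
baseline_blocks = make_blocks(sorted(orders), block_size=4)
baseline_accessed = blocks_accessed(baseline_blocks, eu_customers)

# Join-aware layout: fact rows grouped by the region of the joined customer.
join_aware_rows = sorted(orders, key=lambda row: (region_of[row[1]], row[0]))
join_aware_blocks = make_blocks(join_aware_rows, block_size=4)
join_aware_accessed = blocks_accessed(join_aware_blocks, eu_customers)

print(baseline_accessed, join_aware_accessed)
```

In this toy data set, every block of the single-table layout contains at least one EU customer, so none can be skipped. Grouping fact rows by the dimension attribute confines the EU rows to half of the blocks.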

Paper ID: 6161
Venue: SIGMOD
Year: 2021
Pagerank: 6.7747205e-05
Overall Rank: 3,779 | 73.72%
DOI: 10.1145/3448016.3457270

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 26 of 26 citing papers.

Rank Citing Paper Year Venue Pagerank
2,320 High-Throughput Vector Similarity Search in Knowledge Graphs 2023 SIGMOD 9.0366225e-05
5,765 Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries 2024 CIDR 5.336442e-05
6,279 Self-Organizing Data Containers 2022 CIDR 5.1295282e-05
6,297 Towards instance-optimized data systems 2021 VLDB 5.1227886e-05
6,302 Diva: Making MVCC Systems HTAP-Friendly 2022 SIGMOD 5.1215989e-05
6,466 Pando: Enhanced Data Skipping with Logical Data Partitioning 2023 VLDB 5.0528281e-05
6,803 Proteus: Autonomous Adaptive Storage for Mixed Workloads 2022 SIGMOD 4.9224958e-05
6,972 Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses 2024 SIGMOD 4.8785237e-05
7,990 Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD 2024 VLDB 4.6117441e-05
8,020 The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions 2024 VLDB 4.6040862e-05
8,225 Automated Multidimensional Data Layouts in Amazon Redshift 2024 SIGMOD 4.555289e-05
8,415 Pruning in Snowflake: Working Smarter, Not Harder 2025 SIGMOD 4.5197687e-05
8,442 SageDB: An Instance-Optimized Data Analytics System 2022 VLDB 4.5120602e-05
8,781 Accelerate Distributed Joins with Predicate Transfer 2025 SIGMOD 4.4534753e-05
9,641 An Experimental Comparison of Tree-data Structures for Connectivity Queries on Fully-dynamic Undirected Graphs 2025 SIGMOD 4.3109001e-05
9,760 Adaptive data transformations for QaaS 2025 CIDR 4.2856106e-05
9,827 PLATON: Top-down R-tree Packing with Learned Partition Policy 2023 SIGMOD 4.2751057e-05
9,917 Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing with Learned Automated Data Meshes 2023 VLDB 4.2561557e-05
10,141 Honeybee: Efficient Role-based Access Control for Vector Databases via Dynamic Partitioning 2026 SIGMOD 4.1945683e-05
10,217 This is Going to Sound Crazy, But What If We Used Large Language Models to Boost Automatic Database Tuning Algorithms By Leveraging Prior History? We Will Find Better Configurations More Quickly Than Retraining From Scratch! 2026 SIGMOD 4.1945683e-05
10,230 Breaking the Isolation-Freshness Trade-off: Joint Adaptive Storage Optimization for HTAP Systems 2026 VLDB 4.1945683e-05
10,385 Optimizing Block Skipping for High-Dimensional Data with Learned Adaptive Curve 2025 SIGMOD 4.1945683e-05
10,935 Automated Clustering Recommendation With Database Zone Maps 2024 SIGMOD 4.1945683e-05
11,067 Partition, Don’t Sort! Compression Boosters for Cloud Data Ingestion Pipelines 2024 VLDB 4.1945683e-05
11,151 Data Pipes: Declarative Control over Data Movement 2023 CIDR 4.1945683e-05
11,267 Anser: Adaptive Information Sharing Framework of AnalyticDB 2023 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 37 of 37 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
7 Optimal Aggregation Algorithms for Middleware [Extended Abstract] 2001 PODS 0.0015496097
102 The Case for Learned Index Structures 2018 SIGMOD 0.00049545203
183 Automatic Database Management System Tuning Through Large-scale Machine Learning 2017 SIGMOD 0.00036721403
204 Learned Cardinalities: Estimating Correlated Joins with Deep Learning 2019 CIDR 0.00034784455
209 Schism: a Workload-Driven Approach to Database Replication and Partitioning 2010 VLDB 0.00034468292
285 Automating Physical Database Design in a Parallel Database 2002 SIGMOD 0.0002899128
333 Neo: A Learned Query Optimizer 2019 VLDB 0.00027206884
679 Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems 2012 SIGMOD 0.00018215154
716 Query-based Workload Forecasting for Self-Driving Database Management Systems 2018 SIGMOD 0.00017723171
758 Deep Unsupervised Cardinality Estimation 2020 VLDB 0.0001706608
801 SageDB: A Learned Database System 2019 CIDR 0.00016505496
826 ALEX: An Updatable Adaptive Learned Index 2020 SIGMOD 0.00016224841
910 NeuroCard: One Cardinality Estimator for All Tables 2021 VLDB 0.00015423056
1,223 Enhancements to SQL Server Column Stores 2013 SIGMOD 0.00013207641
1,254 Selectivity Estimation for Range Predicates using Lightweight Models 2019 VLDB 0.00013027411
1,313 Cost-Based Optimization for Magic: Algebra and Implementation 1996 SIGMOD 0.0001263831
1,375 FITing-Tree: A Data-aware Index Structure 2019 SIGMOD 0.00012303141
1,477 Fine-grained Partitioning for Aggressive Data Skipping 2014 SIGMOD 0.00011770865
1,478 Learning Multi-dimensional Indexes 2020 SIGMOD 0.00011762542
1,582 Execution Strategies for SQL Subqueries 2007 SIGMOD 0.00011265079
1,611 Qd-tree: Learning Data Layouts for Big Data Analytics 2020 SIGMOD 0.00011147324
1,889 Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads 2021 VLDB 0.00010200865
2,115 LISA: A Learned Index Structure for Spatial Data 2020 SIGMOD 9.5257379e-05
2,157 The Data Calculator*: Data Structure Design and Cost Synthesis from First Principles and Learned Cost Models 2018 SIGMOD 9.416022e-05
2,413 Automated Partitioning Design in Parallel Database Systems 2011 SIGMOD 8.8672223e-05
2,772 Quickstep: A Data Platform Based on the Scaling-Up Approach 2018 VLDB 8.1401661e-05
2,862 An Experimental Study of Bitmap Compression vs. Inverted List Compression 2017 SIGMOD 7.9898539e-05
2,865 Designing Succinct Secondary Indexing Mechanism by Exploiting Column Correlations 2019 SIGMOD 7.9862595e-05
2,983 Supporting Table Partitioning By Reference in Oracle 2008 SIGMOD 7.7796493e-05
3,076 Learning a Partitioning Advisor for Cloud Databases 2020 SIGMOD 7.6107677e-05
3,488 Optimal Column Layout for Hybrid Workloads 2019 VLDB 7.0479329e-05
3,653 Database Tuning Advisor for Microsoft SQL Server 2005: Demo 2005 SIGMOD 6.8743355e-05
3,658 Towards a Hands-Free Query Optimizer through Deep Learning 2019 CIDR 6.8704209e-05
3,737 Skipping-oriented Partitioning for Columnar Layouts 2017 VLDB 6.8033227e-05
3,922 Pushing Data-Induced Predicates Through Joins in Big-Data Clusters 2020 VLDB 6.6291079e-05
5,118 AdaptDB: Adaptive Partitioning for Distributed Joins 2017 VLDB 5.6820984e-05
6,493 Joins on Samples: A Theoretical Guide for Practitioners 2020 VLDB 5.0424713e-05

Semantically Similar Papers