Database Paper Browser

Qd-tree: Learning Data Layouts for Big Data Analytics

Summary: The qd-tree is a learned data layout that routes records to storage blocks so that analytics queries read as little data as possible. Two construction methods, one greedy and one based on deep reinforcement learning, deliver large I/O speedups over existing blocking schemes and come within 2× of the data-skipping lower bound, while giving each block a semantic description. (summarized by gpt-5-nano on Feb 09 2026)
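The routing idea in the summary can be illustrated with a minimal sketch. It is not the paper's implementation: the class and function names (`Node`, `Leaf`, `route`, `blocks_for_range`) are hypothetical, cuts are restricted to single-column numeric thresholds, and the tree is built by hand rather than learned with the greedy or deep RL methods.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    block_id: int  # storage block that receives records reaching this leaf


@dataclass
class Node:
    column: str       # column tested by this cut
    threshold: float  # records with record[column] < threshold go left
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def route(tree, record):
    """Walk from the root to a leaf and return the block id for one record."""
    node = tree
    while isinstance(node, Node):
        node = node.left if record[node.column] < node.threshold else node.right
    return node.block_id


def blocks_for_range(tree, column, lo, hi):
    """Return ids of blocks that may hold records with lo <= record[column] < hi."""
    if isinstance(tree, Leaf):
        return [tree.block_id]
    if tree.column != column:
        # A cut on another column says nothing about this predicate: visit both sides.
        return (blocks_for_range(tree.left, column, lo, hi)
                + blocks_for_range(tree.right, column, lo, hi))
    result = []
    if lo < tree.threshold:  # query range overlaps the left side (values < threshold)
        result += blocks_for_range(tree.left, column, lo, hi)
    if hi > tree.threshold:  # query range overlaps the right side (values >= threshold)
        result += blocks_for_range(tree.right, column, lo, hi)
    return result


# Hand-built example tree: cut on age first, then on salary for older records.
tree = Node("age", 30,
            Leaf(0),
            Node("salary", 50000, Leaf(1), Leaf(2)))

print(route(tree, {"age": 40, "salary": 60000}))          # 2
print(blocks_for_range(tree, "age", 0, 30))               # [0]
print(blocks_for_range(tree, "salary", 50000, 100000))    # [0, 2]
```

In this sketch the path of cuts leading to a leaf doubles as that block's semantic description, which is what lets a query skip block 1 in the last call without reading it.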

Paper ID: 5985
Venue: SIGMOD
Year: 2020
Pagerank: 0.00011147324
Overall Rank: 1,611 | 88.80%
DOI: 10.1145/3318464.3389770

Incoming Citations (Sorted by Pagerank)

Showing 44 of 44 citing papers.

Rank Citing Paper Year Venue Pagerank
826 ALEX: An Updatable Adaptive Learned Index 2020 SIGMOD 0.00016224841
910 NeuroCard: One Cardinality Estimator for All Tables 2021 VLDB 0.00015423056
1,407 DB-BERT: A Database Tuning Tool that "Reads the Manual" 2022 SIGMOD 0.00012146739
1,889 Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads 2021 VLDB 0.00010200865
2,320 High-Throughput Vector Similarity Search in Knowledge Graphs 2023 SIGMOD 9.0366225e-05
2,552 Updatable Learned Index with Precise Positions 2021 VLDB 8.5530411e-05
2,678 Effectively Learning Spatial Indices 2020 VLDB 8.3252088e-05
3,779 Instance-Optimized Data Layouts for Cloud Analytics Workloads 2021 SIGMOD 6.7747205e-05
4,084 APEX: A High-Performance Learned Index on Persistent Memory 2022 VLDB 6.4622113e-05
4,128 Are Updatable Learned Indexes Ready? 2022 VLDB 6.4292373e-05
4,913 UDO: Universal Database Optimization using Reinforcement Learning 2021 VLDB 5.8316231e-05
5,074 Learned Index: A Comprehensive Experimental Evaluation 2023 VLDB 5.7175726e-05
5,562 A Deep Dive into Common Open Formats for Analytical DBMSs 2023 VLDB 5.4331334e-05
5,572 The RLR-Tree: A Reinforcement Learning Based R-Tree for Spatial Data 2023 SIGMOD 5.4277273e-05
5,671 LSched: A Workload-Aware Learned Query Scheduler for Analytical Database Systems 2022 SIGMOD 5.3803919e-05
6,279 Self-Organizing Data Containers 2022 CIDR 5.1295282e-05
6,297 Towards instance-optimized data systems 2021 VLDB 5.1227886e-05
6,302 Diva: Making MVCC Systems HTAP-Friendly 2022 SIGMOD 5.1215989e-05
6,466 Pando: Enhanced Data Skipping with Logical Data Partitioning 2023 VLDB 5.0528281e-05
6,803 Proteus: Autonomous Adaptive Storage for Mixed Workloads 2022 SIGMOD 4.9224958e-05
6,972 Predicate Caching: Query-Driven Secondary Indexing for Cloud Data Warehouses 2024 SIGMOD 4.8785237e-05
6,984 Replicated Layout for In-Memory Database Systems 2022 VLDB 4.873081e-05
7,042 LMSFC: A Novel Multidimensional Index based on Learned Monotonic Space Filling Curves 2023 VLDB 4.8541986e-05
7,128 Jigsaw: A Data Storage and Query Processing Engine for Irregular Table Partitioning 2021 SIGMOD 4.8230171e-05
8,225 Automated Multidimensional Data Layouts in Amazon Redshift 2024 SIGMOD 4.555289e-05
8,405 Towards Designing and Learning Piecewise Space-Filling Curves 2023 VLDB 4.5224126e-05
8,415 Pruning in Snowflake: Working Smarter, Not Harder 2025 SIGMOD 4.5197687e-05
8,417 The Case for Learned In-Memory Joins 2023 VLDB 4.5194164e-05
8,442 SageDB: An Instance-Optimized Data Analytics System 2022 VLDB 4.5120602e-05
8,636 WISK: A Workload-aware Learned Index for Spatial Keyword Queries 2023 SIGMOD 4.4801284e-05
8,645 Predicate Pushdown for Data Science Pipelines 2023 SIGMOD 4.4772518e-05
8,786 AWARE: Workload-aware, Redundancy-exploiting Linear Algebra 2023 SIGMOD 4.4521262e-05
9,760 Adaptive data transformations for QaaS 2025 CIDR 4.2856106e-05
9,767 Adaptive Indexing of Objects with Spatial Extent 2023 VLDB 4.2856106e-05
9,827 PLATON: Top-down R-tree Packing with Learned Partition Policy 2023 SIGMOD 4.2751057e-05
10,141 Honeybee: Efficient Role-based Access Control for Vector Databases via Dynamic Partitioning 2026 SIGMOD 4.1945683e-05
10,180 LM-Tree: A Hybrid Learned Index for Similarity Search in Metric Spaces 2026 SIGMOD 4.1945683e-05
10,230 Breaking the Isolation-Freshness Trade-off: Joint Adaptive Storage Optimization for HTAP Systems 2026 VLDB 4.1945683e-05
10,385 Optimizing Block Skipping for High-Dimensional Data with Learned Adaptive Curve 2025 SIGMOD 4.1945683e-05
10,761 SIEVE: Effective Filtered Vector Search with Collection of Indexes 2025 VLDB 4.1945683e-05
10,935 Automated Clustering Recommendation With Database Zone Maps 2024 SIGMOD 4.1945683e-05
10,980 BT-Tree: A Reinforcement Learning Based Index for Big Trajectory Data 2024 SIGMOD 4.1945683e-05
11,212 SH2O: Efficient Data Access for Work-Sharing Databases 2023 SIGMOD 4.1945683e-05
11,276 Route Travel Time Estimation on A Road Network Revisited: Heterogeneity, Proximity, Periodicity and Dynamicity 2023 VLDB 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 30 of 30 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank Cited Paper Year Venue Pagerank
2 R-Trees: A Dynamic Index Structure For Spatial Searching 1984 SIGMOD 0.0032169493
102 The Case for Learned Index Structures 2018 SIGMOD 0.00049545203
167 The Snowflake Elastic Data Warehouse 2016 SIGMOD 0.00039180521
183 Automatic Database Management System Tuning Through Large-scale Machine Learning 2017 SIGMOD 0.00036721403
209 Schism: a Workload-Driven Approach to Database Replication and Partitioning 2010 VLDB 0.00034468292
258 DB2 Design Advisor: Integrated Automatic Physical Database Design 2004 VLDB 0.0003022091
285 Automating Physical Database Design in a Parallel Database 2002 SIGMOD 0.0002899128
286 Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design 2004 SIGMOD 0.00028990057
333 Neo: A Learned Query Optimizer 2019 VLDB 0.00027206884
368 Small Materialized Aggregates: A Light Weight Index Structure for Data Warehousing 1998 VLDB 0.000254931
408 Database Cracking 2007 CIDR 0.00023953844
679 Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems 2012 SIGMOD 0.00018215154
716 Query-based Workload Forecasting for Self-Driving Database Management Systems 2018 SIGMOD 0.00017723171
758 Deep Unsupervised Cardinality Estimation 2020 VLDB 0.0001706608
826 ALEX: An Updatable Adaptive Learned Index 2020 SIGMOD 0.00016224841
1,223 Enhancements to SQL Server Column Stores 2013 SIGMOD 0.00013207641
1,254 Selectivity Estimation for Range Predicates using Lightweight Models 2019 VLDB 0.00013027411
1,477 Fine-grained Partitioning for Aggressive Data Skipping 2014 SIGMOD 0.00011770865
1,478 Learning Multi-dimensional Indexes 2020 SIGMOD 0.00011762542
1,700 Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads 2016 SIGMOD 0.00010858865
2,363 Merging What’s Cracked, Cracking What’s Merged: Adaptive Indexing in Main-Memory Column-Stores 2011 VLDB 8.9580928e-05
2,606 Design Continuums and the Path Toward Self-Designing Key-Value Stores that Know and Learn 2019 CIDR 8.4645832e-05
2,787 To Tune or not to Tune? A Lightweight Physical Design Alerter 2006 VLDB 8.1263608e-05
3,005 Clay: Fine-Grained Adaptive Partitioning for General Database Schemas 2017 VLDB 7.7303579e-05
3,028 Efficient Query Processing for Multi-Dimensionally Clustered Tables in DB2 2003 VLDB 7.6816205e-05
3,488 Optimal Column Layout for Hybrid Workloads 2019 VLDB 7.0479329e-05
3,737 Skipping-oriented Partitioning for Columnar Layouts 2017 VLDB 6.8033227e-05
3,891 Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing 2017 VLDB 6.659442e-05
4,061 Advanced Partitioning Techniques for Massively Distributed Computation 2012 SIGMOD 6.483587e-05
5,604 Design and Evaluation of Storage Organizations for Read-Optimized Main Memory Databases 2013 VLDB 5.4147933e-05

Semantically Similar Papers