Constructing and Analyzing the LSM Compaction Design Space
Summary: Formalizes the LSM-compaction design space via four primitives—trigger, data layout, granularity, and data movement policy—enabling synthesis of existing and novel strategies. Empirically evaluates 10 strategies, reports 12 observations and 7 takeaways to help DB researchers navigate tradeoffs in write/read amplification and space for LSM engines.
(summarized by gpt-5-nano on Feb 09 2026)
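As a rough illustration of the four-primitive view described in the summary, a compaction strategy can be modeled as one choice per primitive, and a design space as the cross product of those choices. The sketch below is a minimal, hypothetical model: the class names, the `enumerate_design_space` helper, and the two example options listed under each primitive are assumptions made for illustration, not the paper's full taxonomy or its code, and not every combination is necessarily a sensible strategy in practice.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


# Illustrative options only (assumed for this sketch); the paper's
# taxonomy for each primitive may contain more or different options.
class Trigger(Enum):
    LEVEL_SATURATION = "level saturation"
    RUN_COUNT = "number of sorted runs"


class DataLayout(Enum):
    LEVELING = "leveling"
    TIERING = "tiering"


class Granularity(Enum):
    LEVEL = "whole level"
    FILE = "single file"


class DataMovementPolicy(Enum):
    ROUND_ROBIN = "round-robin"
    LEAST_OVERLAP = "least overlap"


@dataclass(frozen=True)
class CompactionStrategy:
    """One point in the design space: a choice for each of the four primitives."""
    trigger: Trigger
    layout: DataLayout
    granularity: Granularity
    movement: DataMovementPolicy


def enumerate_design_space():
    """Return every combination of the example options above."""
    return [
        CompactionStrategy(*combo)
        for combo in product(Trigger, DataLayout, Granularity, DataMovementPolicy)
    ]


space = enumerate_design_space()
print(len(space))  # 2 options per primitive gives 2**4 = 16 combinations
```

Treating each primitive as an independent axis is what makes it possible to describe existing strategies and new ones in the same vocabulary; a real engine would additionally need to rule out combinations that do not make sense together.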
- Paper ID: 12399
- Venue: VLDB
- Year: 2021
- Pagerank: 6.7617833e-05
- Overall Rank: 3,793 | 73.62%
- DOI: 10.14778/3476249.3476274
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 24 of 24 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 3,965 | Spooky: Granulating LSM-Tree Compactions Correctly | 2022 | VLDB | 6.5820028e-05 |
| 4,227 | Cosine: A Cloud-Cost Optimized Self-Designing Key-Value Storage Engine | 2022 | VLDB | 6.3434324e-05 |
| 4,945 | SplinterDB and Maplets: Improving the Tradeoffs in Key-Value Store Compaction Policy | 2023 | SIGMOD | 5.8157107e-05 |
| 5,791 | Dissecting, Designing, and Optimizing LSM-based Data Stores | 2022 | SIGMOD | 5.3268999e-05 |
| 5,863 | GRF: A Global Range Filter for LSM-Trees with Shape Encoding | 2024 | SIGMOD | 5.2979639e-05 |
| 6,113 | Compactionary: A Dictionary for LSM Compactions | 2022 | SIGMOD | 5.20426e-05 |
| 6,398 | Endure: A Robust Tuning Paradigm for LSM Trees Under Workload Uncertainty | 2022 | VLDB | 5.0819209e-05 |
| 7,620 | Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads | 2023 | SIGMOD | 4.693568e-05 |
| 8,009 | CAMAL: Optimizing LSM-trees via Active Learning | 2024 | SIGMOD | 4.6066863e-05 |
| 8,627 | Limousine: Blending Learned and Classical Indexes to Self-Design Larger-than-Memory Cloud Storage Engines | 2024 | SIGMOD | 4.4829101e-05 |
| 8,805 | ArceKV: Towards Workload-driven LSM-compactions for Key-Value Store Under Dynamic Workloads | 2026 | VLDB | 4.4466855e-05 |
| 8,834 | ByteCard: Enhancing ByteDance’s Data Warehouse with Learned Cardinality Estimation | 2024 | SIGMOD | 4.4394021e-05 |
| 8,876 | MirrorKV: An Efficient Key-Value Store on Hybrid Cloud Storage with Balanced Performance of Compaction and Querying | 2023 | SIGMOD | 4.4304279e-05 |
| 9,232 | AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes | 2025 | SIGMOD | 4.3690661e-05 |
| 9,362 | FluidKV: Seamlessly Bridging the Gap between Indexing Performance and Memory-Footprint on Ultra-Fast Storage | 2024 | VLDB | 4.3503444e-05 |
| 9,386 | Rethinking The Compaction Policies in LSM-trees | 2025 | SIGMOD | 4.3455975e-05 |
| 9,465 | Disco: A Compact Index for LSM-trees | 2025 | SIGMOD | 4.3350926e-05 |
| 9,529 | Mnemosyne: Dynamic Workload-Aware BF Tuning via Accurate Statistics in LSM trees | 2025 | SIGMOD | 4.32934e-05 |
| 10,176 | Improving Range Scan Performance in LSM-trees with Group Caching | 2026 | SIGMOD | 4.1945683e-05 |
| 10,255 | How to Write to SSDs | 2026 | VLDB | 4.1945683e-05 |
| 10,407 | MaLT: A Framework for Managing Large Transactions in OceanBase | 2025 | SIGMOD | 4.1945683e-05 |
| 10,558 | BACH: Bridging Adjacency List and CSR Format using LSM-Trees for HGTAP Workloads | 2025 | VLDB | 4.1945683e-05 |
| 10,676 | Meaningful Data Erasure in the Presence of Dependencies | 2025 | VLDB | 4.1945683e-05 |
| 11,049 | On Reducing Space Amplification with Multi-Column Compaction in Apache IoTDB | 2024 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 15 of 15 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 379 | bLSM: A General Purpose Log Structured Merge Tree | 2012 | SIGMOD | 0.0002493527 |
| 569 | Optimizing Space Amplification in RocksDB | 2017 | CIDR | 0.00019924098 |
| 609 | Monkey: Optimal Navigable Key-Value Store | 2017 | SIGMOD | 0.0001923446 |
| 1,311 | Dostoevsky: Better Space-Time Trade-Offs for LSM-Tree Based Key-Value Stores via Adaptive Removal of Superfluous Merging | 2018 | SIGMOD | 0.00012657439 |
| 1,366 | SlimDB: A Space-Efficient Key-Value Storage Engine For Semi-Sorted Data | 2017 | VLDB | 0.00012357685 |
| 1,438 | AsterixDB: A Scalable, Open Source BDMS | 2014 | VLDB | 0.00011973592 |
| 2,004 | X-Engine: An Optimized Storage Engine for Large-scale E-commerce Transaction Processing | 2019 | SIGMOD | 9.811707e-05 |
| 2,109 | The Log-Structured Merge-Bush & the Wacky Continuum | 2019 | SIGMOD | 9.5318694e-05 |
| 2,606 | Design Continuums and the Path Toward Self-Designing Key-Value Stores that Know and Learn | 2019 | CIDR | 8.4645832e-05 |
| 3,386 | Lethe: A Tunable Delete-Aware LSM Engine | 2020 | SIGMOD | 7.1577103e-05 |
| 3,544 | Rosetta: A Robust Space-Time Optimized Range Filter for Key-Value Stores | 2020 | SIGMOD | 6.9898874e-05 |
| 4,588 | Leaper: A Learned Prefetcher for Cache Invalidation in LSM-tree based Storage Engines | 2020 | VLDB | 6.0655418e-05 |
| 5,119 | Design Tradeoffs of Data Access Methods | 2016 | SIGMOD | 5.6807904e-05 |
| 5,308 | Key-Value Storage Engines | 2020 | SIGMOD | 5.576303e-05 |
| 6,231 | An LSM-based Tuple Compaction Framework for Apache AsterixDB | 2020 | VLDB | 5.1457863e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 11,049 | On Reducing Space Amplification with Multi-Column Compaction in Apache IoTDB | 2024 | VLDB | 4.1945683e-05 |
| 1,960 | Compaction management in distributed key-value datastores | 2015 | VLDB | 9.9521444e-05 |
| 9,071 | Structural Designs Meet Optimality: Exploring Optimized LSM-tree Structures in A Colossal Configuration Space | 2024 | SIGMOD | 4.4025274e-05 |
| 7,808 | CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage Disaggregated Infrastructure | 2024 | SIGMOD | 4.6455813e-05 |
| 7,743 | Efficient Data Ingestion and Query Processing for LSM-Based Storage Systems | 2019 | VLDB | 4.6626575e-05 |
| 4,914 | On Performance Stability in LSM-based Storage Systems | 2020 | VLDB | 5.8315684e-05 |
| 7,218 | Breaking Down Memory Walls in LSM-based Storage Systems | 2020 | SIGMOD | 4.7982543e-05 |
| 5,791 | Dissecting, Designing, and Optimizing LSM-based Data Stores | 2022 | SIGMOD | 5.3268999e-05 |
| 9,386 | Rethinking The Compaction Policies in LSM-trees | 2025 | SIGMOD | 4.3455975e-05 |
| 6,113 | Compactionary: A Dictionary for LSM Compactions | 2022 | SIGMOD | 5.20426e-05 |