Proteus: A Self-Designing Range Filter
Summary: Proteus is a self-designing approximate range filter that tunes itself from samples to minimize the false positive rate (FPR) under a fixed space budget. Its CPFPR model unifies the probabilistic and deterministic design spaces; in RocksDB, Proteus yields up to 5.3x end-to-end gains over SuRF and Rosetta, with low modeling cost and robustness across workloads.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 6477
- Venue: SIGMOD
- Year: 2022
- Pagerank: 5.8905445e-05
- Overall Rank: 4,835 | 66.37%
- DOI: 10.1145/3514221.3526167
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 22 of 22 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
| 5,446 | Grafite: Taming Adversarial Queries with Optimal Range Filters | 2024 | SIGMOD | 5.5018138e-05 |
| 5,739 | InfiniFilter: Expanding Filters to Infinity and Beyond | 2023 | SIGMOD | 5.3471718e-05 |
| 5,762 | Oasis: An Optimal Disjoint Segmented Learned Range Filter | 2024 | VLDB | 5.3377299e-05 |
| 5,863 | GRF: A Global Range Filter for LSM-Trees with Shape Encoding | 2024 | SIGMOD | 5.2979639e-05 |
| 7,620 | Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads | 2023 | SIGMOD | 4.693568e-05 |
| 8,009 | CAMAL: Optimizing LSM-trees via Active Learning | 2024 | SIGMOD | 4.6066863e-05 |
| 8,020 | The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions | 2024 | VLDB | 4.6040862e-05 |
| 8,339 | How to Grow an LSM-tree? Towards Bridging the Gap Between Theory and Practice | 2025 | SIGMOD | 4.5434069e-05 |
| 8,525 | Aleph Filter: To Infinity in Constant Time | 2024 | VLDB | 4.4937074e-05 |
| 8,724 | Memento Filter: A Fast, Dynamic, and Robust Range Filter | 2024 | SIGMOD | 4.4600996e-05 |
| 8,805 | ArceKV: Towards Workload-driven LSM-compactions for Key-Value Store Under Dynamic Workloads | 2026 | VLDB | 4.4466855e-05 |
| 9,071 | Structural Designs Meet Optimality: Exploring Optimized LSM-tree Structures in A Colossal Configuration Space | 2024 | SIGMOD | 4.4025274e-05 |
| 9,218 | Diva: Dynamic Range Filter for Var-Length Keys and Queries | 2025 | VLDB | 4.3702863e-05 |
| 9,317 | Are Joins over LSM-trees Ready? Take RocksDB as an Example | 2025 | VLDB | 4.3556432e-05 |
| 9,386 | Rethinking The Compaction Policies in LSM-trees | 2025 | SIGMOD | 4.3455975e-05 |
| 9,465 | Disco: A Compact Index for LSM-trees | 2025 | SIGMOD | 4.3350926e-05 |
| 9,987 | A Multi-tenant Relational OLTP Database at Salesforce | 2026 | CIDR | 4.1945683e-05 |
| 10,021 | Hourglass: An Adaptive Range Filter with Lightweight Hybrid Encoding | 2026 | SIGMOD | 4.1945683e-05 |
| 10,137 | Aeris Filter: A Strongly and Monotonically Adaptive Range Filter | 2026 | SIGMOD | 4.1945683e-05 |
| 10,176 | Improving Range Scan Performance in LSM-trees with Group Caching | 2026 | SIGMOD | 4.1945683e-05 |
| 11,075 | LavaStore: ByteDance's Purpose-built, High-performance, Cost-effective Local Storage Engine for Cloud Services | 2024 | VLDB | 4.1945683e-05 |
| 11,222 | A Learned Cuckoo Filter for Approximate Membership Queries over Variable-sized Sliding Windows on Data Streams | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| 102 | The Case for Learned Index Structures | 2018 | SIGMOD | 0.00049545203 |
| 679 | Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems | 2012 | SIGMOD | 0.00018215154 |
| 1,169 | SuRF: Practical Range Query Filtering with Fast Succinct Tries | 2018 | SIGMOD | 0.00013536447 |
| 1,248 | Don't Thrash: How to Cache Your Hash on Flash | 2012 | VLDB | 0.00013046661 |
| 1,438 | AsterixDB: A Scalable, Open Source BDMS | 2014 | VLDB | 0.00011973592 |
| 1,460 | Benchmarking Learned Indexes | 2021 | VLDB | 0.00011887068 |
| 1,471 | Adaptive Range Filters for Cold Data: Avoiding Trips to Siberia | 2013 | VLDB | 0.00011830111 |
| 1,515 | vChain: Enabling Verifiable Boolean Range Queries over Blockchain Databases | 2019 | SIGMOD | 0.00011591553 |
| 1,670 | Amazon DynamoDB: A Seamlessly Scalable Non-relational Datastore | 2012 | SIGMOD | 0.00010953756 |
| 2,157 | The Data Calculator*: Data Structure Design and Cost Synthesis from First Principles and Learned Cost Models | 2018 | SIGMOD | 9.416022e-05 |
| 3,544 | Rosetta: A Robust Space-Time Optimized Range Filter for Key-Value Stores | 2020 | SIGMOD | 6.9898874e-05 |
| 4,994 | Stacked Filters: Learning to Filter by Structure | 2021 | VLDB | 5.78027e-05 |
| 5,308 | Key-Value Storage Engines | 2020 | SIGMOD | 5.576303e-05 |
| 7,174 | Coconut Palm: Static and Streaming Data Series Exploration Now in your Palm | 2019 | SIGMOD | 4.8114555e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| 6,803 | Proteus: Autonomous Adaptive Storage for Mixed Workloads | 2022 | SIGMOD | 4.9224958e-05 |
| 4,994 | Stacked Filters: Learning to Filter by Structure | 2021 | VLDB | 5.78027e-05 |
| 8,724 | Memento Filter: A Fast, Dynamic, and Robust Range Filter | 2024 | SIGMOD | 4.4600996e-05 |
| 10,137 | Aeris Filter: A Strongly and Monotonically Adaptive Range Filter | 2026 | SIGMOD | 4.1945683e-05 |
| 4,326 | Fast Queries Over Heterogeneous Data Through Engine Customization | 2016 | VLDB | 6.288323e-05 |
| 5,762 | Oasis: An Optimal Disjoint Segmented Learned Range Filter | 2024 | VLDB | 5.3377299e-05 |
| 5,446 | Grafite: Taming Adversarial Queries with Optimal Range Filters | 2024 | SIGMOD | 5.5018138e-05 |
| 3,611 | SNARF: A Learning-Enhanced Range Filter | 2022 | VLDB | 6.9191399e-05 |
| 1,169 | SuRF: Practical Range Query Filtering with Fast Succinct Tries | 2018 | SIGMOD | 0.00013536447 |
| 3,544 | Rosetta: A Robust Space-Time Optimized Range Filter for Key-Value Stores | 2020 | SIGMOD | 6.9898874e-05 |