Database Paper Browser

Sieve: A Learned Data-Skipping Index for Data Analytics

Summary: Sieve is a learned data-skipping index that models block-level value distributions with piecewise-linear functions, capturing real-world patterns that per-block min/max summaries and histograms miss. By grouping adjacent keys into regions and trading storage for fewer false positives, Sieve reduces the number of blocks accessed by up to 80% and query time by 42% in Presto evaluations. (summarized by gpt-5-mini on Feb 09 2026)
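
The storage versus false-positive trade-off mentioned in the summary can be illustrated with a toy sketch. This is not the paper's algorithm: it omits the piecewise-linear modeling entirely and simply groups adjacent keys into fixed-width regions, each remembering which blocks contain a key from that region. All names (`build_zone_map`, `build_region_index`, `region_width`) and the sample data are hypothetical, and keys are assumed to be non-negative integers.

```python
def build_zone_map(blocks):
    """Baseline: one (min, max) summary per block."""
    return [(min(b), max(b)) for b in blocks]


def zone_map_lookup(zone_map, key):
    """Blocks whose [min, max] range covers the key (may include false positives)."""
    return [i for i, (lo, hi) in enumerate(zone_map) if lo <= key <= hi]


def build_region_index(blocks, region_width):
    """Group adjacent keys into fixed-width regions (an assumption of this sketch).

    Each region [r * width, (r + 1) * width) maps to the set of block ids
    that hold at least one key from that region.
    """
    regions = {}
    for block_id, block in enumerate(blocks):
        for key in block:
            regions.setdefault(key // region_width, set()).add(block_id)
    return regions


def region_lookup(regions, region_width, key):
    """Blocks that may hold the key; everything else can be skipped."""
    return sorted(regions.get(key // region_width, ()))


# Hypothetical data: most blocks hold small keys plus one large outlier,
# which makes their min/max ranges almost useless for skipping.
blocks = [
    [1, 2, 90],
    [3, 4, 95],
    [50, 51, 52],
    [5, 6, 99],
]

zone_map = build_zone_map(blocks)
fine = build_region_index(blocks, region_width=10)     # more entries, fewer false positives
coarse = build_region_index(blocks, region_width=100)  # one entry, no skipping

print(zone_map_lookup(zone_map, 50))    # [0, 1, 2, 3]
print(region_lookup(fine, 10, 50))      # [2]
print(region_lookup(coarse, 100, 50))   # [0, 1, 2, 3]
print(len(fine), len(coarse))           # 3 1
```

In this example, a lookup for key 50 reads all four blocks with min/max summaries but only one block with the fine-grained regions, at the cost of storing three region entries instead of one. Widening the regions shrinks the index and brings the false positives back.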

Paper ID: 13158
Venue: VLDB
Year: 2023
Pagerank: 4.5555621e-05
Overall Rank: 8,222 | 42.81%
DOI: 10.14778/3611479.3611520

Incoming Citations (Sorted by Pagerank)

Showing 4 of 4 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 20 of 20 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
102 | The Case for Learned Index Structures | 2018 | SIGMOD | 0.00049545203
241 | DB2 with BLU Acceleration: So Much More than Just a Column Store | 2013 | VLDB | 0.00031420034
368 | Small Materialized Aggregates: A Light Weight Index Structure for Data Warehousing | 1998 | VLDB | 0.000254931
826 | ALEX: An Updatable Adaptive Learned Index | 2020 | SIGMOD | 0.00016224841
857 | The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds | 2020 | VLDB | 0.00015882892
1,375 | FITing-Tree: A Data-aware Index Structure | 2019 | SIGMOD | 0.00012303141
1,913 | BF-Tree: Approximate Tree Indexing | 2014 | VLDB | 0.00010113937
1,989 | Column Imprints: A Secondary Index Structure | 2013 | SIGMOD | 9.8478437e-05
2,140 | Online Piece-wise Linear Approximation of Numerical Streams with Precision Guarantees* | 2009 | VLDB | 9.4626098e-05
3,152 | AnalyticDB: Real-time OLAP Database System at Alibaba Cloud | 2019 | VLDB | 7.4711766e-05
3,608 | Column Sketches: A Scan Accelerator for Rapid and Robust Predicate Evaluation | 2018 | SIGMOD | 6.924272e-05
3,737 | Skipping-oriented Partitioning for Columnar Layouts | 2017 | VLDB | 6.8033227e-05
3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05
3,912 | Two Birds, One Stone: A Fast, yet Lightweight, Indexing Scheme for Modern Database Systems | 2017 | VLDB | 6.6354964e-05
3,922 | Pushing Data-Induced Predicates Through Joins in Big-Data Clusters | 2020 | VLDB | 6.6291079e-05
4,158 | Performance-Optimal Filtering: Bloom Overtakes Cuckoo at High Throughput | 2019 | VLDB | 6.3994318e-05
5,315 | Cuckoo Index: A Lightweight Secondary Index Structure | 2020 | VLDB | 5.5723424e-05
5,428 | The Price of Tailoring the Index to Your Data: Poisoning Attacks on Learned Index Structures | 2022 | SIGMOD | 5.5091613e-05
6,850 | Petabyte Scale Databases and Storage Systems at Facebook | 2013 | SIGMOD | 4.9085019e-05
9,665 | Fingerprints for Compressed Columnar Data Search | 2019 | SIGMOD | 4.3082524e-05