Cracking Vector Search Indexes
Summary: CrackIVF is a partition-based, adaptive approximate nearest neighbor search (ANNS) index that incrementally "cracks" partitions in response to live RAG queries, so it can serve embedding data lakes without a costly upfront build. It progressively converges to the quality of a conventional index while providing immediate access and 10–1000× faster initialization for cold or infrequently used datasets.
(summarized by gpt-5-mini on Feb 09 2026)
- Paper ID: 14014
- Venue: VLDB
- Year: 2025
- Pagerank: 4.1945683e-05
- Overall Rank: 10,711 | 25.49%
- DOI: 10.14778/3749646.3749666
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
Outgoing Citations (Sorted by Pagerank)
Showing 23 of 23 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2 | R-Trees: A Dynamic Index Structure For Spatial Searching | 1984 | SIGMOD | 0.0032169493 |
| 79 | A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces | 1998 | VLDB | 0.00056242144 |
| 408 | Database Cracking | 2007 | CIDR | 0.00023953844 |
| 495 | Milvus: A Purpose-Built Vector Data Management System | 2021 | SIGMOD | 0.00021767688 |
| 1,377 | Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics | 2021 | CIDR | 0.00012296941 |
| 1,541 | Symphony: Towards Natural Language Query Answering over Multi-modal Data Lakes | 2023 | CIDR | 0.00011456579 |
| 2,106 | Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing | 2025 | CIDR | 9.5342543e-05 |
| 2,262 | Manu: A Cloud Native Vector Database Management System | 2022 | VLDB | 9.1624446e-05 |
| 2,320 | High-Throughput Vector Similarity Search in Knowledge Graphs | 2023 | SIGMOD | 9.0366225e-05 |
| 2,730 | Open Data Integration | 2018 | VLDB | 8.2126735e-05 |
| 3,680 | SingleStore-V: An Integrated Vector Database System in SingleStore | 2024 | VLDB | 6.8496415e-05 |
| 3,772 | FEXIPRO: Fast and Exact Inner Product Retrieval in Recommender Systems | 2017 | SIGMOD | 6.7761705e-05 |
| 3,876 | The Design of an LLM-powered Unstructured Analytics System | 2025 | CIDR | 6.6741456e-05 |
| 4,106 | Extracting Databases from Dark Data with DeepDive | 2016 | SIGMOD | 6.4456184e-05 |
| 4,506 | Stochastic Database Cracking: Towards Robust Adaptive Indexing in Main-Memory Column-Stores | 2012 | VLDB | 6.1319277e-05 |
| 4,755 | Indexing for Interactive Exploration of Big Data Series | 2014 | SIGMOD | 5.946863e-05 |
| 5,129 | Navigating Labels and Vectors: A Unified Approach to Filtered Approximate Nearest Neighbor Search | 2024 | SIGMOD | 5.6755204e-05 |
| 7,095 | Dumpy: A Compact and Adaptive Index for Large Data Series Collections | 2023 | SIGMOD | 4.8350023e-05 |
| 7,606 | Tribase: A Vector Data Query Engine for Reliable and Lossless Pruning Compression using Triangle Inequalities | 2025 | SIGMOD | 4.6967106e-05 |
| 7,879 | PDX: A Data Layout for Vector Similarity Search | 2025 | SIGMOD | 4.6292417e-05 |
| 9,103 | AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference | 2025 | SIGMOD | 4.3958197e-05 |
| 9,283 | Adaptive Indexing in High-Dimensional Metric Spaces | 2023 | VLDB | 4.3631652e-05 |
| 9,767 | Adaptive Indexing of Objects with Spatial Extent | 2023 | VLDB | 4.2856106e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,978 | Fast Vector Search in PostgreSQL: A Decoupled Approach | 2026 | CIDR | 4.1945683e-05 |
| 9,480 | Cost-Effective, Low Latency Vector Search with Azure Cosmos DB | 2025 | VLDB | 4.3341665e-05 |
| 10,170 | From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation | 2026 | SIGMOD | 4.1945683e-05 |
| 408 | Database Cracking | 2007 | CIDR | 0.00023953844 |
| 10,204 | Reveal Hidden Pitfalls and Navigate Next Generation of Vector Similarity Search from Task-Centric Views: [Experiments & Analysis] | 2026 | SIGMOD | 4.1945683e-05 |
| 10,058 | Building Stateless Serverless Vector DBs via Block-based Data Partitioning | 2026 | SIGMOD | 4.1945683e-05 |
| 3,565 | Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation | 2025 | SIGMOD | 6.9655362e-05 |
| 10,654 | HAKES: Scalable Vector Database for Embedding Search Service | 2025 | VLDB | 4.1945683e-05 |
| 10,760 | Turbocharging Vector Databases using Modern SSDs | 2025 | VLDB | 4.1945683e-05 |
| 8,245 | MIRAGE-ANNS: Mixed Approach Graph-based Indexing for Approximate Nearest Neighbor Search | 2025 | SIGMOD | 4.5514956e-05 |