Experimental Analysis of Large-scale Learnable Vector Storage Compression
Summary: Taxonomy and comprehensive benchmark of 14 embedding-compression methods for large-scale learnable vectors using a uniform testbed. Quantifies memory–quality trade-offs, recommends per-use-case winners, and exposes method limitations and research gaps.
(summarized by gpt-5-mini on Feb 09 2026)
- Paper ID: 13756
- Venue: VLDB
- Year: 2024
- Pagerank: 4.3441378e-05
- Overall Rank: 9,408 | 34.56%
- DOI: 10.14778/3636218.3636234
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 16 of 16 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211 |
| 984 | Natural language to SQL: Where are we today? | 2020 | VLDB | 0.00014857465 |
| 1,366 | SlimDB: A Space-Efficient Key-Value Storage Engine For Semi-Sorted Data | 2017 | VLDB | 0.00012357685 |
| 2,152 | MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis | 2018 | SIGMOD | 9.4239787e-05 |
| 2,262 | Manu: A Cloud Native Vector Database Management System | 2022 | VLDB | 9.1624446e-05 |
| 2,677 | HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework | 2022 | VLDB | 8.3268401e-05 |
| 3,169 | QueryFormer: A Tree Transformer Model for Query Plan Representation | 2022 | VLDB | 7.4498425e-05 |
| 3,499 | Fauce: Fast and Accurate Deep Ensembles with Uncertainty for Cardinality Estimation | 2021 | VLDB | 7.0376445e-05 |
| 3,803 | Scaling Attributed Network Embedding to Massive Graphs | 2021 | VLDB | 6.7550628e-05 |
| 4,462 | LOGER: A Learned Optimizer towards Generating Efficient and Robust Query Execution Plans | 2023 | VLDB | 6.1611784e-05 |
| 5,052 | HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training | 2022 | SIGMOD | 5.7337977e-05 |
| 5,377 | Parallel Training of Knowledge Graph Embedding Models: A Comparison of Techniques | 2022 | VLDB | 5.5410858e-05 |
| 6,738 | Agile and Accurate CTR Prediction Model Training for Massive-Scale Online Advertising Systems | 2021 | SIGMOD | 4.9452647e-05 |
| 7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05 |
| 7,256 | Effective and Efficient Retrieval of Structured Entities | 2020 | VLDB | 4.7869419e-05 |
| 7,474 | Cardinality Estimation of Approximate Substring Queries using Deep Learning | 2022 | VLDB | 4.7194345e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,657 | Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices | 2022 | VLDB | 4.4730648e-05 |
| 4,731 | Graph-Based Vector Search: An Experimental Evaluation of the State-of-the-Art | 2025 | SIGMOD | 5.966659e-05 |
| 2,862 | An Experimental Study of Bitmap Compression vs. Inverted List Compression | 2017 | SIGMOD | 7.9898539e-05 |
| 11,574 | An Evaluation of Methods of Compressing Doubles | 2020 | SIGMOD | 4.1945683e-05 |
| 3,745 | DeepSqueeze: Deep Semantic Compression for Tabular Data | 2020 | SIGMOD | 6.7926132e-05 |
| 7,429 | CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases | 2022 | SIGMOD | 4.7320139e-05 |
| 10,741 | Beyond Compression: A Comprehensive Evaluation of Lossless Floating-Point Compression | 2025 | VLDB | 4.1945683e-05 |
| 9,402 | CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models | 2024 | SIGMOD | 4.3441378e-05 |
| 3,609 | Similarity search in the blink of an eye with compressed indices | 2023 | VLDB | 6.9215236e-05 |
| 1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05 |