ContextCache: Context-Aware Semantic Cache for Multi-Turn Queries in Large Language Models
Summary: ContextCache is a context-aware semantic cache for multi-turn LLM dialogues. It first retrieves vector-based candidates for the current query, then refines the matches by applying self-attention over the current and historical dialogue representations. It improves precision and recall over per-query caches on real conversations and delivers roughly 10× lower latency for cached responses, cutting LLM compute costs.
(summarized by gpt-5-mini on Feb 09 2026)
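The two-stage lookup described in the summary can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names `Entry`, `attend`, and `lookup`, the thresholds, and the toy 3-dimensional embeddings are all assumptions, and the trained self-attention model is replaced here by a single untrained scaled dot-product attention step with identity projections.

```python
import math
from dataclasses import dataclass


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def attend(query, turns):
    """Pool turn embeddings with one scaled dot-product attention step.

    Assumption: identity projections stand in for a trained attention model.
    """
    scale = math.sqrt(len(query))
    scores = [dot(query, t) / scale for t in turns]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    return [
        sum(w * t[i] for w, t in zip(weights, turns)) / total
        for i in range(len(query))
    ]


@dataclass
class Entry:
    """Hypothetical cache record: query, preceding turns, cached response."""
    query_vec: list
    history_vecs: list
    response: str


def lookup(cache, query_vec, history_vecs, k=3,
           query_thresh=0.8, context_thresh=0.9):
    # Stage 1: keep the top-k entries whose query embedding is close enough.
    scored = sorted(((cosine(query_vec, e.query_vec), e) for e in cache),
                    key=lambda pair: pair[0], reverse=True)
    candidates = [e for score, e in scored[:k] if score >= query_thresh]

    # Stage 2: compare context-aware representations (history + current turn).
    ctx = attend(query_vec, history_vecs + [query_vec])
    best, best_sim = None, context_thresh
    for e in candidates:
        e_ctx = attend(e.query_vec, e.history_vecs + [e.query_vec])
        sim = cosine(ctx, e_ctx)
        if sim >= best_sim:
            best, best_sim = e, sim
    return best.response if best else None  # None means a cache miss


# Toy data: the same follow-up question asked in two different conversations.
PRICE_Q = [1.0, 0.0, 0.0]
ABOUT_A = [0.0, 1.0, 0.0]
ABOUT_B = [0.0, 0.0, 1.0]
cache = [
    Entry(PRICE_Q, [ABOUT_A], "Product A costs $10."),
    Entry(PRICE_Q, [ABOUT_B], "Product B costs $25."),
]

print(lookup(cache, PRICE_Q, [ABOUT_A]))  # prints the product A answer
```

In this toy setup both cached entries pass the first stage, since their query embeddings are identical. Only the second stage separates them, which is the failure mode of a per-query cache that the context-aware refinement is meant to address.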
- Paper ID: 14164
- Venue: VLDB
- Year: 2025
- Pagerank: -
- Overall Rank: 13,135 | 8.63%
- DOI: 10.14778/3750601.3750679
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
Outgoing Citations (Sorted by Pagerank)
Showing 0 of 0 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,103 | AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference | 2025 | SIGMOD | 4.3958197e-05 |
| 10,148 | Chatty-KG: A Multi-Agent AI System for On-Demand Conversational Question Answering over Knowledge Graphs | 2026 | SIGMOD | 4.1945683e-05 |
| 9,234 | Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL | 2025 | VLDB | 4.3690661e-05 |
| 13,088 | OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration | 2026 | VLDB | - |
| 10,170 | From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation | 2026 | SIGMOD | 4.1945683e-05 |
| 10,374 | Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs | 2025 | SIGMOD | 4.1945683e-05 |
| 10,222 | RetroInfer: A Vector Storage Engine for Scalable Long-Context LLM Inference | 2026 | VLDB | 4.1945683e-05 |
| 10,066 | DepCache: A KV Cache Management Framework for GraphRAG with Dependency Attention | 2026 | SIGMOD | 4.1945683e-05 |
| 6,357 | PQCache: Product Quantization-based KVCache for Long Context LLM Inference | 2025 | SIGMOD | 5.0970739e-05 |
| 3,565 | Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation | 2025 | SIGMOD | 6.9655362e-05 |