AquaPipe: A Quality-Aware Pipeline for Knowledge Retrieval and Large Language Models
Summary: AquaPipe pipelines disk-based ANNS with LLM prefill to overlap retrieval and inference in RAG. Recall-aware prefetching, adaptive prefill, and dynamic chunking balance latency and GPU efficiency, masking 56–99% of retrieval latency and delivering 1.3–2.6x speedups at ~1% recall loss. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Runjie Yu
2. Weizhou Huang
3. Shuhan Bai
4. Jian Zhou
5. Fei Wu
Incoming Citations (Sorted by Pagerank)
Showing 3 of 3 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,086 | High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU | 2026 | SIGMOD | 4.1945683e-05 |
| 10,111 | Scalable Graph Indexing using GPUs for Approximate Nearest Neighbor Search | 2026 | SIGMOD | 4.1945683e-05 |
| 10,130 | MorphingDB: A Task-Centric AI-Native DBMS for Model Management and Inference | 2026 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 12 of 12 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.