Streaming Similarity Search over one Billion Tweets using Parallel Locality-Sensitive Hashing
Summary: The paper presents a parallel locality-sensitive hashing (LSH) system for streaming similarity search over more than one billion tweets, scaling across nodes and cores. Its contributions are cache-conscious hash tables, a two-level merge for index construction, duplicate elimination during querying, insert-optimized structures with streaming expiration, and a performance model. Reported results are 1–2.5 ms queries and roughly 8x speedups over basic LSH and inverted indexes. (summarized by gpt-5-nano on Feb 09 2026)
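The summary mentions multi-table LSH with duplicate elimination during querying. As a rough illustration only, the sketch below uses generic random-hyperplane LSH with a set to drop candidates that collide with the query in more than one table. It is not the paper's implementation: the class name `HyperplaneLSH`, its parameters, and the choice of hyperplane hashing are assumptions made for this example.

```python
import random
from collections import defaultdict


class HyperplaneLSH:
    """Toy LSH index (hypothetical): several tables keyed by k-bit signatures."""

    def __init__(self, dim, num_tables=4, bits_per_key=8, seed=0):
        rng = random.Random(seed)
        # One group of random hyperplanes per table (illustrative parameters).
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_key)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _key(self, table_idx, vec):
        # Each bit records which side of one hyperplane the vector falls on.
        key = 0
        for plane in self.planes[table_idx]:
            dot = sum(p * v for p, v in zip(plane, vec))
            key = (key << 1) | (1 if dot >= 0 else 0)
        return key

    def insert(self, item_id, vec):
        # The item is stored once in every table, under that table's key.
        for t, table in enumerate(self.tables):
            table[self._key(t, vec)].append(item_id)

    def candidates(self, vec):
        # The same item may collide with the query in several tables;
        # the set removes those duplicates before any distance computation.
        seen = set()
        for t, table in enumerate(self.tables):
            seen.update(table.get(self._key(t, vec), ()))
        return seen


index = HyperplaneLSH(dim=3, seed=1)
index.insert("a", [1.0, 2.0, 3.0])
print(index.candidates([1.0, 2.0, 3.0]))
```

In this toy version an identical query vector collides with the stored item in every table, so without the set the same identifier would be returned once per table.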
Incoming Non-self Citations Over Time
Authors
1. Narayanan Sundaram
2. Aizana Turmukhametova
3. Nadathur Satish
4. Todd Mostak
5. Piotr Indyk
6. Samuel Madden
7. Pradeep Dubey
Incoming Citations (Sorted by Pagerank)
Showing 12 of 12 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 233 | A Study of Index Structures for Main Memory Database Management Systems | 1986 | VLDB | 0.00032021526 |
| 251 | Robust and Fast Similarity Search for Moving Object Trajectories | 2005 | SIGMOD | 0.00030644658 |
| 351 | Sort vs. Hash Revisited: Fast Join Implementation on Modern Multi-Core CPUs | 2009 | VLDB | 0.0002636504 |
| 572 | Substructure Similarity Search in Graph Databases | 2005 | SIGMOD | 0.00019887011 |
| 877 | Effective Keyword Search in Relational Databases | 2006 | SIGMOD | 0.00015714014 |
| 930 | Fast Sort on CPUs and GPUs: A Case for Bandwidth Oblivious SIMD Sort | 2010 | SIGMOD | 0.00015238545 |
| 1,944 | WHAM: A High-throughput Sequence Alignment Method | 2011 | SIGMOD | 0.00010004608 |