Traversing Large Graphs on GPUs with Unified Memory
Summary: Evaluates BFS on large graphs with unified memory, pinpointing slowdowns from host-memory access and irregular data patterns. Proposes HALO (Harmonic Locality Ordering), an offline pre-processing step for static graphs that yields 1.5x-1.9x speedups and ties locality ordering to graph compression via recursive bisection. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Prasun Gera
2. Hyojong Kim
3. Piyush Sao
4. Hyesoon Kim
5. David Bader
Incoming Citations (Sorted by Pagerank)
Showing 9 of 9 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,103 | Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture | 2021 | VLDB | 0.00014025101 |
| 5,699 | EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs | 2021 | VLDB | 5.3654927e-05 |
| 5,799 | CGgraph: An Ultra-fast Graph Processing System on Modern Commodity CPU-GPU Co-processor | 2024 | VLDB | 5.3219334e-05 |
| 6,059 | Cache-Efficient Fork-Processing Patterns on Large Graphs | 2021 | SIGMOD | 5.2307519e-05 |
| 6,942 | Efficient Training of Graph Neural Networks on Large Graphs | 2024 | VLDB | 4.8922884e-05 |
| 6,985 | CompressGraph: Efficient Parallel Graph Analytics with Rule-Based Compression | 2023 | SIGMOD | 4.8729387e-05 |
| 7,225 | Self-adaptive Graph Traversal on GPUs | 2021 | SIGMOD | 4.7956162e-05 |
| 8,157 | TOD: GPU-accelerated Outlier Detection via Tensor Operations | 2023 | VLDB | 4.5730908e-05 |
| 10,705 | Efficient Graph Data Access for Out-of-Memory GPU Streaming Graph Processing | 2025 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 342 | EmptyHeaded: A Relational Engine for Graph Processing | 2016 | SIGMOD | 0.00026795977 |
| 1,676 | Speedup Graph Processing by Graph Ordering | 2016 | SIGMOD | 0.00010946423 |
| 1,973 | Speeding Up Set Intersections in Graph Algorithms using SIMD Instructions | 2018 | SIGMOD | 9.8913631e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,863 | Towards Sufficient GPU-accelerated Dynamic Graph Management: Survey and Experiment | 2025 | VLDB | 4.1945683e-05 |
| 1,676 | Speedup Graph Processing by Graph Ordering | 2016 | SIGMOD | 0.00010946423 |
| 4,577 | Accelerating Dynamic Graph Analytics on GPUs | 2018 | VLDB | 6.0709631e-05 |
| 5,799 | CGgraph: An Ultra-fast Graph Processing System on Modern Commodity CPU-GPU Co-processor | 2024 | VLDB | 5.3219334e-05 |
| 1,685 | Fast Iterative Graph Computation with Block Updates | 2013 | VLDB | 0.0001091808 |
| 3,641 | GPU-Accelerated Subgraph Enumeration on Partitioned Graphs | 2020 | SIGMOD | 6.8884895e-05 |
| 3,233 | iBFS: Concurrent Breadth-First Search on GPUs | 2016 | SIGMOD | 7.3361904e-05 |
| 5,699 | EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs | 2021 | VLDB | 5.3654927e-05 |
| 7,225 | Self-adaptive Graph Traversal on GPUs | 2021 | SIGMOD | 4.7956162e-05 |
| 4,522 | GPU-based Graph Traversal on Compressed Graphs | 2019 | SIGMOD | 6.1146374e-05 |