ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data
Summary: A GPU-based parallel parser for delimiter-separated data that uses an on-the-fly finite-state machine (FSM) to resolve parsing context mid-stream. It achieves 14.2 GB/s on 3584 cores and processes 4.8 GB end-to-end in 0.44 s, including transfers. It scales to thousands of cores and supports multi-format parsing. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 6 of 6 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,476 | Parallel Index-based Stream Join on a Multicore CPU | 2020 | SIGMOD | 5.0496617e-05 |
| 7,830 | Scalable Structural Index Construction for JSON Analytics | 2021 | VLDB | 4.6388763e-05 |
| 9,379 | GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example | 2023 | SIGMOD | 4.3462787e-05 |
| 9,837 | GpJSON: High-performance JSON Data Processing on GPUs | 2025 | VLDB | 4.2740344e-05 |
| 9,953 | Distributed Stream KNN Join | 2021 | SIGMOD | 4.2405999e-05 |
| 10,482 | Fast and Scalable Data Transfer Across Data Systems | 2025 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 17 of 17 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 6,324 | Revisiting Pipelined Parallelism in Multi-Join Query Processing | 2005 | VLDB | 5.1109987e-05 |
| 10,495 | Parallel k-Core Decomposition: Theory and Practice | 2025 | SIGMOD | 4.1945683e-05 |
| 9,330 | Parallel Query Processing: To Separate Communication from Computation | 2022 | SIGMOD | 4.3556432e-05 |
| 3,548 | Adaptive Query Processing on RAW Data | 2014 | VLDB | 6.9859242e-05 |
| 6,629 | A Holistic View of Stream Partitioning Costs | 2017 | VLDB | 4.9880986e-05 |
| 4,671 | Realtime Top-k Personalized PageRank over Large Graphs on GPUs | 2020 | VLDB | 6.0085645e-05 |
| 5,045 | Massive Scale-out of Expensive Continuous Queries | 2011 | VLDB | 5.740793e-05 |
| 2,973 | Parallel In-Situ Data Processing with Speculative Loading | 2014 | SIGMOD | 7.7902322e-05 |
| 2,700 | Filter Before You Parse: Faster Analytics on Raw Data with Sparser | 2018 | VLDB | 8.2728509e-05 |
| 3,437 | Speculative Distributed CSV Data Parsing for Big Data Analytics | 2019 | SIGMOD | 7.0942161e-05 |