Database Paper Browser

Parallel In-Situ Data Processing with Speculative Loading

Summary: SCANRAW is a parallel in-situ processing operator for raw files that combines data loading with external tables while preserving zero time-to-query. It uses a parallel super-scalar pipeline with speculative loading to overlap query processing with data conversion, balancing CPU and I/O usage. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 4803
Venue: SIGMOD
Year: 2014
Pagerank: 7.7902322e-05
Overall Rank: 2,973 | 79.32%
DOI: 10.1145/2588555.2593673

Authors

Incoming Citations (Sorted by Pagerank)

Showing 18 of 18 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,552 | Overview of Data Exploration Techniques | 2015 | SIGMOD | 0.00011408814
2,700 | Filter Before You Parse: Faster Analytics on Raw Data with Sparser | 2018 | VLDB | 8.2728509e-05
2,819 | Mison: A Fast JSON Parser for Data Analytics | 2017 | VLDB | 8.0651326e-05
3,437 | Speculative Distributed CSV Data Parsing for Big Data Analytics | 2019 | SIGMOD | 7.0942161e-05
3,465 | GPL: A GPU-based Pipelined Query Processing Engine | 2016 | SIGMOD | 7.0695873e-05
3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05
4,326 | Fast Queries Over Heterogeneous Data Through Engine Customization | 2016 | VLDB | 6.288323e-05
4,704 | JSON Tiles: Fast Analytics on Semi-Structured Data | 2021 | SIGMOD | 5.9853687e-05
6,407 | Just-In-Time Data Virtualization: Lightweight Data Management with ViDa | 2015 | CIDR | 5.076547e-05
7,360 | ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data | 2020 | VLDB | 4.7525925e-05
7,830 | Scalable Structural Index Construction for JSON Analytics | 2021 | VLDB | 4.6388763e-05
8,788 | FishStore: Faster Ingestion with Subset Hashing | 2019 | SIGMOD | 4.451039e-05
9,379 | GIO: Generating Efficient Matrix and Frame Readers for Custom Data Formats by Example | 2023 | SIGMOD | 4.3462787e-05
9,918 | Shared Load(ing): Efficient Bulk Loading into Optimized Storage | 2020 | CIDR | 4.2561557e-05
10,381 | LCP: Enhancing Scientific Data Management with Lossy Compression for Particles | 2025 | SIGMOD | 4.1945683e-05
10,482 | Fast and Scalable Data Transfer Across Data Systems | 2025 | SIGMOD | 4.1945683e-05
11,784 | Alpine: Efficient In situ Data Exploration in the Presence of Updates | 2017 | SIGMOD | 4.1945683e-05
11,850 | Vectorizing an In Situ Query Engine | 2016 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 8 of 8 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers