Parallel In-Situ Data Processing with Speculative Loading
Summary: SCANRAW is a parallel in-situ operator for processing raw files. It integrates data loading with external tables while preserving zero time-to-query. A parallel super-scalar pipeline with speculative loading overlaps query processing with data conversion, balancing CPU and I/O usage. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Yu Cheng
2. Florin Rusu
Incoming Citations (Sorted by Pagerank)
Showing 18 of 18 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 115 | Eddies: Continuously Adaptive Query Processing | 2000 | SIGMOD | 0.00046221215 |
| 928 | Requirements for Science Data Bases and SciDB | 2009 | CIDR | 0.00015247726 |
| 1,299 | The DataPath System: A Data-Centric Analytic Processing Engine for Large Data Warehouses | 2010 | SIGMOD | 0.00012751522 |
| 1,343 | NoDB: Efficient Query Execution on Raw Data Files | 2012 | SIGMOD | 0.00012482538 |
| 2,017 | The Researcher’s Guide to the Data Deluge: Querying a Scientific Database in Just a Few Seconds | 2011 | VLDB | 0.000097810458 |
| 2,322 | Instant Loading for Main Memory Databases | 2013 | VLDB | 0.00009034874 |
| 2,363 | Merging What’s Cracked, Cracking What’s Merged: Adaptive Indexing in Main-Memory Column-Stores | 2011 | VLDB | 0.000089580928 |
| 2,367 | Here are my Data Files. Here are my Queries. Where are my Results? | 2011 | CIDR | 0.000089511058 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,429 | A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses | 2009 | VLDB | 0.00012033518 |
| 2,367 | Here are my Data Files. Here are my Queries. Where are my Results? | 2011 | CIDR | 0.000089511058 |
| 12,266 | Ten Thousand SQLs: Parallel Keyword Queries Computing | 2010 | VLDB | 0.000041945683 |
| 6,304 | Elastic Pipelining in an In-Memory Database Cluster | 2016 | SIGMOD | 0.000051210182 |
| 3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 0.00006659442 |
| 7,360 | ParPaRaw: Massively Parallel Parsing of Delimiter-Separated Raw Data | 2020 | VLDB | 0.000047525925 |
| 2,322 | Instant Loading for Main Memory Databases | 2013 | VLDB | 0.00009034874 |
| 11,850 | Vectorizing an In Situ Query Engine | 2016 | SIGMOD | 0.000041945683 |
| 1,343 | NoDB: Efficient Query Execution on Raw Data Files | 2012 | SIGMOD | 0.00012482538 |
| 3,548 | Adaptive Query Processing on RAW Data | 2014 | VLDB | 0.000069859242 |