Scheduling Shared Scans of Large Data Files
Summary: Studies how to schedule scans of large shared data files under many concurrent requests, aiming to maximize throughput through aggressive cross-job I/O sharing. Proposes a family of sharable-workload policies that deprioritize scans when future sharable demand is high; simulations on synthetic and real workloads show gains over SJF baselines. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Parag Agrawal
2. Daniel Kifer
3. Christopher Olston
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 947 | MRShare: Sharing Across Multiple Queries in MapReduce | 2010 | VLDB | 0.00015114576 |
| 1,874 | Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems | 2014 | SIGMOD | 0.00010244443 |
| 2,205 | ReStore: Reusing Results of MapReduce Jobs | 2012 | VLDB | 9.2920002e-05 |
| 2,311 | On Improving User Response Times in Tableau | 2015 | SIGMOD | 9.0539767e-05 |
| 3,062 | Efficient Multi-way Theta-Join Processing Using MapReduce | 2012 | VLDB | 7.6343994e-05 |
| 7,689 | ROBUS: Fair Cache Allocation for Data-parallel Workloads | 2017 | SIGMOD | 4.6765769e-05 |
| 11,976 | Anti-Combining for MapReduce | 2014 | SIGMOD | 4.1945683e-05 |
| 12,287 | LifeRaft: Data-Driven, Batch Processing for the Exploration of Scientific Databases | 2009 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 515 | QPipe: A Simultaneously Pipelined Relational Query Engine | 2005 | SIGMOD | 0.00021214633 |
| 1,026 | Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS | 2007 | VLDB | 0.00014589172 |
| 2,113 | Red Brick Warehouse: A Read-Mostly RDBMS for Open SMP Platforms | 1994 | SIGMOD | 9.5276869e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,386 | File Allocation in Distributed Databases with Interaction between Files | 1983 | VLDB | 4.7441292e-05 |
| 4,282 | Scaling Up Concurrent Main-Memory Column-Store Scans: Towards Adaptive NUMA-aware Data and Task Placement | 2015 | VLDB | 6.293052e-05 |
| 8,682 | Efficient Scheduling of Heterogeneous Continuous Queries | 2006 | VLDB | 4.4687791e-05 |
| 2,459 | Multi-dimensional Resource Scheduling for Parallel Queries | 1996 | SIGMOD | 8.7676516e-05 |
| 830 | Main-Memory Scan Sharing For Multi-Core CPUs | 2008 | VLDB | 0.00016171897 |
| 3,124 | Parallel Query Scheduling and Optimization with Time- and Space-Shared Resources | 1997 | VLDB | 7.5201555e-05 |
| 947 | MRShare: Sharing Across Multiple Queries in MapReduce | 2010 | VLDB | 0.00015114576 |
| 2,925 | Shared Workload Optimization | 2014 | VLDB | 7.888494e-05 |
| 3,703 | Multi-Query Optimization in MapReduce Framework | 2014 | VLDB | 6.8289978e-05 |
| 13,380 | Job Scheduling with Minimizing Data Communication Costs | 2015 | SIGMOD | - |