Database Paper Browser


Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads

Summary: Cross-industry traces show interactive, small-scale MapReduce workloads coexisting with large batch jobs. They reveal nonuniform data access and diverse, query-like patterns, motivating RDBMS-inspired techniques and a first step toward a MapReduce benchmark. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 10417
Venue: VLDB
Year: 2012
Pagerank: 0.0001488055
Overall Rank: 979 | 93.20%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 21 of 21 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,873 | An Architecture for Compiling UDF-centric Workflows | 2015 | VLDB | 0.00010253002
2,418 | Tupleware: "Big" Data, Big Analytics, Small Clusters | 2015 | CIDR | 8.8556595e-05
2,548 | An Evaluation of Distributed Concurrency Control | 2017 | VLDB | 8.5652459e-05
3,547 | Parallel Analytics as a Service | 2013 | SIGMOD | 6.9862051e-05
3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05
6,075 | Opportunistic Physical Design for Big Data Analytics | 2014 | SIGMOD | 5.223901e-05
6,104 | Automating Distributed Tiered Storage Management in Cluster Computing | 2020 | VLDB | 5.2080102e-05
6,821 | Hadoop's Adolescence: An analysis of Hadoop usage in scientific workloads | 2013 | VLDB | 4.9156923e-05
7,067 | JetScope: Reliable and Interactive Analytics at Cloud Scale | 2015 | VLDB | 4.8440936e-05
7,476 | Lachesis: Automatic Partitioning for UDF-Centric Analytics | 2021 | VLDB | 4.7188928e-05
8,002 | Pangea: Monolithic Distributed Storage for Data Analytics | 2019 | VLDB | 4.6088289e-05
8,924 | QMapper for Smart Grid: Migrating SQL-based Application to Hive | 2015 | SIGMOD | 4.427232e-05
9,066 | Tempo: Robust and Self-Tuning Resource Management in Multi-tenant Parallel Databases | 2016 | VLDB | 4.4035481e-05
9,546 | Trident: Task Scheduling over Tiered Storage Systems in Big Data Platforms | 2021 | VLDB | 4.3259935e-05
9,547 | Optimistic Recovery for Iterative Dataflows in Action | 2015 | SIGMOD | 4.3259935e-05
9,612 | Workload-Aware CPU Performance Scaling for Transactional Database Systems | 2018 | SIGMOD | 4.3177432e-05
11,197 | QaaD (Query-as-a-Data): Scalable Execution of Massive Number of Small Queries in Spark | 2023 | SIGMOD | 4.1945683e-05
11,885 | The Case for Small Data Management | 2015 | CIDR | 4.1945683e-05
11,907 | Thrifty: Offering Parallel Database as a Service using the Shared-Process Approach | 2015 | SIGMOD | 4.1945683e-05
12,001 | Thoth: Towards Managing a Multi-System Cluster | 2014 | VLDB | 4.1945683e-05
12,059 | Workload Management for Big Data Analytics | 2013 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 13 of 13 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers