Magnet: Push-based Shuffle Service for Large-scale Data Processing
Summary: Magnet is a push-based shuffle service that merges fragmented intermediate shuffle data into large blocks and co-locates the merged blocks with the reduce tasks that read them. It scales to petabytes of data on clusters of thousands of nodes, both on-premises and in the cloud, delivering speedups of roughly 30% and reducing the tuning burden. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Min Shen
2. Ye Zhou
3. Chandni Singh
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 7,907 | Petabyte-Scale Row-Level Operations in Data Lakehouses | 2024 | VLDB | 4.6205839e-05 |
| 8,506 | New Query Optimization Techniques in the Spark Engine of Azure Synapse | 2022 | VLDB | 4.4957661e-05 |
| 9,155 | Towards Resource Efficiency: Practical Insights into Large-Scale Spark Workloads at ByteDance | 2024 | VLDB | 4.3849295e-05 |
| 9,699 | The Story of AWS Glue | 2023 | VLDB | 4.3018844e-05 |
| 10,491 | Intra-Query Runtime Elasticity for Cloud-Native Data Analysis | 2025 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 66 | Spark SQL: Relational Data Processing in Spark | 2015 | SIGMOD | 0.00061639801 |
| 1,071 | Starfish: A Self-tuning System for Big Data Analytics | 2011 | CIDR | 0.00014312777 |
| 4,248 | Hyper Dimension Shuffle: Efficient Data Repartition at Petabyte Scale in SCOPE | 2019 | VLDB | 6.3247927e-05 |