New Query Optimization Techniques in the Spark Engine of Azure Synapse
Summary: A novel exchange placement technique in the Spark engine of Azure Synapse reduces data shuffles and enables reuse of exchanges across multiple consumers. Push-downs (aggregates, semi-joins) and peephole optimizations target scale-out stateful operators, achieving a 1.8x speedup on TPC-DS over Spark 3.0.1. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Abhishek Modi
2. Kaushik Rajan
3. Srinivas Thimmaiah
4. Prakhar Jain
5. Swinky Mann
6. Ayushi Agarwal
7. Ajith Shetty
8. Shahid K I
9. Ashit Gosalia
10. Partho Sarthi
Incoming Citations (Sorted by Pagerank)
Showing 4 of 4 citing papers.
| Overall Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,023 | GenRewrite: Query Rewriting via Large Language Models | 2026 | SIGMOD | 5.75363e-05 |
| 10,121 | TQEx: Tensor-based Query Engine Enhanced by Bridging the Gap | 2026 | SIGMOD | 4.1945683e-05 |
| 10,749 | Scaling GPU-Accelerated Databases beyond GPU Memory Size | 2025 | VLDB | 4.1945683e-05 |
| 11,267 | Anser: Adaptive Information Sharing Framework of AnalyticDB | 2023 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 14 of 14 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,124 | Dynamic Speculative Optimizations for SQL Compilation in Apache Spark | 2020 | VLDB | 4.391961e-05 |
| 6,209 | AutoExecutor: Predictive Parallelism for Spark SQL Queries | 2021 | VLDB | 5.1565972e-05 |
| 11,197 | QaaD (Query-as-a-Data): Scalable Execution of Massive Number of Small Queries in Spark | 2023 | SIGMOD | 4.1945683e-05 |
| 2,241 | Query Optimization in Microsoft SQL Server PDW | 2012 | SIGMOD | 9.2191212e-05 |
| 5,014 | Dynamically Optimizing Queries over Large Scale Data Platforms | 2014 | SIGMOD | 5.7586174e-05 |
| 6,673 | Incorporating Super-Operators in Big-Data Query Optimizers | 2020 | VLDB | 4.966799e-05 |
| 8,197 | SparkCruise: Workload Optimization in Managed Spark Clusters at Microsoft | 2021 | VLDB | 4.5607121e-05 |
| 9,305 | Parallelizing Query Optimization on Shared-Nothing Architectures | 2016 | VLDB | 4.3577129e-05 |
| 5,297 | Continuous Cloud-Scale Query Optimization and Processing | 2013 | VLDB | 5.5801669e-05 |
| 8,617 | A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning | 2024 | VLDB | 4.4846425e-05 |