Split Query Processing in Polybase
Summary: Polybase uses split query processing to push SQL operators over HDFS data into MapReduce jobs on Hadoop, with the PDW optimizer choosing the plan. A cost-based planner weighs selectivity, data sizes, and co-location to decide what to push down; differences between SQL and Java semantics must be reconciled.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 4642
- Venue: SIGMOD
- Year: 2013
- Pagerank: 9.8824589e-05
- Overall Rank: 1,977 | 86.25%
- DOI:
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 21 of 21 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521 |
| 1,800 | epiC: an Extensible and Scalable System for Processing Big Data | 2014 | VLDB | 0.00010512649 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
| 2,249 | Orca: A Modular Query Optimizer Architecture for Big Data | 2014 | SIGMOD | 9.2034693e-05 |
| 2,322 | Instant Loading for Main Memory Databases | 2013 | VLDB | 9.034874e-05 |
| 3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 7.6221974e-05 |
| 3,265 | RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! - | 2018 | VLDB | 7.3083672e-05 |
| 3,548 | Adaptive Query Processing on RAW Data | 2014 | VLDB | 6.9859242e-05 |
| 3,562 | MISO: Souping Up Big Data Query Processing with a Multistore System | 2014 | SIGMOD | 6.9694564e-05 |
| 3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05 |
| 4,326 | Fast Queries Over Heterogeneous Data Through Engine Customization | 2016 | VLDB | 6.288323e-05 |
| 5,301 | ReCache: Reactive Caching for Fast Analytics over Heterogeneous Data | 2018 | VLDB | 5.5790928e-05 |
| 6,089 | The CloudMdsQL Multistore System | 2016 | SIGMOD | 5.2175982e-05 |
| 6,242 | Helios: Hyperscale Indexing for the Cloud & Edge | 2020 | VLDB | 5.1408379e-05 |
| 6,407 | Just-In-Time Data Virtualization: Lightweight Data Management with ViDa | 2015 | CIDR | 5.076547e-05 |
| 7,918 | Indexing HDFS Data in PDW: Splitting the data from the index | 2014 | VLDB | 4.6170838e-05 |
| 9,607 | Polyglot Data Management: State of the Art & Open Challenges | 2022 | VLDB | 4.3177432e-05 |
| 9,894 | OceanRT: Real-Time Analytics over Large Temporal Data | 2014 | SIGMOD | 4.2602616e-05 |
| 10,482 | Fast and Scalable Data Transfer Across Data Systems | 2025 | SIGMOD | 4.1945683e-05 |
| 10,591 | Accio: Bolt-on Query Federation | 2025 | VLDB | 4.1945683e-05 |
| 12,005 | Design and Implementation of a Real-Time Interactive Analytics System for Large Spatio-Temporal Data | 2014 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,690 | Integration of Large-Scale Data Processing Systems and Traditional Parallel Database Technology | 2019 | VLDB | 4.1945683e-05 |
| 7,205 | Unified Query Optimization in the Fabric Data Warehouse | 2024 | SIGMOD | 4.8014977e-05 |
| 9,957 | How to Optimize SQL Queries? A Comparison Between Split, Holistic, and Hybrid Approaches | 2025 | VLDB | 4.2373024e-05 |
| 2,413 | Automated Partitioning Design in Parallel Database Systems | 2011 | SIGMOD | 8.8672223e-05 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 5,014 | Dynamically Optimizing Queries over Large Scale Data Platforms | 2014 | SIGMOD | 5.7586174e-05 |
| 7,918 | Indexing HDFS Data in PDW: Splitting the data from the index | 2014 | VLDB | 4.6170838e-05 |
| 2,241 | Query Optimization in Microsoft SQL Server PDW | 2012 | SIGMOD | 9.2191212e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |