Database Paper Browser


Split Query Processing in Polybase

Summary: Polybase uses split query processing to push SQL operators over HDFS-resident data into MapReduce jobs on Hadoop, with the pushdown decision made by the PDW optimizer. A cost-based planner weighs selectivity, data sizes, and co-location to decide whether to push an operator down; differences between SQL and Java semantics must be reconciled. (summarized by gpt-5-nano on Feb 09 2026)
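
The trade-off behind a cost-based pushdown decision can be illustrated with a toy model. The sketch below is not Polybase's cost model: the class `CostParams`, the functions `pushdown_cost`, `import_cost`, and `should_push_down`, and all rates are hypothetical, and co-location is left out. It only shows how selectivity and input size can tip the choice between filtering in Hadoop and importing the raw data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostParams:
    """Hypothetical throughput and overhead figures (not from the paper)."""
    hadoop_bytes_per_sec: float    # rate at which a MapReduce job filters HDFS data
    pdw_bytes_per_sec: float       # rate at which the warehouse filters imported data
    transfer_bytes_per_sec: float  # HDFS-to-warehouse transfer rate
    mr_startup_sec: float          # fixed cost of launching a MapReduce job


def pushdown_cost(input_bytes: float, selectivity: float, p: CostParams) -> float:
    """Estimated seconds to filter in Hadoop and ship only the surviving rows."""
    filter_time = input_bytes / p.hadoop_bytes_per_sec
    ship_time = selectivity * input_bytes / p.transfer_bytes_per_sec
    return p.mr_startup_sec + filter_time + ship_time


def import_cost(input_bytes: float, p: CostParams) -> float:
    """Estimated seconds to ship all rows and filter them in the warehouse."""
    ship_time = input_bytes / p.transfer_bytes_per_sec
    filter_time = input_bytes / p.pdw_bytes_per_sec
    return ship_time + filter_time


def should_push_down(input_bytes: float, selectivity: float, p: CostParams) -> bool:
    """Return True when the toy model estimates pushdown to be cheaper."""
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be in [0, 1]")
    return pushdown_cost(input_bytes, selectivity, p) < import_cost(input_bytes, p)


if __name__ == "__main__":
    params = CostParams(
        hadoop_bytes_per_sec=100e6,
        pdw_bytes_per_sec=400e6,
        transfer_bytes_per_sec=50e6,
        mr_startup_sec=30.0,
    )
    # Large input, highly selective predicate: little data survives the filter.
    print(should_push_down(100e9, 0.01, params))
    # Small input, weak predicate: the MapReduce startup cost dominates.
    print(should_push_down(1e9, 0.9, params))
```

Under these assumed rates, a selective predicate over a large file favors pushdown because little data crosses the network, while a small input or a non-selective predicate favors importing, since the fixed job startup cost is never recovered.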

Paper ID: 4642
Venue: SIGMOD
Year: 2013
Pagerank: 9.8824589e-05
Overall Rank: 1,977 | 86.25%
DOI: -

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 21 of 21 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
167 | The Snowflake Elastic Data Warehouse | 2016 | SIGMOD | 0.00039180521
1,800 | epiC: an Extensible and Scalable System for Processing Big Data | 2014 | VLDB | 0.00010512649
2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05
2,249 | Orca: A Modular Query Optimizer Architecture for Big Data | 2014 | SIGMOD | 9.2034693e-05
2,322 | Instant Loading for Main Memory Databases | 2013 | VLDB | 9.034874e-05
3,066 | HAWQ: A Massively Parallel Processing SQL Engine in Hadoop | 2014 | SIGMOD | 7.6221974e-05
3,265 | RHEEM: Enabling Cross-Platform Data Processing - May The Big Data Be With You! - | 2018 | VLDB | 7.3083672e-05
3,548 | Adaptive Query Processing on RAW Data | 2014 | VLDB | 6.9859242e-05
3,562 | MISO: Souping Up Big Data Query Processing with a Multistore System | 2014 | SIGMOD | 6.9694564e-05
3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05
4,326 | Fast Queries Over Heterogeneous Data Through Engine Customization | 2016 | VLDB | 6.288323e-05
5,301 | ReCache: Reactive Caching for Fast Analytics over Heterogeneous Data | 2018 | VLDB | 5.5790928e-05
6,089 | The CloudMdsQL Multistore System | 2016 | SIGMOD | 5.2175982e-05
6,242 | Helios: Hyperscale Indexing for the Cloud & Edge | 2020 | VLDB | 5.1408379e-05
6,407 | Just-In-Time Data Virtualization: Lightweight Data Management with ViDa | 2015 | CIDR | 5.076547e-05
7,918 | Indexing HDFS Data in PDW: Splitting the data from the index | 2014 | VLDB | 4.6170838e-05
9,607 | Polyglot Data Management: State of the Art & Open Challenges | 2022 | VLDB | 4.3177432e-05
9,894 | OceanRT: Real-Time Analytics over Large Temporal Data | 2014 | SIGMOD | 4.2602616e-05
10,482 | Fast and Scalable Data Transfer Across Data Systems | 2025 | SIGMOD | 4.1945683e-05
10,591 | Accio: Bolt-on Query Federation | 2025 | VLDB | 4.1945683e-05
12,005 | Design and Implementation of a Real-Time Interactive Analytics System for Large Spatio-Temporal Data | 2014 | VLDB | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 5 of 5 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers