| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 329 | Accelerating Machine Learning Inference with Probabilistic Predicates | 2018 | SIGMOD | 0.00027249545 |
| 542 | Shark: SQL and Rich Analytics at Scale | 2013 | SIGMOD | 0.00020595648 |
| 660 | Large Graph Processing in the Cloud | 2010 | SIGMOD | 0.00018493984 |
| 711 | A Case for A Collaborative Query Management System | 2009 | CIDR | 0.00017751589 |
| 780 | Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience | 2009 | VLDB | 0.00016775082 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 886 | Fast Personalized PageRank on MapReduce | 2011 | SIGMOD | 0.00015597161 |
| 913 | Tenzing A SQL Implementation On The MapReduce Framework | 2011 | VLDB | 0.00015408131 |
| 947 | MRShare: Sharing Across Multiple Queries in MapReduce | 2010 | VLDB | 0.00015114576 |
| 960 | A Comparison of Join Algorithms for Log Processing in MapReduce | 2010 | SIGMOD | 0.00015012242 |
| 1,098 | Trill: A High-Performance Incremental Query Processor for Diverse Analytics | 2015 | VLDB | 0.00014114442 |
| 1,110 | Parallel Evaluation of Conjunctive Queries | 2011 | PODS | 0.00013968198 |
| 1,158 | Simulation of Database-Valued Markov Chains Using SimSQL | 2013 | SIGMOD | 0.0001361064 |
| 1,261 | Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce | 2013 | VLDB | 0.00012989236 |
| 1,265 | Jaql: A Scripting Language for Large Scale Semistructured Data Analysis | 2011 | VLDB | 0.00012947629 |
| 1,323 | Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters | 2016 | SIGMOD | 0.00012601997 |
| 1,355 | SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions | 2009 | VLDB | 0.00012404572 |
| 1,534 | PerfXplain: Debugging MapReduce Job Performance | 2012 | VLDB | 0.00011468393 |
| 1,574 | Approximate Query Processing: No Silver Bullet | 2017 | SIGMOD | 0.00011287495 |
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319 |
| 1,721 | Distributed Data-Parallel Computing Using a High-Level Programming Language | 2009 | SIGMOD | 0.00010762918 |
| 1,846 | Combining User Interaction, Speculative Query Execution and Sampling in the DICE System | 2014 | VLDB | 0.00010335419 |
| 1,873 | An Architecture for Compiling UDF-centric Workflows | 2015 | VLDB | 0.00010253002 |
| 2,083 | Towards a Learning Optimizer for Shared Clouds | 2019 | VLDB | 9.5834572e-05 |
| 2,172 | Spinning Fast Iterative Data Flows | 2012 | VLDB | 9.3706587e-05 |
| 2,249 | Orca: A Modular Query Optimizer Architecture for Big Data | 2014 | SIGMOD | 9.2034693e-05 |
| 2,413 | Automated Partitioning Design in Parallel Database Systems | 2011 | SIGMOD | 8.8672223e-05 |
| 2,418 | Tupleware: "Big" Data, Big Analytics, Small Clusters | 2015 | CIDR | 8.8556595e-05 |
| 2,476 | A Platform for Scalable One-Pass Analytics using MapReduce | 2011 | SIGMOD | 8.6960139e-05 |
| 2,545 | POLARIS: The Distributed SQL Engine in Azure Synapse | 2020 | VLDB | 8.5725413e-05 |
| 2,611 | Opening the Black Boxes in Data Flow Optimization | 2012 | VLDB | 8.4536967e-05 |
| 2,674 | Minimal MapReduce Algorithms | 2013 | SIGMOD | 8.3328645e-05 |
| 2,817 | Recurring Job Optimization in Scope | 2012 | SIGMOD | 8.0677653e-05 |
| 2,818 | Implicit Parallelism through Deep Language Embedding | 2015 | SIGMOD | 8.0665558e-05 |
| 3,038 | Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics | 2017 | SIGMOD | 7.6717218e-05 |
| 3,141 | ClusterJoin: A Similarity Joins Framework using Map-Reduce | 2014 | VLDB | 7.4829448e-05 |
| 3,348 | Lero: A Learning-to-Rank Query Optimizer | 2023 | VLDB | 7.1904529e-05 |
| 3,517 | Integrating Hadoop and Parallel DBMS | 2010 | SIGMOD | 7.0199423e-05 |
| 3,550 | Chi: A Scalable and Programmable Control Plane for Distributed Stream Processing Systems | 2018 | VLDB | 6.9843512e-05 |
| 3,625 | Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings | 2020 | SIGMOD | 6.9055212e-05 |
| 3,922 | Pushing Data-Induced Predicates Through Joins in Big-Data Clusters | 2020 | VLDB | 6.6291079e-05 |
| 4,061 | Advanced Partitioning Techniques for Massively Distributed Computation | 2012 | SIGMOD | 6.483587e-05 |
| 4,174 | Computation Reuse in Analytics Job Service at Microsoft | 2018 | SIGMOD | 6.3856219e-05 |
| 4,248 | Hyper Dimension Shuffle: Efficient Data Repartition at Petabyte Scale in SCOPE | 2019 | VLDB | 6.3247927e-05 |
| 4,689 | Algorithmic Aspects of Parallel Query Processing | 2018 | SIGMOD | 5.9980099e-05 |
| 4,690 | Deploying a Steered Query Optimizer in Production at Microsoft | 2022 | SIGMOD | 5.997226e-05 |