Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework
Summary: Llama uses columnar storage, vertically partitioned into correlation groups, to enable scalable join processing on a DFS-based MapReduce engine. The paper introduces a new join algorithm, and a TPC-H evaluation on EC2 shows faster data loading and better query performance than Hive. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Yuting Lin
2. Divyakant Agrawal
3. Chun Chen
4. Beng Chin Ooi
5. Sai Wu
Incoming Citations (Sorted by Pagerank)
Showing 11 of 11 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,674 | Minimal MapReduce Algorithms | 2013 | SIGMOD | 8.3328645e-05 |
| 2,998 | Major Technical Advancements in Apache Hive | 2014 | SIGMOD | 7.753765e-05 |
| 3,129 | Scalable Big Graph Processing in MapReduce | 2014 | SIGMOD | 7.5008242e-05 |
| 3,601 | Large-Scale Machine Learning at Twitter | 2012 | SIGMOD | 6.9315087e-05 |
| 4,572 | The Unified Logging Infrastructure for Data Analytics at Twitter | 2012 | VLDB | 6.0760183e-05 |
| 4,573 | Clydesdale: Structured Data Processing on Hadoop | 2012 | SIGMOD | 6.0753788e-05 |
| 6,802 | Understanding Insights into the Basic Structure and Essential Issues of Table Placement Methods in Clusters | 2013 | VLDB | 4.9226626e-05 |
| 7,958 | CARTILAGE: Adding Flexibility to the Hadoop Skeleton | 2013 | SIGMOD | 4.613363e-05 |
| 9,347 | Rank Join Queries in NoSQL Databases | 2014 | VLDB | 4.3526718e-05 |
| 9,375 | Efficient Big Data Processing in Hadoop MapReduce | 2012 | VLDB | 4.347384e-05 |
| 12,006 | YZStack: Provisioning Customizable Solution for Big Data | 2014 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 11 of 11 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 52 | Database Architecture Optimized for the new Bottleneck: Memory Access | 1999 | VLDB | 0.00066474881 |
| 80 | Weaving Relations for Cache Performance | 2001 | VLDB | 0.00055721729 |
| 109 | Dremel: Interactive Analysis of Web-Scale Datasets | 2010 | VLDB | 0.00048186983 |
| 123 | A Decomposition Storage Model | 1985 | SIGMOD | 0.00045255007 |
| 131 | Integrating Compression and Execution in Column-Oriented Database Systems | 2006 | SIGMOD | 0.0004370331 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 960 | A Comparison of Join Algorithms for Log Processing in MapReduce | 2010 | SIGMOD | 0.00015012242 |
| 1,111 | Sybase IQ Multiplex – Designed For Analytics | 2004 | VLDB | 0.00013936696 |
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,615 | The Performance of MapReduce: An In-depth Study | 2010 | VLDB | 0.00011132319 |
| 960 | A Comparison of Join Algorithms for Log Processing in MapReduce | 2010 | SIGMOD | 0.00015012242 |
| 3,141 | ClusterJoin: A Similarity Joins Framework using Map-Reduce | 2014 | VLDB | 7.4829448e-05 |
| 9,347 | Rank Join Queries in NoSQL Databases | 2014 | VLDB | 4.3526718e-05 |
| 1,863 | Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce | 2010 | VLDB | 0.00010286531 |
| 11,890 | Let's Rethink Join Optimization in Distributed Systems | 2015 | CIDR | 4.1945683e-05 |
| 1,206 | Rack-Scale In-Memory Join Processing using RDMA | 2015 | SIGMOD | 0.00013281657 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 3,208 | Column-Oriented Storage Techniques for MapReduce | 2011 | VLDB | 7.3781897e-05 |