Clydesdale: Structured Data Processing on Hadoop
Summary: Clydesdale is a Hadoop-based prototype for structured data processing that achieves large performance gains without modifying MapReduce itself. By combining database techniques with Hadoop and exposing ClyQL, a Scala DSL for star joins, it runs star-schema workloads roughly 38x faster than Hive. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Andrey Balmin
2. Tim Kaldewey
3. Sandeep Tata
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,247 | Can the Elephants Handle the NoSQL Onslaught? | 2012 | VLDB | 7.3260831e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497 |
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 52 | Database Architecture Optimized for the new Bottleneck: Memory Access | 1999 | VLDB | 0.00066474881 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 714 | Adaptive Aggregation on Chip Multiprocessors | 2007 | VLDB | 0.00017730584 |
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 3,115 | Llama: Leveraging Columnar Storage for Scalable Join Processing in the MapReduce Framework | 2011 | SIGMOD | 7.543505e-05 |
| 3,208 | Column-Oriented Storage Techniques for MapReduce | 2011 | VLDB | 7.3781897e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 7,958 | CARTILAGE: Adding Flexibility to the Hadoop Skeleton | 2013 | SIGMOD | 4.613363e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 11,972 | Palette: Enabling Scalable Analytics for Big-Memory, Multicore Machines | 2014 | SIGMOD | 4.1945683e-05 |
| 2,127 | SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures | 2014 | VLDB | 9.4863172e-05 |
| 5,838 | HadoopDB in Action: Building Real World Applications | 2010 | SIGMOD | 5.3059032e-05 |
| 5,741 | A Demonstration of ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data | 2017 | VLDB | 5.3451775e-05 |
| 9,375 | Efficient Big Data Processing in Hadoop MapReduce | 2012 | VLDB | 4.347384e-05 |
| 9,692 | GHive: A Demonstration of GPU-Accelerated Query Processing in Apache Hive | 2022 | SIGMOD | 4.302852e-05 |
| 11,948 | Tutorial: SQL-on-Hadoop Systems | 2015 | VLDB | 4.1945683e-05 |