Indexing HDFS Data in PDW: Splitting the data from the index
Summary: Proposes building B+-tree indices inside an RDBMS (PDW) over data stored in HDFS, so that the index is kept separate from the data it covers. Shows that this split lets PDW process highly selective queries on Hadoop data efficiently by exploiting RDBMS indexing. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 2 of 2 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,891 | Slalom: Coasting Through Raw Data via Adaptive Partitioning and Indexing | 2017 | VLDB | 6.659442e-05 |
| 8,231 | FusionInsight LibrA: Huawei’s Enterprise Cloud Data Analytics Platform | 2018 | VLDB | 4.5539609e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 4 of 4 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 794 | Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing) | 2010 | VLDB | 0.00016605103 |
| 1,977 | Split Query Processing in Polybase | 2013 | SIGMOD | 9.8824589e-05 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 5,105 | Only Aggressive Elephants are Fast Elephants | 2012 | VLDB | 5.694494e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,347 | Rank Join Queries in NoSQL Databases | 2014 | VLDB | 4.3526718e-05 |
| 4,217 | Spatial Partitioning Techniques in SpatialHadoop | 2015 | VLDB | 6.3514771e-05 |
| 5,558 | A Hadoop Based Distributed Loading Approach to Parallel Data Warehouses | 2011 | SIGMOD | 5.4341353e-05 |
| 3,517 | Integrating Hadoop and Parallel DBMS | 2010 | SIGMOD | 7.0199423e-05 |
| 12,564 | Efficiently Processing Queries on Interval-and-Value Tuples in Relational Databases | 2005 | VLDB | 4.1945683e-05 |
| 7,263 | H2 RDF+: An Efficient Data Management System for Big RDF Graphs | 2014 | SIGMOD | 4.7851876e-05 |
| 773 | Multi-Dimensional Database Allocation for Parallel Data Warehouses | 2000 | VLDB | 0.00016870159 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 2,337 | Efficient Processing of Data Warehousing Queries in a Split Execution Environment | 2011 | SIGMOD | 9.0098186e-05 |
| 1,977 | Split Query Processing in Polybase | 2013 | SIGMOD | 9.8824589e-05 |