Gobblin: Unifying Data Ingestion for Hadoop
Summary: Gobblin unifies Hadoop data ingestion into a single, extensible framework. It offers out-of-the-box support for relational, NoSQL, streaming, REST, and file sources, and emphasizes generality, extensibility, operability, and end-to-end production metrics. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Lin Qiao
2. Yinan Li
3. Sahil Takiar
4. Ziyang Liu
5. Narasimha Veeramreddy
6. Min Tu
7. Ying Dai
8. Issac Buenrostro
9. Kapil Surlaker
10. Shirshanka Das
11. Chavdar Botev
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 11,802 | Query-able Kafka: An agile data analytics pipeline for mobile wireless networks | 2017 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 3 of 3 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 476 | Impala: A Modern, Open-Source SQL Engine for Hadoop | 2015 | CIDR | 0.00022226941 |
| 1,853 | On Brewing Fresh Espresso: LinkedIn’s Distributed Data Serving Platform | 2013 | SIGMOD | 0.00010320369 |
| 1,863 | Cheetah: A High Performance, Custom Data Warehouse on Top of MapReduce | 2010 | VLDB | 0.00010286531 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,973 | Apache Hive: From MapReduce to Enterprise-grade Big Data Warehousing | 2019 | SIGMOD | 6.5758017e-05 |
| 9,361 | An IDEA: An Ingestion Framework for Data Enrichment in AsterixDB | 2019 | VLDB | 4.3506168e-05 |
| 2,439 | CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop | 2011 | VLDB | 8.8190594e-05 |
| 6,123 | Data Ingestion for the Connected World | 2017 | CIDR | 5.1991194e-05 |
| 6,856 | Liquid: Unifying Nearline and Offline Big Data Integration | 2015 | CIDR | 4.9060615e-05 |
| 4,572 | The Unified Logging Infrastructure for Data Analytics at Twitter | 2012 | VLDB | 6.0760183e-05 |
| 8,108 | Execution Primitives for Scalable Joins and Aggregations in Map Reduce | 2014 | VLDB | 4.5846987e-05 |
| 70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166 |
| 2,658 | Data Warehousing and Analytics Infrastructure at Facebook | 2010 | SIGMOD | 8.3607429e-05 |
| 4,857 | The "Big Data" Ecosystem at LinkedIn | 2013 | SIGMOD | 5.8736144e-05 |