Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience
Summary: Pig provides SQL-like data manipulation on MapReduce: programs are written as explicit dataflows interleaved with UDFs and compiled to Hadoop jobs. The paper discusses the challenges involved and compares Pig's performance to hand-tuned MapReduce, showing productivity gains with modest overhead. (summarized by gpt-5-nano on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 33 of 33 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |
| 22 | SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets | 2008 | VLDB | 0.0008456613 |
| 317 | Distributed Query Processing In A Relational Data Base System | 1978 | SIGMOD | 0.00027980992 |
| 588 | Practical Skew Handling in Parallel Joins | 1992 | VLDB | 0.00019604754 |
| 2,035 | Generating Example Data for Dataflow Programs | 2009 | SIGMOD | 9.7149269e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 70 | Hive - A Warehousing Solution Over a Map-Reduce Framework | 2009 | VLDB | 0.00059533166 |
| 4,857 | The "Big Data" Ecosystem at LinkedIn | 2013 | SIGMOD | 5.8736144e-05 |
| 157 | HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads | 2009 | VLDB | 0.00040397359 |
| 2,476 | A Platform for Scalable One-Pass Analytics using MapReduce | 2011 | SIGMOD | 8.6960139e-05 |
| 3,601 | Large-Scale Machine Learning at Twitter | 2012 | SIGMOD | 6.9315087e-05 |
| 12,125 | ReStore: Reusing Results of MapReduce Jobs in Pig | 2012 | SIGMOD | 4.1945683e-05 |
| 4,425 | Nova: Continuous Pig/Hadoop Workflows | 2011 | SIGMOD | 6.198382e-05 |
| 2,205 | ReStore: Reusing Results of MapReduce Jobs | 2012 | VLDB | 9.2920002e-05 |
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |