BigBench: Towards an Industry Standard Benchmark for Big Data Analytics
Summary: BigBench is a proposed industry-standard, end-to-end benchmark for big data analytics. It extends TPC-DS with semi-structured and unstructured data (clicks, reviews) and adds a scalable data generator and a multi-dimensional workload. Feasibility is shown on Teradata Aster with 200 GB of data. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Ahmad Ghazal
2. Tilmann Rabl
3. Minqing Hu
4. Francois Raab
5. Meikel Poess
6. Alain Crolotte
7. Hans-Arno Jacobsen
Incoming Citations (Sorted by Pagerank)
Showing 16 of 16 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 8 of 8 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Overall Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3 | Pig Latin: A Not-So-Foreign Language for Data Processing | 2008 | SIGMOD | 0.0024183614 |
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 98 | XMark: A Benchmark for XML Data Management | 2002 | VLDB | 0.00050023808 |
| 659 | The Making of TPC-DS | 2006 | VLDB | 0.00018500853 |
| 675 | The OO7 Benchmark | 1993 | SIGMOD | 0.00018316592 |
| 1,355 | SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions | 2009 | VLDB | 0.00012404572 |
| 3,556 | Solving Big Data Challenges for Enterprise Application Performance Management | 2012 | VLDB | 6.9770145e-05 |
| 3,951 | Why You Should Run TPC-DS: A Workload Analysis | 2007 | VLDB | 6.5953162e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,623 | GenBase: A Complex Analytics Genomics Benchmark | 2014 | SIGMOD | 8.4374366e-05 |
| 42 | A Comparison of Approaches to Large-Scale Data Analysis | 2009 | SIGMOD | 0.00073498298 |
| 4,581 | Beyond Macrobenchmarks: Microbenchmark-based Graph Database Evaluation | 2019 | VLDB | 6.0703328e-05 |
| 340 | OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases | 2014 | VLDB | 0.00026841628 |
| 4,517 | Generating Databases for Query Workloads | 2010 | VLDB | 6.1178732e-05 |
| 10,707 | PBench: Workload Synthesizer with Real Statistics for Cloud Analytics Benchmarking | 2025 | VLDB | 4.1945683e-05 |
| 6,234 | Just can't get enough - Synthesizing Big Data | 2015 | SIGMOD | 5.1451686e-05 |
| 7,892 | M2Bench: A Database Benchmark for Multi-Model Analytic Workloads | 2023 | VLDB | 4.6245179e-05 |
| 2,129 | IDEBench: A Benchmark for Interactive Data Exploration | 2020 | SIGMOD | 9.480002e-05 |
| 9,394 | BigVectorBench: Heterogeneous Data Embedding and Compound Queries are Essential in Evaluating Vector Databases | 2025 | VLDB | 4.3441378e-05 |