TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems
Summary: Introduces TPCx-AI, the first industry-standard end-to-end ML benchmark that models real ML pipelines (data integration, processing, training, inference) with both Python and Spark implementations. Novelty: unified structured+unstructured dataset, PB-scale data generator, and diverse representative workloads enabling fair, scalable system comparisons. (summarized by gpt-5-mini on Feb 09 2026)
Incoming Citations (Sorted by Pagerank)
Showing 7 of 7 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 5,567 | Optimizing Data Pipelines for Machine Learning in Feature Stores | 2023 | VLDB | 5.4305348e-05 |
| 6,378 | Mitigating the Impedance Mismatch between Prediction Query Execution and Database Engine | 2025 | SIGMOD | 5.0909804e-05 |
| 6,895 | Decentralized Actor Scheduling and Reference-based Storage in Xorbits: a Native Scalable Data Science Engine | 2025 | VLDB | 4.8925595e-05 |
| 9,236 | The Hopsworks Feature Store for Machine Learning | 2024 | SIGMOD | 4.3690661e-05 |
| 10,095 | NeurStore: Efficient In-database Deep Learning Model Management System | 2026 | SIGMOD | 4.1945683e-05 |
| 10,243 | TPCx-AI under the Microscope: A Benchmarking Debt Analysis | 2026 | VLDB | 4.1945683e-05 |
| 10,320 | ELT-Bench: An End-to-End Benchmark for Evaluating AI Agents on ELT Pipelines | 2026 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 7 of 7 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 659 | The Making of TPC-DS | 2006 | VLDB | 0.00018500853 |
| 1,420 | Data Management Challenges in Production Machine Learning | 2017 | SIGMOD | 0.00012057956 |
| 1,727 | BigBench: Towards an Industry Standard Benchmark for Big Data Analytics | 2013 | SIGMOD | 0.00010740936 |
| 2,456 | Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities | 2021 | SIGMOD | 8.7733773e-05 |
| 3,948 | A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics | 2018 | VLDB | 6.5959084e-05 |
| 5,114 | TPC-DI: The First Industry Benchmark for Data Integration | 2014 | VLDB | 5.6863051e-05 |
| 7,420 | MLBench: Benchmarking Machine Learning Services Against Human Experts | 2018 | VLDB | 4.7347751e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,951 | Why You Should Run TPC-DS: A Workload Analysis | 2007 | VLDB | 6.5953162e-05 |
| 10,707 | PBench: Workload Synthesizer with Real Statistics for Cloud Analytics Benchmarking | 2025 | VLDB | 4.1945683e-05 |
| 9,477 | Revisiting Graph Analytics Benchmark | 2025 | SIGMOD | 4.3341665e-05 |
| 1,727 | BigBench: Towards an Industry Standard Benchmark for Big Data Analytics | 2013 | SIGMOD | 0.00010740936 |
| 9,364 | FEBench: A Benchmark for Real-Time Relational Data Feature Extraction | 2023 | VLDB | 4.3502487e-05 |
| 5,114 | TPC-DI: The First Industry Benchmark for Data Integration | 2014 | VLDB | 5.6863051e-05 |
| 3,455 | A Comparison of Platforms for Implementing and Running Very Large Scale Machine Learning Algorithms | 2014 | SIGMOD | 7.0771839e-05 |
| 7,420 | MLBench: Benchmarking Machine Learning Services Against Human Experts | 2018 | VLDB | 4.7347751e-05 |
| 3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05 |
| 10,243 | TPCx-AI under the Microscope: A Benchmarking Debt Analysis | 2026 | VLDB | 4.1945683e-05 |