TPCx-AI under the Microscope: A Benchmarking Debt Analysis
Summary: Dissects TPCx-AI for "benchmarking debt": divergences between the kit and the specification, data errors, weak metrics, and workload artifacts that distort what is actually being measured. Shows that these issues can skew training and serving by 350x and 800x, respectively, and that fixing them yields up to 3.8x higher end-to-end throughput. (summarized by gpt-5.4-mini on Apr 12 2026)
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Authors
1. Ilin Tolovski
2. Philipp Hildebrandt
3. Khuzaima Daudjee
4. Tilmann Rabl
Incoming Citations (Sorted by Pagerank)
Showing 0 of 0 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 5 of 5 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 2,916 | Quantifying TPC-H Choke Points and Their Optimizations | 2020 | VLDB | 7.9068048e-05 |
| 5,567 | Optimizing Data Pipelines for Machine Learning in Feature Stores | 2023 | VLDB | 5.4305348e-05 |
| 5,605 | TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems | 2023 | VLDB | 5.4142007e-05 |
| 6,378 | Mitigating the Impedance Mismatch between Prediction Query Execution and Database Engine | 2025 | SIGMOD | 5.0909804e-05 |
| 6,895 | Decentralized Actor Scheduling and Reference-based Storage in Xorbits: a Native Scalable Data Science Engine | 2025 | VLDB | 4.8925595e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 10,707 | PBench: Workload Synthesizer with Real Statistics for Cloud Analytics Benchmarking | 2025 | VLDB | 4.1945683e-05 |
| 340 | OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases | 2014 | VLDB | 2.6841628e-04 |
| 14,253 | Database System Performance Measurement | 1986 | SIGMOD | - |
| 4,517 | Generating Databases for Query Workloads | 2010 | VLDB | 6.1178732e-05 |
| 3,951 | Why You Should Run TPC-DS: A Workload Analysis | 2007 | VLDB | 6.5953162e-05 |
| 9,364 | FEBench: A Benchmark for Real-Time Relational Data Feature Extraction | 2023 | VLDB | 4.3502487e-05 |
| 3,254 | Query Processing on Tensor Computation Runtimes | 2022 | VLDB | 7.3161051e-05 |
| 3,178 | Why TPC Is Not Enough: An Analysis of the Amazon Redshift Fleet | 2024 | VLDB | 7.4325992e-05 |
| 8,624 | A Study of Database Performance Sensitivity to Experiment Settings | 2022 | VLDB | 4.483049e-05 |
| 5,605 | TPCx-AI - An Industry Standard Benchmark for Artificial Intelligence and Machine Learning Systems | 2023 | VLDB | 5.4142007e-05 |