Database Paper Browser

Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities

Summary: Analyzes the provenance graphs of 3,000 production ML pipelines at Google, covering 450k trainings, to characterize pipeline lifespan, topology, and complexity. Introduces model graphlets, a data model for repeated components, and identifies optimization opportunities: pruning wasted computation can cut costs by ~50% without delaying deployment. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6255
Venue: SIGMOD
Year: 2021
Pagerank: 8.7733773e-05
Overall Rank: 2,456 | 82.92%
DOI: 10.1145/3448016.3457566

Incoming Citations (Sorted by Pagerank)

Showing 12 of 12 citing papers.

Outgoing Citations (Sorted by Pagerank)

Showing 14 of 14 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
