Database Paper Browser


Cerebro: A Data System for Optimized Deep Learning Model Selection

Summary: Cerebro is a data system for optimized deep-learning model selection that improves throughput and reproducibility at lower cost. Its model hopper parallelism, a hybrid of task-parallel and data-parallel SGD, yields 3–10x speedups and memory and network savings across varied resource settings. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 12106
Venue: VLDB
Year: 2020
Pagerank: 0.00018195476
Overall Rank: 683 | 95.26%
DOI: 10.14778/3407790.3407816

Incoming Non-self Citations Over Time

Authors

Incoming Citations (Sorted by Pagerank)

Showing 26 of 26 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,504 | Analyzing and Mitigating Data Stalls in DNN Training | 2021 | VLDB | 0.00011642333
1,940 | SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging | 2021 | SIGMOD | 0.00010020173
2,839 | VolcanoML: Speeding up End-to-End AutoML via Scalable Search Space Decomposition | 2021 | VLDB | 8.0378978e-05
4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05
4,957 | Doing More with Less: Characterizing Dataset Downsampling for AutoML | 2021 | VLDB | 5.8035715e-05
5,084 | In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle | 2022 | SIGMOD | 5.7091191e-05
6,377 | Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | 2023 | VLDB | 5.0911095e-05
6,884 | Lotan: Bridging the Gap between GNNs and Scalable Graph Analytics Engines | 2023 | VLDB | 4.8955332e-05
7,656 | Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets | 2022 | SIGMOD | 4.6871575e-05
8,092 | Saga: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications | 2023 | SIGMOD | 4.587921e-05
8,182 | SHiFT: An Efficient, Flexible Search Engine for Transfer Learning | 2023 | VLDB | 4.5659133e-05
8,514 | UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads | 2022 | VLDB | 4.4944285e-05
8,735 | TensorSocket: Shared Data Loading for Deep Learning Training | 2026 | SIGMOD | 4.456315e-05
8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05
9,192 | Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale | 2022 | VLDB | 4.3765131e-05
9,222 | Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning | 2021 | VLDB | 4.3698672e-05
9,223 | Intermittent Human-in-the-Loop Model Selection using Cerebro: A Demonstration | 2021 | VLDB | 4.3698672e-05
9,265 | COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression | 2022 | VLDB | 4.3667558e-05
9,806 | The Image Calculator: 10x Faster Image-AI Inference by Replacing JPEG with Self-designing Storage Format | 2024 | SIGMOD | 4.2805224e-05
10,842 | ML-Asset Management: Curation, Discovery, and Utilization | 2025 | VLDB | 4.1945683e-05
10,998 | Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems | 2024 | VLDB | 4.1945683e-05
11,339 | Redundancy Elimination in Distributed Matrix Computation | 2022 | SIGMOD | 4.1945683e-05
11,431 | Ease.ML: A Lifecycle Management System for MLDev and MLOps | 2021 | CIDR | 4.1945683e-05
11,447 | Grouped Learning: Group-By Model Selection Workloads | 2021 | SIGMOD | 4.1945683e-05
13,171 | Reimagining Deep Learning Systems Through the Lens of Data Systems | 2024 | VLDB | -
13,271 | Errata for “Cerebro: A Data System for Optimized Deep Learning Model Selection” | 2021 | VLDB | -

Outgoing Citations (Sorted by Pagerank)

Showing 14 of 14 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.


Semantically Similar Papers