Hindsight Logging for Model Training
Summary: Proposes hindsight logging for model training, which enables log replay with low overhead. The flor tool suite combines background logging, adaptive checkpointing, and replay instrumentation, guided by ideas from database recovery; it yields roughly 7% overhead and fast replay. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Rolando Garcia
2. Eric Liu
3. Vikram Sreekanti
4. Bobby Yan
5. Anusha Dandamudi
6. Joseph E. Gonzalez
7. Joseph M. Hellerstein
8. Koushik Sen
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 9,118 | Towards Observability for Production Machine Learning Pipelines | 2022 | VLDB | 4.3928288e-05 |
| 9,378 | CHEX: Multiversion Replay with Ordered Checkpoints | 2022 | VLDB | 4.3463396e-05 |
| 9,912 | ElasticNotebook: Enabling Live Migration for Computational Notebooks | 2024 | VLDB | 4.2565279e-05 |
| 10,338 | Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle | 2025 | CIDR | 4.1945683e-05 |
| 11,313 | Towards Observability for Machine Learning Pipelines | 2022 | CIDR | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 26 | The Design Of The Postgres Storage System | 1987 | VLDB | 0.00082378685 |
| 419 | Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems | 2015 | SIGMOD | 0.00023720338 |
| 761 | Materialization Optimizations for Feature Selection Workloads | 2014 | SIGMOD | 0.00017053783 |
| 1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361 |
| 2,037 | OrpheusDB: Bolt-on Versioning for Relational Databases | 2017 | VLDB | 9.7120139e-05 |
| 2,152 | MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis | 2018 | SIGMOD | 9.4239787e-05 |
| 2,163 | Elastic Machine Learning Algorithms in Amazon SageMaker | 2020 | SIGMOD | 9.3949234e-05 |
| 2,269 | Ground: A Data Context Service | 2017 | CIDR | 9.147379e-05 |
| 2,430 | Decibel: The Relational Dataset Branching System | 2016 | VLDB | 8.8330417e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,847 | Towards Foundation Database Models | 2025 | CIDR | 4.4371897e-05 |
| 5,433 | "Amnesia" - A Selection of Machine Learning Models That Can Forget User Data Very Fast | 2020 | CIDR | 5.5051607e-05 |
| 4,003 | Data Platform for Machine Learning | 2019 | SIGMOD | 6.54347e-05 |
| 3,958 | MLog: Towards Declarative In-Database Machine Learning | 2017 | VLDB | 6.5897636e-05 |
| 7,138 | Ease.ml/ci and Ease.ml/meter in Action: Towards Data Management for Statistical Generalization | 2019 | VLDB | 4.8216981e-05 |
| 6,897 | PreLog: A Pre-trained Model for Log Analytics | 2024 | SIGMOD | 4.8925595e-05 |
| 2,456 | Production Machine Learning Pipelines: Empirical Analysis and Optimization Opportunities | 2021 | SIGMOD | 8.7733773e-05 |
| 11,313 | Towards Observability for Machine Learning Pipelines | 2022 | CIDR | 4.1945683e-05 |
| 9,118 | Towards Observability for Production Machine Learning Pipelines | 2022 | VLDB | 4.3928288e-05 |
| 10,338 | Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle | 2025 | CIDR | 4.1945683e-05 |