Database Paper Browser

LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems

Summary: LIMA provides fine-grained lineage tracing and reuse in ML systems, overcoming the limits of coarse-grained, black-box approaches. Multi-level lineage traces, deduplication for loops and functions, and reuse across hierarchy levels enable low-overhead provenance with versioning, remain compatible with task parallelism and operator fusion, and deliver speedups of up to 12.4x. (summarized by gpt-5-nano on Feb 09 2026)

Paper ID: 6069
Venue: SIGMOD
Year: 2021
Pagerank: 5.9316087e-05
Overall Rank: 4,774 | 66.79%
DOI: 10.1145/3448016.3452788

Incoming Citations (Sorted by Pagerank)

Showing 15 of 15 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
7,306 | DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines | 2022 | CIDR | 4.7678574e-05
7,482 | Provenance-Enabled Explainable AI | 2024 | SIGMOD | 4.7180617e-05
7,656 | Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets | 2022 | SIGMOD | 4.6871575e-05
7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05
8,092 | Saga: A Scalable Framework for Optimizing Data Cleaning Pipelines for Machine Learning Applications | 2023 | SIGMOD | 4.587921e-05
8,514 | UPLIFT: Parallelization Strategies for Feature Transformations in Machine Learning Workloads | 2022 | VLDB | 4.4944285e-05
9,806 | The Image Calculator: 10x Faster Image-AI Inference by Replacing JPEG with Self-designing Storage Format | 2024 | SIGMOD | 4.2805224e-05
9,912 | ElasticNotebook: Enabling Live Migration for Computational Notebooks | 2024 | VLDB | 4.2565279e-05
10,252 | CAPS: Cost-Aware ML Pipeline Selection | 2026 | VLDB | 4.1945683e-05
10,291 | Morphing-based Compression for Data-centric ML Pipelines | 2026 | VLDB | 4.1945683e-05
10,419 | Unified Lineage System: Tracking Data Provenance at Scale | 2025 | SIGMOD | 4.1945683e-05
10,469 | Alsatian: Optimizing Model Search for Deep Transfer Learning | 2025 | SIGMOD | 4.1945683e-05
10,628 | CatDB: Data-catalog-guided, LLM-based Generation of Data-centric ML Pipelines | 2025 | VLDB | 4.1945683e-05
10,842 | ML-Asset Management: Curation, Discovery, and Utilization | 2025 | VLDB | 4.1945683e-05
11,339 | Redundancy Elimination in Distributed Matrix Computation | 2022 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 4 of 54 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Semantically Similar Papers