Database Paper Browser


MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis

Summary: MISTIQUE stores and queries model intermediates for ML model diagnosis, covering both traditional ML pipelines and deep neural networks. For each query it chooses between re-running the model and reusing stored intermediates, and it applies quantization, summarization, and deduplication to cut storage by up to 110× and speed up queries by up to 390× (ML) / 210× (DL). (summarized by gpt-5-nano on Feb 09 2026)
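
The two ideas in the summary (a per-query choice between re-running and reusing, and lossy compression of stored intermediates) can be illustrated with a minimal sketch. This is not MISTIQUE's actual cost model or storage format: the `Intermediate` class, the `choose_plan` and `quantize_8bit` functions, and the throughput parameter are hypothetical names introduced here, and the cost estimate is a deliberately simple assumption (read time = bytes / throughput).

```python
from dataclasses import dataclass


@dataclass
class Intermediate:
    """Hypothetical description of one stored model intermediate."""
    name: str
    stored_bytes: int      # size of the stored (compressed) copy
    rerun_seconds: float   # estimated time to recompute it from the model


def choose_plan(item: Intermediate, read_bytes_per_second: float) -> str:
    """Return 'read' if loading the stored copy is estimated cheaper, else 'rerun'."""
    read_seconds = item.stored_bytes / read_bytes_per_second
    return "read" if read_seconds < item.rerun_seconds else "rerun"


def quantize_8bit(values: list[float]) -> tuple[list[int], float, float]:
    """Map floats onto 256 evenly spaced levels; return codes plus (lo, step)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all values are equal
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes: list[int], lo: float, step: float) -> list[float]:
    """Approximately invert quantize_8bit; error per value is at most step / 2."""
    return [lo + c * step for c in codes]
```

In this sketch a small, expensive-to-recompute intermediate is read back from storage, while a very large, cheap-to-recompute one is re-run; quantization trades a bounded reconstruction error for one byte per value.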

Paper ID: 5580
Venue: SIGMOD
Year: 2018
Pagerank: 9.4239787e-05
Overall Rank: 2,152 | 85.04%
DOI: 10.1145/3183713.3196934

Authors

Incoming Citations (Sorted by Pagerank)

Showing 28 of 28 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
1,666 | HELIX: Holistic Optimization for Accelerating Iterative Machine Learning | 2019 | VLDB | 0.0001096361
2,122 | SystemDS: A Declarative Machine Learning System for the End-to-End Data Science Lifecycle | 2020 | CIDR | 9.4989076e-05
3,995 | How Large Language Models Will Disrupt Data Management | 2023 | VLDB | 6.5513237e-05
4,774 | LIMA: Fine-grained Lineage Tracing and Reuse in Machine Learning Systems | 2021 | SIGMOD | 5.9316087e-05
5,684 | Dagger: A Data (not code) Debugger | 2020 | CIDR | 5.3720749e-05
6,000 | DeepEverest: Accelerating Declarative Top-K Queries for Deep Neural Network Interpretation | 2022 | VLDB | 5.2415551e-05
6,053 | Optimizing Machine Learning Workloads in Collaborative Environments | 2020 | SIGMOD | 5.2326838e-05
6,247 | Optimizing In-memory Database Engine for AI-powered On-line Decision Augmentation Using Persistent Memory | 2021 | VLDB | 5.1389201e-05
6,291 | Lightweight Inspection of Data Preprocessing in Native Machine Learning Pipelines | 2021 | CIDR | 5.1269764e-05
6,330 | Efficient Construction of Approximate Ad-Hoc ML models Through Materialization and Reuse | 2018 | VLDB | 5.1077416e-05
6,373 | DeepBase: Deep Inspection of Neural Networks | 2019 | SIGMOD | 5.0929326e-05
6,469 | Materialization and Reuse Optimizations for Production Data Science Pipelines | 2022 | SIGMOD | 5.0519488e-05
6,733 | Hindsight Logging for Model Training | 2021 | VLDB | 4.9467666e-05
7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05
7,656 | Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets | 2022 | SIGMOD | 4.6871575e-05
7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05
8,163 | Capturing and Querying Fine-grained Provenance of Preprocessing Pipelines in Data Science | 2021 | VLDB | 4.5723431e-05
8,346 | Deep Learning: Systems and Responsibility | 2021 | SIGMOD | 4.5420668e-05
9,118 | Towards Observability for Production Machine Learning Pipelines | 2022 | VLDB | 4.3928288e-05
9,306 | Debugging Large-Scale Data Science Pipelines using Dagger | 2020 | VLDB | 4.3572942e-05
9,408 | Experimental Analysis of Large-scale Learnable Vector Storage Compression | 2024 | VLDB | 4.3441378e-05
9,912 | ElasticNotebook: Enabling Live Migration for Computational Notebooks | 2024 | VLDB | 4.2565279e-05
10,252 | CAPS: Cost-Aware ML Pipeline Selection | 2026 | VLDB | 4.1945683e-05
10,338 | Flow with FlorDB: Incremental Context Maintenance for the Machine Learning Lifecycle | 2025 | CIDR | 4.1945683e-05
10,499 | Privacy and Accuracy-Aware AI/ML Model Deduplication | 2025 | SIGMOD | 4.1945683e-05
11,008 | MetaStore: Analyzing Deep Learning Meta-Data at Scale | 2024 | VLDB | 4.1945683e-05
11,313 | Towards Observability for Machine Learning Pipelines | 2022 | CIDR | 4.1945683e-05
13,300 | DEEM 2019: Workshop on Data Management for End-to-End Machine Learning | 2019 | SIGMOD | -

Outgoing Citations (Sorted by Pagerank)

Showing 10 of 10 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
21 | C-Store: A Column-oriented DBMS | 2005 | VLDB | 0.00086087497
101 | ULDBs: Databases with Uncertainty and Lineage | 2006 | VLDB | 0.0004955674
734 | The TileDB Array Data Storage Manager | 2017 | VLDB | 0.00017455248
1,413 | VisTrails: Visualization meets Data Management | 2006 | SIGMOD | 0.00012121257
1,565 | Principles of Dataset Versioning: Exploring the Recreation/Storage Tradeoff | 2015 | VLDB | 0.00011345567
1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05
2,027 | Titian: Data Provenance Support in Spark | 2016 | VLDB | 9.7437067e-05
2,037 | OrpheusDB: Bolt-on Versioning for Relational Databases | 2017 | VLDB | 9.7120139e-05
2,430 | Decibel: The Relational Dataset Branching System | 2016 | VLDB | 8.8330417e-05
3,347 | Collaborative Data Analytics with DataHub | 2015 | VLDB | 7.1921364e-05

Semantically Similar Papers