Privacy and Accuracy-Aware AI/ML Model Deduplication
Summary: Formalizes deduplication of DP-trained models, combining privacy budgets with accuracy guarantees. Greedy base-model assignment and private validation based on the Sparse Vector Technique curb storage and privacy costs, yielding up to 35x compression and a 43x speedup.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 7258
- Venue: SIGMOD
- Year: 2025
- Pagerank: 4.1945683e-05
- Overall Rank: 10,499 | 26.97%
- DOI: 10.1145/3725340
Incoming Non-self Citations Over Time
No non-self incoming citations found for this paper in this database.
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 25 of 25 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 67 | The Merge/Purge Problem for Large Databases | 1995 | SIGMOD | 0.00061348205 |
| 83 | Privacy Integrated Queries: An Extensible Platform for Privacy-Preserving Data Analysis | 2009 | SIGMOD | 0.00053933811 |
| 140 | The MADlib Analytics Library or MAD Skills, the SQL | 2012 | VLDB | 0.00042270404 |
| 280 | Eliminating Fuzzy Duplicates in Data Warehouses | 2002 | VLDB | 0.00029113044 |
| 557 | SystemML: Declarative Machine Learning on Spark | 2016 | VLDB | 0.00020197988 |
| 754 | Distributed Representations of Tuples for Entity Resolution | 2018 | VLDB | 0.00017117211 |
| 1,167 | Learning Generalized Linear Models Over Normalized Data | 2015 | SIGMOD | 0.00013547713 |
| 1,234 | Ed-Join: An Efficient Algorithm for Similarity Joins With Edit Distance Constraints | 2008 | VLDB | 0.00013122499 |
| 1,279 | Towards Linear Algebra over Normalized Data | 2017 | VLDB | 0.00012868394 |
| 1,771 | On Arbitrage-free Pricing for General Data Queries | 2014 | VLDB | 0.00010617356 |
| 1,891 | Towards Model-based Pricing for Machine Learning in a Data Marketplace | 2019 | SIGMOD | 0.00010194092 |
| 2,152 | MISTIQUE: A System to Store and Query Model Intermediates for Model Diagnosis | 2018 | SIGMOD | 9.4239787e-05 |
| 2,231 | Dedoop: Efficient Deduplication with Hadoop | 2012 | VLDB | 9.2304499e-05 |
| 2,758 | Understanding the Sparse Vector Technique for Differential Privacy | 2017 | VLDB | 8.1653216e-05 |
| 3,528 | Distributed Data Deduplication | 2016 | VLDB | 7.0066139e-05 |
| 3,640 | Deep Learning for Blocking in Entity Matching: A Design Space Exploration | 2021 | VLDB | 6.8891671e-05 |
| 3,836 | Dealer: An End-to-End Model Marketplace with Differential Privacy | 2021 | VLDB | 6.7153977e-05 |
| 4,409 | Declarative Recursive Computation on an RDBMS | 2019 | VLDB | 6.2104034e-05 |
| 5,821 | Tensor Relational Algebra for Distributed Machine Learning System Design | 2021 | VLDB | 5.3134851e-05 |
| 6,380 | SmartLite: A DBMS-based Serving System for DNN Inference in Resource-constrained Environments | 2024 | VLDB | 5.0893219e-05 |
| 7,061 | Serving Deep Learning Models with Deduplication from Relational Databases | 2022 | VLDB | 4.8463881e-05 |
| 7,417 | DProvDB: Differentially Private Query Processing with Multi-Analyst Provenance | 2023 | SIGMOD | 4.7355114e-05 |
| 7,797 | Quantifying identifiability to choose and audit epsilon in differentially private deep learning | 2021 | VLDB | 4.6482625e-05 |
| 8,002 | Pangea: Monolithic Distributed Storage for Data Analytics | 2019 | VLDB | 4.6088289e-05 |
| 9,708 | DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and Release | 2024 | VLDB | 4.299267e-05 |
Semantically Similar Papers