Database Paper Browser

SketchML: Accelerating Distributed Machine Learning with Data Sketches

Summary: SketchML uses data sketches to compress the gradients exchanged in distributed SGD, targeting sparse gradients. Key ideas: bucketizing gradient values with a quantile sketch, a MinMaxSketch hash table that resolves collisions, and delta-binary encoding. The paper gives error bounds and reports speedups of up to 10x on Tencent workloads. (summarized by gpt-5-nano on Feb 09 2026)
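
To make two of the ideas named in the summary concrete, here is a minimal Python sketch, not the paper's implementation. It assumes a sparse gradient stored as sorted integer keys with float values. Exact sorted quantiles stand in for a real quantile sketch, the MinMaxSketch step is omitted, and the delta step only computes gaps between keys without packing them into fewer bytes. All function names are hypothetical.

```python
from bisect import bisect_right


def quantile_splits(values, num_buckets):
    """Exact quantile split points (a stand-in for a quantile sketch)."""
    ordered = sorted(values)
    n = len(ordered)
    # One split at each i/num_buckets quantile, for i = 1 .. num_buckets - 1.
    return [ordered[(i * n) // num_buckets] for i in range(1, num_buckets)]


def bucketize(values, splits):
    """Replace each value with the index of the bucket it falls in."""
    return [bisect_right(splits, v) for v in values]


def delta_encode(sorted_keys):
    """Keep the first key, then only the gaps between consecutive keys."""
    deltas = []
    prev = 0
    for key in sorted_keys:
        deltas.append(key - prev)
        prev = key
    return deltas


def delta_decode(deltas):
    """Rebuild the original keys by taking a running sum of the gaps."""
    keys = []
    total = 0
    for gap in deltas:
        total += gap
        keys.append(total)
    return keys


if __name__ == "__main__":
    keys = [3, 10, 11, 250, 1000]          # indices of nonzero gradient entries
    values = [-0.9, -0.1, 0.05, 0.2, 0.8]  # the corresponding gradient values

    splits = quantile_splits(values, num_buckets=2)
    print(bucketize(values, splits))       # small bucket indices instead of floats
    print(delta_encode(keys))              # small gaps instead of large keys
```

The intuition is that bucket indices and key gaps are small integers, so they need fewer bits than the original floats and keys. Bucketization is lossy, while the key deltas decode back exactly.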

Paper ID: 5540
Venue: SIGMOD
Year: 2018
Pagerank: 6.7455428e-05
Overall Rank: 3,808 | 73.51%
DOI: 10.1145/3183713.3196894

Incoming Citations (Sorted by Pagerank)

Showing 14 of 14 citing papers.

Rank | Citing Paper | Year | Venue | Pagerank
2,643 | Camel: Managing Data for Efficient Stream Learning | 2022 | SIGMOD | 8.384956e-05
2,791 | Towards Demystifying Serverless Machine Learning Training | 2021 | SIGMOD | 8.1206618e-05
3,506 | BlindFL: Vertical Federated Machine Learning without Peeking into Your Data | 2022 | SIGMOD | 7.0291192e-05
3,751 | BurstSketch: Finding Bursts in Data Streams | 2021 | SIGMOD | 6.7888099e-05
5,333 | Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce | 2021 | SIGMOD | 5.5656575e-05
6,738 | Agile and Accurate CTR Prediction Model Training for Massive-Scale Online Advertising Systems | 2021 | SIGMOD | 4.9452647e-05
7,536 | Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent | 2023 | VLDB | 4.7176331e-05
8,786 | AWARE: Workload-aware, Redundancy-exploiting Linear Algebra | 2023 | SIGMOD | 4.4521262e-05
8,829 | A Distributed System for Large-scale n-gram Language Models at Tencent | 2019 | VLDB | 4.4406886e-05
9,041 | TreeSensing: Linearly Compressing Sketches with Flexibility | 2023 | SIGMOD | 4.4039656e-05
9,966 | Towards Communication-efficient Vertical Federated Learning Training via Cache-enabled Local Updates | 2022 | VLDB | 4.2269436e-05
10,315 | CounterSnake: A lossless and generalized compression framework for diverse sketches | 2026 | VLDB | 4.1945683e-05
10,492 | Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization | 2025 | SIGMOD | 4.1945683e-05
11,364 | MinMax Sampling: A Near-optimal Global Summary for Aggregation in the Wide Area | 2022 | SIGMOD | 4.1945683e-05

Outgoing Citations (Sorted by Pagerank)

Showing 5 of 5 cited papers.

Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.

Rank | Cited Paper | Year | Venue | Pagerank
126 | Space-Efficient Online Computation of Quantile Summaries | 2001 | SIGMOD | 0.00044744986
1,044 | DimmWitted: A Study of Main-Memory Statistical Analytics | 2014 | VLDB | 0.00014475229
1,942 | Heterogeneity-aware Distributed Parameter Servers | 2017 | SIGMOD | 0.00010012691
4,439 | TencentRec: Real-time Stream Recommendation in Practice | 2015 | SIGMOD | 6.1885354e-05
11,795 | LDA*: A Robust and Large-scale Topic Modeling System | 2017 | VLDB | 4.1945683e-05

Semantically Similar Papers