Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent
Summary: Proposes tuple-oriented compression (TOC), a lossless, LZW-inspired scheme for mini-batch stochastic gradient descent (MGD) that preserves tuple boundaries within mini-batches. Matrix operations execute directly on the TOC-compressed data, so no decompression is needed, and the paper reports up to 51x compression and up to 10.2x speedups on MGD workloads. (summarized by gpt-5-nano on Feb 09 2026)
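To make the two ideas in the summary concrete, the following is a minimal Python sketch, not the paper's actual encoding or data layout. It assumes each tuple is a sparse row of (column index, value) pairs, builds an LZW-style dictionary that is shared across rows but never lets a match cross a row boundary, and then computes a matrix-vector product by evaluating each dictionary entry once and reusing it. The names `toc_like_encode`, `decode_row`, and `matvec_compressed` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, float]   # (column index, value) of one non-zero cell
Row = List[Pair]           # one tuple (row) of a mini-batch
Entry = Tuple[int, Pair]   # (parent code, last pair); parent is -1 for a single pair


def toc_like_encode(rows: List[Row]) -> Tuple[List[List[int]], List[Entry]]:
    """LZW-style encoding with one dictionary shared by all rows.

    Matching restarts at the beginning of every row, so each row keeps
    its own list of codes (tuple boundaries are preserved).
    """
    entries: List[Entry] = []
    index: Dict[Entry, int] = {}

    def add(parent: int, pair: Pair) -> int:
        index[(parent, pair)] = len(entries)
        entries.append((parent, pair))
        return len(entries) - 1

    encoded: List[List[int]] = []
    for row in rows:
        codes: List[int] = []
        cur = -1  # no match in progress at the start of a row
        for pair in row:
            nxt = index.get((cur, pair))
            if nxt is not None:
                cur = nxt                # extend the current match
            elif cur == -1:
                cur = add(-1, pair)      # first sighting of this single pair
            else:
                codes.append(cur)        # emit the longest match found
                add(cur, pair)           # remember the longer sequence
                single = index.get((-1, pair))
                cur = single if single is not None else add(-1, pair)
        if cur != -1:
            codes.append(cur)            # flush the match at the row boundary
        encoded.append(codes)
    return encoded, entries


def decode_row(codes: List[int], entries: List[Entry]) -> Row:
    """Rebuild one row by walking parent links of each code."""
    row: Row = []
    for code in codes:
        seq: Row = []
        while code != -1:
            code, pair = entries[code]
            seq.append(pair)
        row.extend(reversed(seq))
    return row


def matvec_compressed(encoded: List[List[int]], entries: List[Entry],
                      v: List[float]) -> List[float]:
    """Compute A @ v on the encoded form without rebuilding the rows."""
    partial = [0.0] * len(entries)
    # A parent always has a smaller code than its child, so one pass suffices.
    for code, (parent, (col, val)) in enumerate(entries):
        base = partial[parent] if parent != -1 else 0.0
        partial[code] = base + val * v[col]
    return [sum(partial[c] for c in codes) for codes in encoded]


rows = [
    [(0, 1.0), (2, 3.0), (5, 2.0)],
    [(0, 1.0), (2, 3.0), (4, 1.0)],
    [(0, 1.0), (2, 3.0), (5, 2.0)],
]
v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
encoded, entries = toc_like_encode(rows)
print(encoded)                                  # [[0, 2, 4], [1, 6], [1, 4]]
print(matvec_compressed(encoded, entries, v))   # [22.0, 15.0, 22.0]
```

In this toy input the prefix `(0, 1.0), (2, 3.0)` recurs in the second and third rows, so it is stored once as dictionary entry 1 and its partial dot product is computed a single time. That reuse is the intuition behind operating on compressed data; the actual format and kernels in the paper may differ substantially.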
Authors
1. Fengan Li
2. Lingjiao Chen
3. Yijing Zeng
4. Arun Kumar
5. Xi Wu
6. Jeffrey F. Naughton
7. Jignesh M. Patel
Incoming Citations (Sorted by Pagerank)
Showing 8 of 8 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 683 | Cerebro: A Data System for Optimized Deep Learning Model Selection | 2020 | VLDB | 1.8195476e-04 |
| 1,940 | SliceLine: Fast, Linear-Algebra-based Slice Finding for ML Model Debugging | 2021 | SIGMOD | 1.0020173e-04 |
| 4,557 | Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches | 2021 | VLDB | 6.087611e-05 |
| 7,704 | ExDRa: Exploratory Data Science on Federated Raw Data | 2021 | SIGMOD | 4.6733838e-05 |
| 8,786 | AWARE: Workload-aware, Redundancy-exploiting Linear Algebra | 2023 | SIGMOD | 4.4521262e-05 |
| 8,864 | Cerebro: A Layered Data Platform for Scalable Deep Learning | 2021 | CIDR | 4.4326439e-05 |
| 10,286 | QStore: Quantization-Aware Compressed Model Storage | 2026 | VLDB | 4.1945683e-05 |
| 10,291 | Morphing-based Compression for Data-centric ML Pipelines | 2026 | VLDB | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 18 of 18 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 3,808 | SketchML: Accelerating Distributed Machine Learning with Data Sketches | 2018 | SIGMOD | 6.7455428e-05 |
| 8,786 | AWARE: Workload-aware, Redundancy-exploiting Linear Algebra | 2023 | SIGMOD | 4.4521262e-05 |
| 3,745 | DeepSqueeze: Deep Semantic Compression for Tabular Data | 2020 | SIGMOD | 6.7926132e-05 |
| 7,429 | CompressDB: Enabling Efficient Compressed Data Direct Processing for Various Databases | 2022 | SIGMOD | 4.7320139e-05 |
| 1,100 | Query Optimization In Compressed Database Systems | 2001 | SIGMOD | 1.4072277e-04 |
| 11,574 | An Evaluation of Methods of Compressing Doubles | 2020 | SIGMOD | 4.1945683e-05 |
| 9,595 | High-Ratio Compression for Machine-Generated Data | 2023 | SIGMOD | 4.3194469e-05 |
| 9,408 | Experimental Analysis of Large-scale Learnable Vector Storage Compression | 2024 | VLDB | 4.3441378e-05 |
| 8,657 | Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices | 2022 | VLDB | 4.4730648e-05 |
| 1,967 | Compressed Linear Algebra for Large-Scale Machine Learning | 2016 | VLDB | 9.9131712e-05 |