ANN Softmax: Acceleration of Extreme Classification Training
Summary: ANN Softmax is a GPU-optimized, sampling-based softmax for extreme classification with millions of classes. It uses an inverted-file index and binary quantization to boost recall. GPU kernels plus sample grouping preserve full-softmax accuracy with 1/10 sampling, achieving a 4.3x speedup and enabling 300M-class training on large GPU clusters.
(summarized by gpt-5-nano on Feb 09 2026)
- Paper ID: 12613
- Venue: VLDB
- Year: 2022
- Pagerank: 4.4626362e-05
- Overall Rank: 8,712 | 39.40%
- DOI: 10.14778/3485450.3485451
Incoming Non-self Citations Over Time
Incoming Citations (Sorted by Pagerank)
Showing 1 of 1 citing papers.
Outgoing Citations (Sorted by Pagerank)
Showing 13 of 13 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 34 | Similarity Search in High Dimensions via Hashing | 1999 | VLDB | 0.00076637636 |
| 212 | Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph | 2019 | VLDB | 0.00033913475 |
| 400 | Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search | 2007 | VLDB | 0.0002427237 |
| 411 | PyTorch Distributed: Experiences on Accelerating Data Parallel Training | 2020 | VLDB | 0.00023906921 |
| 513 | TURL: Table Understanding through Representation Learning | 2021 | VLDB | 0.00021288342 |
| 1,010 | HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces | 2018 | VLDB | 0.00014652858 |
| 1,269 | Cache locality is not enough: High-Performance Nearest Neighbor Search with Product Quantization Fast Scan | 2016 | VLDB | 0.00012930432 |
| 1,716 | Chimera: Large-Scale Classification using Machine Learning, Rules, and Crowdsourcing | 2014 | VLDB | 0.00010795718 |
| 3,056 | DSH: Data Sensitive Hashing for High-Dimensional k-NN Search | 2014 | SIGMOD | 7.6432146e-05 |
| 3,206 | Panorama: A Data System for Unbounded Vocabulary Querying over Video | 2020 | VLDB | 7.3826363e-05 |
| 3,363 | CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers | 2019 | VLDB | 7.1731921e-05 |
| 4,751 | ODIN: Automated Drift Detection and Recovery in Video Analytics | 2020 | VLDB | 5.9485403e-05 |
| 8,829 | A Distributed System for Large-scale n-gram Language Models at Tencent | 2019 | VLDB | 4.4406886e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
| --- | --- | --- | --- | --- |
| 2,688 | Accelerating Recommendation System Training by Leveraging Popular Choices | 2022 | VLDB | 8.2991144e-05 |
| 9,596 | Scalable Graph Convolutional Network Training on Distributed-Memory Systems | 2023 | VLDB | 4.319218e-05 |
| 8,808 | FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | 2023 | SIGMOD | 4.4454035e-05 |
| 8,375 | Fast Neural Ranking on Bipartite Graph Indices | 2022 | VLDB | 4.5326207e-05 |
| 10,528 | Two Birds with One Stone: Efficient Deep Learning over Mislabeled Data through Subset Selection | 2025 | SIGMOD | 4.1945683e-05 |
| 7,566 | ADGNN: Towards Scalable GNN Training with Aggregation-Difference Aware Sampling | 2023 | SIGMOD | 4.7089968e-05 |
| 10,042 | Accelerating High-Dimensional ANN Search via Skipping Redundant Distance Computations | 2026 | SIGMOD | 4.1945683e-05 |
| 3,293 | Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics | 2021 | VLDB | 7.2629834e-05 |
| 5,734 | Efficient Algorithms for Crowd-Aided Categorization | 2020 | VLDB | 5.3482904e-05 |
| 10,167 | FlashANNS: GPU-Driven Asynchronous I/O Pipelining for Eliminating Storage-Compute Bottlenecks in Billion-Scale Similarity Search | 2026 | SIGMOD | 4.1945683e-05 |