On the Efficiency of K-Means Clustering: Evaluation, Optimization, and Algorithm Selection
Summary: UniK unifies pruning-based accelerations of Lloyd's k-means in a single evaluation framework that reports a fine-grained performance breakdown. An optimized hybrid pruning strategy built on UniK improves efficiency, and a machine-learning model automatically selects the best accelerator. (summarized by gpt-5-nano on Feb 09 2026)
Authors
1. Sheng Wang
2. Yuan Sun
3. Zhifeng Bao
Incoming Citations (Sorted by Pagerank)
Showing 5 of 5 citing papers.
| Rank | Citing Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 8,670 | Marigold: Efficient k-means Clustering in High Dimensions | 2023 | VLDB | 4.4715132e-05 |
| 10,317 | Highly-Efficient Large-Scale k-means with Individual Fairness | 2026 | VLDB | 4.1945683e-05 |
| 10,716 | Federated and Balanced Clustering for High-dimensional Data | 2025 | VLDB | 4.1945683e-05 |
| 11,193 | Prerequisite-driven Fair Clustering on Heterogeneous Information Networks | 2023 | SIGMOD | 4.1945683e-05 |
| 11,219 | F3 KM: Federated, Fair, and Fast k-means | 2023 | SIGMOD | 4.1945683e-05 |
Outgoing Citations (Sorted by Pagerank)
Showing 9 of 9 cited papers.
Citations counted here include only citations to other VLDB/SIGMOD/CIDR/PODS papers in this database.
| Rank | Cited Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 91 | M-tree: An Efficient Access Method for Similarity Search in Metric Spaces | 1997 | VLDB | 0.0005181666 |
| 183 | Automatic Database Management System Tuning Through Large-scale Machine Learning | 2017 | SIGMOD | 0.00036721403 |
| 782 | QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning | 2019 | VLDB | 0.00016729063 |
| 936 | Framework for Evaluating Clustering Algorithms in Duplicate Detection | 2009 | VLDB | 0.0001521549 |
| 2,093 | Scalable K-Means++ | 2012 | VLDB | 9.5588104e-05 |
| 2,635 | NG-DBSCAN: Scalable Density-Based Clustering for Arbitrary Data | 2017 | VLDB | 8.4045788e-05 |
| 4,985 | Pivot-based Metric Indexing | 2017 | VLDB | 5.7856648e-05 |
| 5,508 | Fast Large-Scale Trajectory Clustering | 2020 | VLDB | 5.4713696e-05 |
| 8,168 | Evaluating Clustering in Subspace Projections of High Dimensional Data | 2009 | VLDB | 4.5701004e-05 |
Semantically Similar Papers
| Overall Rank | Paper | Year | Venue | Pagerank |
|---|---|---|---|---|
| 1,860 | Approximation Algorithms for Clustering Uncertain Data | 2008 | PODS | 0.0001028857 |
| 8,168 | Evaluating Clustering in Subspace Projections of High Dimensional Data | 2009 | VLDB | 4.5701004e-05 |
| 10,924 | Improved Approximation Algorithms for Relational Clustering | 2024 | PODS | 4.1945683e-05 |
| 12,571 | k-Means Projective Clustering | 2004 | PODS | 4.1945683e-05 |
| 10,317 | Highly-Efficient Large-Scale k-means with Individual Fairness | 2026 | VLDB | 4.1945683e-05 |
| 10,971 | Settling Time vs. Accuracy Tradeoffs for Clustering Big Data | 2024 | SIGMOD | 4.1945683e-05 |
| 11,045 | Ensemble Clustering based on Meta-Learning and Hyperparameter Optimization | 2024 | VLDB | 4.1945683e-05 |
| 10,943 | Efficient Algorithm for K-Multiple-Means | 2024 | SIGMOD | 4.1945683e-05 |
| 9,420 | Local Search Methods for k-Means with Outliers | 2017 | VLDB | 4.3441378e-05 |
| 2,093 | Scalable K-Means++ | 2012 | VLDB | 9.5588104e-05 |