Multi-Swap $k$-Means++

Beretta, Lorenzo; Cohen-Addad, Vincent; Lattanzi, Silvio; Parotsidis, Nikos

arXiv:2309.16384 (cs)

[Submitted on 28 Sep 2023 (v1), last revised 25 Oct 2024 (this version, v2)]

Abstract: The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' algorithm of choice for optimizing the popular $k$-means clustering objective and is known to give an $O(\log k)$-approximation in expectation. To obtain higher-quality solutions, Lattanzi and Sohler (ICML 2019) proposed augmenting $k$-means++ with $O(k \log \log k)$ local search steps obtained through the $k$-means++ sampling distribution to yield a $c$-approximation to the $k$-means clustering problem, where $c$ is a large absolute constant. Here we generalize and extend their local search algorithm by considering larger and more sophisticated local search neighborhoods, thereby allowing multiple centers to be swapped at the same time. Our algorithm achieves a $9 + \varepsilon$ approximation ratio, which is the best possible for local search. Importantly, we show that our approach yields substantial practical improvements: on several datasets, it produces significantly higher-quality solutions than the approach of Lattanzi and Sohler (ICML 2019).
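
To make the ideas in the abstract concrete, the following is a minimal Python sketch of $k$-means++ seeding followed by one multi-swap local search step. It is an illustration under stated assumptions, not the authors' implementation: the function names (`d2_sample`, `kmeanspp_seed`, `multi_swap_step`), the parameter `p` (number of centers swapped at once), the independent sampling of the `p` candidates, and the brute-force enumeration of which centers to remove are all hypothetical choices made for clarity. The paper's actual procedure and its running-time optimizations may differ.

```python
import itertools
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cost(points, centers):
    """k-means cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)


def d2_sample(points, centers, rng):
    """Draw one point with probability proportional to its squared
    distance to the nearest current center (the k-means++ distribution)."""
    weights = [min(sq_dist(pt, c) for c in centers) for pt in points]
    if sum(weights) == 0:
        # Every point coincides with a center; fall back to uniform.
        return rng.choice(points)
    return rng.choices(points, weights=weights, k=1)[0]


def kmeanspp_seed(points, k, rng):
    """k-means++ seeding: first center uniform, the rest by D^2 sampling."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(d2_sample(points, centers, rng))
    return centers


def multi_swap_step(points, centers, p, rng):
    """One local search step that swaps p centers at once.

    Assumption for this sketch: the p candidates are drawn independently
    by D^2 sampling, and every set of p current centers is tried as the
    set to remove. The swap is kept only if it lowers the cost.
    """
    candidates = [d2_sample(points, centers, rng) for _ in range(p)]
    best, best_cost = centers, cost(points, centers)
    for removed in itertools.combinations(range(len(centers)), p):
        trial = [c for i, c in enumerate(centers) if i not in removed]
        trial += candidates
        trial_cost = cost(points, trial)
        if trial_cost < best_cost:
            best, best_cost = trial, trial_cost
    return best
```

With `p = 1` this reduces to a single-swap step in the spirit of the Lattanzi and Sohler local search; larger `p` enlarges the neighborhood at the price of more candidate swaps to evaluate. A typical use would call `kmeanspp_seed(points, k, rng)` once and then apply `multi_swap_step` repeatedly, with `rng = random.Random(seed)` for reproducibility.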

Comments:	NeurIPS 2023
Subjects:	Computational Geometry (cs.CG); Machine Learning (cs.LG)
Cite as:	arXiv:2309.16384 [cs.CG]
	(or arXiv:2309.16384v2 [cs.CG] for this version)
	https://doi.org/10.48550/arXiv.2309.16384

Submission history

From: Lorenzo Beretta
[v1] Thu, 28 Sep 2023 12:31:35 UTC (2,168 KB)
[v2] Fri, 25 Oct 2024 18:14:44 UTC (2,169 KB)
