@scikit-learn/core-devs, following up on the IRL discussion from the meeting.

threadpoolctl is needed in the new implementation of KMeans (#11950) to prevent oversubscription due to nested BLAS calls inside an outer OpenMP loop. It's a single-file, pure-Python package: https://github.com/joblib/threadpoolctl.
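For context, here is a rough sketch of the kind of call this is about. It is not the code from #11950: `blas_limited` and `process_chunks` are hypothetical names, the plain Python loop only stands in for the outer OpenMP loop, and the no-op fallback when threadpoolctl is missing is an assumption made so the snippet runs anywhere. The only real threadpoolctl API used is `threadpool_limits`.

```python
from contextlib import nullcontext

try:
    # Real entry point of the threadpoolctl package.
    from threadpoolctl import threadpool_limits
except ImportError:
    # Assumption for this sketch only: fall back to no limiting.
    threadpool_limits = None


def blas_limited(n_threads=1):
    """Hypothetical helper: context manager capping BLAS threads."""
    if threadpool_limits is None:
        return nullcontext()
    return threadpool_limits(limits=n_threads, user_api="blas")


def process_chunks(chunks, inner):
    """Stand-in for the outer parallel loop.

    `inner` is called once per chunk while BLAS is restricted to a
    single thread, so nested BLAS calls cannot spawn extra threads.
    """
    with blas_limited(1):
        return [inner(chunk) for chunk in chunks]
```

With NumPy, `inner` could be something like `lambda c: c @ c.T`, which would dispatch to BLAS for large enough chunks. The limit is lifted again when the `with` block exits.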
- vendor
  The easiest and quickest way would be to vendor it in scikit-learn. There's even a PR ready for that ([MRG] Vendor threadpoolctl #14980). However, it adds yet another thing in externals :(. It also means that a bug fix in threadpoolctl would not be available until a new release of scikit-learn.
- dependency
  On the other hand, we can make it a dependency. For that we need it to be available on conda (conda-forge and the default channel). Also, it might be a bit overkill for a single call to threadpoolctl in all of scikit-learn.
What are your thoughts on this? (Feel free to edit to add more pros and cons.)