Description
Describe the workflow you want to enable
Since NumPy 1.17.0, np.random.RandomState can accept a Generator's ._bit_generator attribute as input in its constructor. This would be useful for those who use np.random.Generator in their code and want to share the same bit generator with sklearn's estimators. Currently this is not possible, see:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.manifold import TSNE

X, y = make_classification(n_samples=150, n_features=5, n_informative=5,
                           n_redundant=0, n_repeated=0, n_classes=3,
                           n_clusters_per_class=1,
                           weights=[0.01, 0.05, 0.94],
                           class_sep=0.8, random_state=0)
rng = np.random.default_rng(12345)
tsne = TSNE()
# some piece of code here
# then later we use our own rng to set the seed of `tsne`
# notice `_bit_generator` used here, which is compatible with RandomState
tsne.set_params(random_state=rng._bit_generator)
tsne.fit_transform(X, y)
This leads to the error:
File "/home/python3.6/site-packages/sklearn/manifold/_t_sne.py", line 932, in fit_transform
embedding = self._fit(X)
File "/home/python3.6/site-packages/sklearn/manifold/_t_sne.py", line 728, in _fit
random_state = check_random_state(self.random_state)
File "/home/python3.6/site-packages/sklearn/utils/validation.py", line 944, in check_random_state
' instance' % seed)
ValueError: <numpy.random.pcg64.PCG64 object at 0x7ffa3ab471b8> cannot be used to seed a numpy.random.RandomState instance
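For comparison, here is a minimal sketch of the manual wrapping that works today, assuming NumPy >= 1.17. It uses the public `bit_generator` property of the Generator rather than the private `_bit_generator` attribute; the name `legacy_rng` is only illustrative.

```python
import numpy as np

rng = np.random.default_rng(12345)

# Wrap the Generator's bit generator in a legacy RandomState.
# Both objects now draw from the same underlying PCG64 stream.
legacy_rng = np.random.RandomState(rng.bit_generator)

# The wrapper behaves like any other RandomState instance.
sample = legacy_rng.uniform(size=3)
```

Since check_random_state returns RandomState instances unchanged, `legacy_rng` could then be passed as `random_state=legacy_rng` to an estimator. The request here is to have that wrapping done inside check_random_state instead.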
Describe your proposed solution
I propose we add a conditional in check_random_state that supports an instance of BitGenerator, see:
scikit-learn/sklearn/utils/validation.py
Lines 926 to 944 in 2beed55
Something like:
supported_bitgenerators = {'PCG64', 'SFC64', 'Philox', ...}

def check_random_state(seed):
    ...
    if seed.__class__.__name__ in supported_bitgenerators:
        return np.random.RandomState(seed)  # should work if numpy>=1.17.0
    ...
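A slightly fuller sketch of the same idea, written as a standalone function so it can be tried without patching sklearn. It is not the actual sklearn implementation: the name `check_random_state_sketch` is hypothetical, the existing branches are reproduced from memory of the current behaviour, and the `isinstance` check against `np.random.BitGenerator` is an assumed alternative to the class-name lookup above (it relies on a NumPy release that exposes that base class under `np.random`).

```python
import numbers

import numpy as np


def check_random_state_sketch(seed):
    """Hypothetical variant of check_random_state that accepts a BitGenerator."""
    if seed is None or seed is np.random:
        # Fall back to NumPy's global RandomState singleton.
        return np.random.mtrand._rand
    if isinstance(seed, numbers.Integral):
        return np.random.RandomState(seed)
    if isinstance(seed, np.random.RandomState):
        return seed
    # Proposed extra branch: wrap any BitGenerator in a RandomState.
    if isinstance(seed, np.random.BitGenerator):
        return np.random.RandomState(seed)
    raise ValueError(
        "%r cannot be used to seed a numpy.random.RandomState instance" % seed
    )


# Usage: share one bit generator between a Generator and a RandomState.
rng = np.random.default_rng(12345)
random_state = check_random_state_sketch(rng.bit_generator)
```

Checking against the base class avoids maintaining a list of supported names and would also cover user-defined bit generators, at the cost of requiring a NumPy version that provides it.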
Describe alternatives you've considered, if relevant
I know there is an issue regarding supporting the new numpy Generator interface, but I feel this is slightly different since it does not attempt to replace RandomState.