Dear all,
I have been encountering the issue mentioned in the description for a while, and since #10533, where the issue is discussed, does not present a clear solution for Windows, I decided to post it here.
Firstly, the piece of code that runs into the issue is the following:

```python
qrf = RandomForestQuantileRegressor(n_jobs=-1)
parameters = {'n_estimators': [30, 40],        # More estimators will always give a better result.
              'criterion': ['mae'],
              'min_samples_split': [5, 10, 15],
              'max_features': [Xtrain.shape[1] // 3],  # For regression.
              'verbose': [50],
              'random_state': [0]              # For reproducibility of results.
              }
custom_cv = custom_cv_2folds(ntrain=Xtrain.shape[0],
                             nvalid=Xvalid.shape[0])
qrf_grid = GridSearchCV(qrf,
                        param_grid=parameters,
                        cv=custom_cv,
                        n_jobs=-1)
qrf_grid.fit(Xtrainvalid, ytrainvalid)
qrf_best = qrf_grid.best_estimator_  # We save the best estimator.
```
where `custom_cv` is the following cross-validation split:

```python
def custom_cv_2folds(ntrain, nvalid):
    # Indices for the training and validation partitions are yielded.
    idx_train = np.arange(0, ntrain, dtype=int)
    idx_valid = np.arange(ntrain, ntrain + nvalid, dtype=int)
    yield idx_train, idx_valid
```
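For illustration only (the sizes below are made up, not the real data set), the single train/validation split the generator yields can be inspected like this:

```python
import numpy as np

def custom_cv_2folds(ntrain, nvalid):
    # Same generator as above: one split over contiguous indices,
    # training rows first, validation rows after them.
    idx_train = np.arange(0, ntrain, dtype=int)
    idx_valid = np.arange(ntrain, ntrain + nvalid, dtype=int)
    yield idx_train, idx_valid

splits = list(custom_cv_2folds(ntrain=4, nvalid=2))
print(splits)  # one (train, valid) pair: [0 1 2 3] and [4 5]
```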
In addition, I think there is no need to provide the data used, since there are other issues similar to this one and they do not appear to be data-related.
As the issue has been going on for a while, the way it appears to me has changed slightly, so I will summarize its behaviour at each stage of its evolution:
At the beginning, I was just running the provided code. The cell in the Jupyter notebook I was executing hung forever. Shutting down the kernel and restarting it manually was of no use, since the Task Manager showed 100% CPU usage even after doing so, and the only way to bring my computer back to normal was by rebooting it. In addition to that, sometimes the piece of code provided above would run if the grid was smaller. An example of a `GridSearchCV` that does run successfully:

```python
parameters = {'n_estimators': list(range(10, 40, 10)),
              'criterion': ['mae'],
              'min_samples_split': list(range(2, 11, 4)),
              'max_features': [Xtrain.shape[1] // 3],
              'verbose': [50],
              'random_state': [0]
              }
```

I think the `n_estimators` hyper-parameter plays a great role in this.
In this second stage, I was reading about the issue and happened to come across #10533, in which it was recommended to install `cloudpickle` and add the following to my code:
```python
%env LOKY_PICKLER='cloudpickle'
import multiprocessing
multiprocessing.set_start_method('forkserver')
```
which I did.
However, as I am not using Linux (Windows instead), there is no way for me to use that method and set the `forkserver` start method.
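As a side note (a quick check of mine, not part of the original report): the start methods offered by `multiprocessing` depend on the platform, and `forkserver` is only available on POSIX systems, which can be verified with:

```python
import multiprocessing

# On Windows this prints ['spawn']; on Linux it typically prints
# ['fork', 'spawn', 'forkserver'], which is why 'forkserver' cannot
# be set on Windows at all.
print(multiprocessing.get_all_start_methods())
```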
What I did instead was to run my code again, and it turned out that now the verbose output reported `Using LokyBackend` and the following message showed up:

```
FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
ValueError: buffer source array is read-only
```
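For context (my own illustration of the mechanism, not taken from the traceback): when joblib's loky backend memory-maps large input arrays to share them with worker processes, the workers receive them read-only, and any code that then needs a writable buffer raises an error. The read-only behaviour itself can be reproduced on a plain array:

```python
import numpy as np

a = np.arange(5)
a.setflags(write=False)  # simulate a read-only (memmapped) array

try:
    a[0] = 99  # any write into the buffer now fails
except ValueError as e:
    print(e)
```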
I started looking into it and tried different things using `parallel_backend`.
Now, my code runs with `with parallel_backend('threading'):`. However, the execution of the trees happens to be out of order, unlike in the situations in which the `GridSearchCV` was successfully executed. For instance:
```
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 8 concurrent workers.
building tree 1 of 30
building tree 1 of 40
building tree 2 of 30
building tree 2 of 40
building tree 3 of 30
building tree 3 of 40
building tree 4 of 30
building tree 4 of 40
building tree 5 of 40
building tree 5 of 30
building tree 1 of 30
building tree 6 of 40
building tree 6 of 30
building tree 2 of 30
building tree 7 of 40
```
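The threading workaround can be sketched with a minimal, self-contained example (the squaring task is made up; in my case the workload is the `GridSearchCV.fit` call):

```python
from joblib import Parallel, delayed, parallel_backend

def square(x):
    return x * x

# Force joblib (and libraries that use it internally, such as
# scikit-learn) to use threads instead of loky worker processes,
# which avoids pickling and read-only memmapped inputs.
with parallel_backend('threading', n_jobs=2):
    results = Parallel()(delayed(square)(i) for i in range(5))

# Even if tasks execute interleaved across threads, Parallel
# returns the results in submission order.
print(results)  # [0, 1, 4, 9, 16]
```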
I cannot say yet whether the `fit` has finished successfully, since the training process takes quite a long time, but I will add that information as soon as it finishes.
My questions now are:
- Is there a reason why this is happening?
- Is there a workaround for Windows users?
- I am rather concerned about the out-of-order execution of the trees during the training process. Do you know why this is happening? Do you think it will be detrimental to the model results?
Thanks a lot in advance, and sorry for the long post, but I have included all the information I considered relevant so that nothing is missing for the investigation.
In case you need me to provide any additional details, just let me know.
Óscar