I am learning Machine Learning and exploring nested cross-validation.
I don't understand the example given in scikit-learn. The model seems to learn from the whole dataset, and the evaluation does not appear to be performed on a hold-out set. (References: scikit-learn documentation, scikit-learn implementation.)
# Loop for each trial
for i in range(NUM_TRIALS):
    # Choose cross-validation techniques for the inner and outer loops,
    # independently of the dataset.
    inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)

    # Nested CV with parameter optimization
    clf = GridSearchCV(estimator=svm, param_grid=p_grid, cv=inner_cv)
    nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)
    nested_scores[i] = nested_score.mean()
From what I read in Applied Predictive Modeling by Kuhn & Johnson, the model resulting from the inner loop should be evaluated on the hold-out set of the outer loop. The following post follows the same approach: machinelearningmastery blog
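To make the comparison concrete, here is a minimal sketch of the explicit outer loop as described above, placed next to the `cross_val_score` call from the scikit-learn example. The parameter grid values, the RBF kernel, the single trial, and `random_state=0` are assumptions chosen for illustration, not taken from the example. As far as I can tell, `cross_val_score` fits a clone of `clf` on each outer training split and scores it on the matching held-out split, so the two score arrays should agree.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Assumed grid and kernel, for illustration only.
p_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}

inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)

# Explicit version: tune on the outer training split only,
# then score the tuned model on the outer hold-out split.
manual_scores = []
for train_idx, test_idx in outer_cv.split(X, y):
    search = GridSearchCV(estimator=SVC(kernel="rbf"), param_grid=p_grid, cv=inner_cv)
    search.fit(X[train_idx], y[train_idx])  # inner loop sees training rows only
    manual_scores.append(search.score(X[test_idx], y[test_idx]))
manual_scores = np.array(manual_scores)

# Compact version, written as in the scikit-learn example.
clf = GridSearchCV(estimator=SVC(kernel="rbf"), param_grid=p_grid, cv=inner_cv)
auto_scores = cross_val_score(clf, X=X, y=y, cv=outer_cv)

print("explicit loop  :", manual_scores)
print("cross_val_score:", auto_scores)
```

If the two printed arrays match, the compact form is only a shorter way of writing the explicit loop, and each outer fold's score comes from rows the grid search never saw.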
As I am far from a Python expert, could you tell me the advantages, drawbacks and purposes of both of these implementations?
I read #21621 but I am not sure if it really answers my question. If it does, let me know and I will try to carefully understand it.