[Changed behavior] n_iter_no_change should be attached with early_stopping, not model #19743

Closed
@IchiruTake

Description

I used a neural network (MLPRegressor) to check this behavior. Surprisingly, n_iter_no_change is applied to the model's training directly rather than only to early stopping. This follows the documentation, but it makes the hyper-parameter confusing. The data is a 4-input AND gate. This happens on versions 0.22 through 0.24.
Proposed solution: change n_iter_no_change so that this hyper-parameter takes effect only if early_stopping=True.
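
For readers unfamiliar with the parameter, here is a minimal sketch of the kind of patience rule the documentation describes: training is considered converged once the loss has not improved by at least tol for n_iter_no_change consecutive epochs. This is a simplified model in plain Python, not scikit-learn's actual code. The function name epochs_until_stop is hypothetical, and the exact counting (for example, whether the comparison is strict) is an assumption.

```python
def epochs_until_stop(losses, tol=1e-4, n_iter_no_change=10):
    """Return the 1-based epoch at which a patience rule would stop, or None.

    Simplified assumption: stop once the loss has failed to beat the best
    loss seen so far by at least `tol` for `n_iter_no_change` consecutive
    epochs. The real library may count slightly differently.
    """
    best = float("inf")
    stalled = 0  # consecutive epochs without sufficient improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss > best - tol:
            stalled += 1
        else:
            stalled = 0
        if loss < best:
            best = loss
        if stalled >= n_iter_no_change:
            return epoch
    return None


# A loss curve that improves once and then plateaus for 20 epochs.
plateau = [1.0, 0.5] + [0.5] * 20
short_patience = epochs_until_stop(plateau, n_iter_no_change=10)
long_patience = epochs_until_stop(plateau, n_iter_no_change=100)
print(short_patience, long_patience)
```

With a small patience the rule fires during the plateau. With a patience longer than the run it never fires, which is the difference between the two cases below.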

import numpy as np
from time import time

X = np.array([[0, 0, 0, 0], [0, 0, 0, 1], 
              [0, 0, 1, 0], [0, 0, 1, 1],
              [0, 1, 0, 0], [0, 1, 0, 1],
              [0, 1, 1, 0], [0, 1, 1, 1],
              [1, 0, 0, 0], [1, 0, 0, 1],
              [1, 0, 1, 0], [1, 0, 1, 1],
              [1, 1, 0, 0], [1, 1, 0, 1],
              [1, 1, 1, 0], [1, 1, 1, 1],], dtype=np.uint8)
Y = np.array([[0]] * 15 + [[1]], dtype=np.uint8)

CASE #1:

import sklearn
from sklearn.neural_network import MLPRegressor, MLPClassifier

print(sklearn.__version__)
start = time()
# Case 1: small patience (n_iter_no_change=10); early_stopping is left at its default (False).
model = MLPRegressor(hidden_layer_sizes=(16, 4, ), activation='logistic', solver='adam', max_iter=7500,
                     shuffle=False, n_iter_no_change=10, learning_rate_init=0.001, nesterovs_momentum=True)

model.fit(X, Y.ravel())
pred = model.predict(X)
print(pred)
print("Executing Time: {:.6f}s".format(time() - start))

CASE #2:

import sklearn
from sklearn.neural_network import MLPRegressor, MLPClassifier

print(sklearn.__version__)
start = time()
# Case 2: identical to Case 1 except that n_iter_no_change equals max_iter (7500).
model = MLPRegressor(hidden_layer_sizes=(16, 4, ), activation='logistic', solver='adam', max_iter=7500,
                     shuffle=False, n_iter_no_change=7500, learning_rate_init=0.001, nesterovs_momentum=True)

model.fit(X, Y.ravel())
pred = model.predict(X)
print(pred)
print("Executing Time: {:.6f}s".format(time() - start))

Result for Case #1:

pred = [0.14804349 0.13967124 0.14385876 0.13574606 0.1296776  0.12198994
 0.12610933 0.11865247 0.13068479 0.12283299 0.12688654 0.11929599
 0.11350414 0.10649617 0.11028053 0.10350576]

Result for Case #2:

pred = [ 1.55467171e-02  5.70094411e-04  3.03478798e-03 -6.76589593e-03
  1.42867468e-03 -7.64987634e-03 -7.58148435e-03  5.68008762e-03
 -7.20951138e-03 -3.52291628e-03 -3.47743642e-03  5.96549015e-03
 -5.02314026e-03  7.11164102e-03  7.02257007e-03  9.94591169e-01]
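
To check whether the gap between the two results comes from training stopping early, one option is to compare the fitted n_iter_ attribute for both settings. The snippet below is a sketch under assumptions: the helper fit_and_report is hypothetical, random_state=0 is added so both runs start from the same weights, and the data is cast to float to avoid dtype surprises. It is not the exact script from the cases above.

```python
import warnings
from itertools import product

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

# 4-input AND gate: the target is 1 only for the input [1, 1, 1, 1].
X = np.array(list(product([0, 1], repeat=4)), dtype=float)
y = np.array([0] * 15 + [1], dtype=float)


def fit_and_report(patience):
    """Fit with the given n_iter_no_change; return (epochs run, final loss)."""
    model = MLPRegressor(hidden_layer_sizes=(16, 4), activation='logistic',
                         solver='adam', max_iter=7500, shuffle=False,
                         n_iter_no_change=patience, learning_rate_init=0.001,
                         random_state=0)  # assumption: fixed seed for comparability
    with warnings.catch_warnings():
        # Reaching max_iter raises a ConvergenceWarning; silence it here.
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        model.fit(X, y)
    return model.n_iter_, model.loss_


n_short, loss_short = fit_and_report(10)
n_long, loss_long = fit_and_report(7500)
print("n_iter_no_change=10   -> epochs:", n_short, "loss:", loss_short)
print("n_iter_no_change=7500 -> epochs:", n_long, "loss:", loss_long)
```

If the first run reports far fewer epochs than max_iter, the training-loss stopping rule fired even though early_stopping was left at False.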

Note: training was slightly faster on 0.24.1 than on 0.22:
Version 0.22.post1: Executing Time: 2.344778s
Version 0.24.1: Executing Time: 2.290114s (2.38 % better)
