SGDClassifier under/overflow  #3040

Closed
@worldveil

Description

Example code:

from sklearn.linear_model import SGDClassifier
from sklearn.datasets import load_iris
from sklearn import cross_validation

iris = load_iris()

hyperparameter_choices = [ 

    # some examples, by no means exhaustive
    {u'loss': 'modified_huber', u'shuffle': True, u'n_iter': 25.0, 
    u'l1_ratio': 0.5, u'learning_rate': 'constant', u'fit_intercept': 0.0, 
    u'penalty': 'l2', u'alpha': 1000.0, u'eta0': 0.1, u'class_weight': None},

    {u'loss': 'squared_hinge', u'shuffle': True, u'n_iter': 25.0, u'l1_ratio': 0.5, 
    u'learning_rate': 'optimal', u'fit_intercept': 0.0, u'penalty': 'elasticnet', 
    u'alpha': 0.001, u'eta0': 0.1, u'class_weight': None},

    {u'loss': 'squared_hinge', u'shuffle': True, u'n_iter': 100.0, u'l1_ratio': 0.5, 
    u'learning_rate': 'optimal', u'fit_intercept': 0.0, u'penalty': 'l2', u'alpha': 0.001, 
    u'eta0': 0.001, u'class_weight': None}
]

for params in hyperparameter_choices:
    try:
        clf = SGDClassifier(**params)
        scores = cross_validation.cross_val_score(clf, iris.data, iris.target, cv=5)
    except ValueError as ve:
        print("ValueError: %s" % ve)

I'm not sure whether these are simply bad hyperparameter choices for SGD in general. If they are not, this looks like a numerical stability bug.

The same under/overflow also occurs when the data is scaled first.
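
For the first configuration (constant learning rate, eta0=0.1, alpha=1000.0), a rough sanity check suggests the hyperparameters alone may be enough to explain the overflow. The sketch below assumes a textbook SGD step with L2 weight decay, w <- w * (1 - eta * alpha), and ignores the loss gradient entirely. It is not taken from the SGDClassifier source, and the helper name `l2_decay_norms` is made up for illustration.

```python
import numpy as np


def l2_decay_norms(alpha, eta, n_steps=200, w0=1.0):
    """Apply only the L2 shrink step w <- w * (1 - eta * alpha).

    Assumption: a textbook SGD update with L2 weight decay; the loss
    gradient is ignored. Returns the per-step factor and |w| after each step.
    """
    w = np.float64(w0)
    factor = 1.0 - eta * alpha
    history = []
    # Silence the overflow warning so the divergence shows up as inf.
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(n_steps):
            w = w * factor
            history.append(abs(w))
    return factor, history


# First configuration from the issue: alpha=1000.0, eta0=0.1, constant rate.
bad_factor, bad_hist = l2_decay_norms(alpha=1000.0, eta=0.1)
print("factor:", bad_factor, "overflowed:", bool(np.isinf(bad_hist[-1])))

# Same eta with alpha=0.001 for comparison.
ok_factor, ok_hist = l2_decay_norms(alpha=0.001, eta=0.1)
print("factor:", ok_factor, "final |w|:", ok_hist[-1])
```

Under that assumption, each step multiplies the weights by a factor whose magnitude is far greater than 1 whenever eta * alpha is much larger than 2, so the weights grow geometrically until they overflow. This does not explain the two squared_hinge configurations, which use learning_rate='optimal' and a small alpha.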
