Unexpected Normalization #2601

@FlorianWilhelm

In the sklearn.linear_model.base.center_data(...) function, when normalize=True and fit_intercept=True are provided, the standard deviation of X is calculated by
X_std = np.sqrt(np.sum(X ** 2, axis=0))
I think it should rather read:
X_std = np.sqrt(np.mean(X ** 2, axis=0))
Or is there a special reason why you sum here instead of taking the mean? With the sum, X_std grows without bound as the number of rows (samples) in X increases, even when the spread of the data stays the same. This seems odd to me and quite unexpected.
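
To make the difference concrete, here is a minimal sketch that compares the two formulas on synthetic data. It does not call scikit-learn; the helper names `scale_sum` and `scale_mean` are hypothetical, and the sketch assumes X is centered column-wise before the scale is computed, as would be the case with fit_intercept=True.

```python
import numpy as np

rng = np.random.default_rng(0)


def scale_sum(X):
    # Formula quoted in the issue: L2 norm of each centered column.
    Xc = X - X.mean(axis=0)
    return np.sqrt(np.sum(Xc ** 2, axis=0))


def scale_mean(X):
    # Proposed alternative: root mean square of each centered column,
    # which equals the population standard deviation (ddof=0).
    Xc = X - X.mean(axis=0)
    return np.sqrt(np.mean(Xc ** 2, axis=0))


# Same distribution, different number of samples (rows).
X_small = rng.normal(size=(100, 3))
X_large = rng.normal(size=(10_000, 3))

for X in (X_small, X_large):
    n = X.shape[0]
    print(n, scale_sum(X), scale_mean(X))
```

The two quantities differ by exactly a factor of sqrt(n_samples): the sum-based value keeps growing as rows are added, while the mean-based value matches np.std(X, axis=0) and stays near the true spread of the data.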
