Skip to content

API Deprecate n_iter in favour of max_iter for TSNE #28471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Feb 23, 2024

Conversation

lucyleeow
Copy link
Member

@lucyleeow lucyleeow commented Feb 20, 2024

Reference Issues/PRs

closes #7736
closes #7761 (supercedes)

What does this implement/fix? Explain your changes.

Amends n_iter in favour of max_iter in TSNE because we have a convergence criterion and is more consistent with other estimators.

Any other comments?

I think if we use max_iter we should probably amend n_iter_without_progress to be max_iter_without_progress as well.

@lucyleeow lucyleeow changed the title deprecate n_iter API Deprecate n_iter in favour of max_iter for TSNE Feb 20, 2024
Copy link

github-actions bot commented Feb 20, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c9f6351. Link to the linter CI: here

@@ -617,10 +618,22 @@ class TSNE(ClassNamePrefixFeaturesOutMixin, TransformerMixin, BaseEstimator):
.. versionchanged:: 1.2
The default value changed to `"auto"`.

n_iter : int, default=1000
max_iter : int, default=None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the correct thing to do is here. From looking at other deprecations it seems that the docstring mentions the effective default value (I looked at make_sparse_spd_matrix in sklearn/datasets/_samples_generator.py), so like this:

Suggested change
max_iter : int, default=None
max_iter : int, default=1000

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes that is much better. I was following #25697 - I think it may be worth updating the docstring of BayesianRidge and ARDRegression even.

Copy link
Member

@betatim betatim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, except for the comment about the docstring where I'm not sure and the linter's complaints need fixing.

@lucyleeow
Copy link
Member Author

lucyleeow commented Feb 20, 2024

@betatim thanks for the review! Do you think it is worthwhile changing n_iter_without_progress to be max_iter_without_progress for consistency (and semantically, it makes more sense, but another deprecation). Happy to do this in this PR.

@lucyleeow
Copy link
Member Author

So the common tests check_estimators_overwrite_params and check_dont_overwrite_parameters fail because I am updating max_iter during fit. It is used too late to do the parameter check then. We could:

  • add xfail (I've done this)
  • add self.max_iter_ attribute and use this in the subsequent functions. This would be more annoying to remove when the time comes. (happy to do this instead, or another option)

@betatim
Copy link
Member

betatim commented Feb 20, 2024

Do you think it is worthwhile changing n_iter_without_progress to be max_iter_without_progress for consistency (and semantically, it makes more sense, but another deprecation). Happy to do this in this PR.

Probably yes. Like you I think it would be nice for consistency, but it would mean more deprecation. However, I think a large fraction of people who use n_iter_without_progress are also using n_iter. So they are already "there" in the code to make a change, this means the cost of changing an additional parameter name is not so high?

Not sure what to recommend. I think I'd implement the change and see what others say.

@jeremiedbb
Copy link
Member

Do you think it is worthwhile changing n_iter_without_progress to be max_iter_without_progress for consistency (and semantically, it makes more sense, but another deprecation). Happy to do this in this PR.

In other estimators it's usually named n_iter_no_change so I wouldn't change to max_iter_without_progress. I think that if we want to change it we should make it consistent with other estimators. I doesn't need to be part of this PR though.

@lucyleeow
Copy link
Member Author

Will leave n_iter_without_progress here and it can be changed to match other estimators later. I think in general 'max'/'max_iter_no_change' makes more sense semantically but its better that the estimators are consistent and probably not worth changing.

@lucyleeow
Copy link
Member Author

lucyleeow commented Feb 23, 2024

I've amended the OP to 'closes' because I don't think BernoulliRBM needs amending to max_iter as there is no convergence criterion. So this PR would close the issue.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @lucyleeow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API consistency manifold
3 participants