DOC Release Highlights for version 1.6 #30392
I'm not sure if the pipeline's transform input is something we should write about in this release, since there's no real example out there where this is now useful. WDYT?
Since it's one of only two "major features" of this release, I find it sad not to showcase it in the highlights. Can we make a toy example, even if we can't benefit from it directly in sklearn but third-party libraries might?
OK, let me know what you think about it then. I added a non-executable piece of code.
I have added free-threaded highlights, which I pretty much copied from the changelog entry. I chose to do this rather than having a shorter description with a link to the changelog entry, to save one click. Let me know if you prefer the latter option!
I think a non-executable snippet is fine, thanks!
Yeah, it's better, since the highlights are already linked in the changelog, so there's no need to loop around once more.
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
Could we add the newton-cholesky solver? The one PR of this release is not that big, but it completes a larger journey and we have not advertised it much.
threshold_classifier = FixedThresholdClassifier(
    estimator=FrozenEstimator(classifier), threshold=0.9
)
Do we want to call fit? Maybe one way to show that this is a no-op is to show some timing:
import time
from sklearn.datasets import make_classification
from sklearn.frozen import FrozenEstimator
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import FixedThresholdClassifier
X, y = make_classification(n_samples=1_000, random_state=0)
start = time.time()
classifier = SGDClassifier().fit(X, y)
print(f"Fitting the classifier took {(time.time() - start) * 1_000:.2f} milliseconds")
start = time.time()
threshold_classifier = FixedThresholdClassifier(
    estimator=FrozenEstimator(classifier), threshold=0.9
).fit(X, y)
print(
    f"Fitting the threshold classifier took {(time.time() - start) * 1_000:.2f} milliseconds"
)
Fitting the classifier took 2.53 milliseconds
Fitting the threshold classifier took 0.61 milliseconds
and add an extra conclusion line.
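Timings this small can be noisy, so another way to illustrate the no-op is to check that the wrapped model's learned state does not change. The sketch below uses only the standard library: `ToyMeanClassifier` and `ToyFrozen` are hypothetical names made up for illustration, not scikit-learn classes, and the wrapper is an assumption about the general pattern rather than how `FrozenEstimator` is actually implemented.

```python
class ToyMeanClassifier:
    """Hypothetical stand-in for a real estimator: learns the mean of X."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        # Count how many times fit actually ran.
        self.n_fits_ = getattr(self, "n_fits_", 0) + 1
        return self

    def predict(self, X):
        # Predict 1 for values above the learned mean, 0 otherwise.
        return [int(x > self.mean_) for x in X]


class ToyFrozen:
    """Hypothetical wrapper: fit does nothing, other attributes are delegated."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        # No-op: the wrapped estimator is deliberately left untouched.
        return self

    def __getattr__(self, name):
        # Only reached for attributes not defined on the wrapper itself.
        return getattr(self.estimator, name)


inner = ToyMeanClassifier().fit([1.0, 2.0, 3.0])
frozen = ToyFrozen(inner)

# "Refitting" on very different data must not change the learned mean.
returned = frozen.fit([100.0, 200.0, 300.0])
predictions = frozen.predict([1.5, 2.5])
print(inner.mean_, inner.n_fits_, predictions)
```

The same kind of check could be written against the real classes by comparing a fitted attribute of `classifier` before and after fitting the threshold classifier.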
Only some comments regarding the grammar and two nitpicks.
I agree that it could have been highlighted back then, but I'm not very comfortable putting it in the highlights of 1.6 when it was released in 1.2. The highlights should be about the new stuff that was not there previously. I don't think this is the best place to communicate more about it. Maybe @koaning would be interested in making a video about that?
First pass of feedback:
I would be +1 about advertising the That should not prevent anybody else from advertising it even more in blog/social media posts or videos.
@jeremiedbb I can for sure make another video for the scikit-learn YouTube channel, but I usually prefer to start work on that once the actual release is live and tested.
Alright, would you @ogrisel or @lorentzenchr mind writing this section? I haven't followed that in detail, so you'll be a lot more precise and accurate than me :)
Let me give it a shot.
I pushed f0669ce to highlight the work on the new solver. I toyed a bit with generating synthetic multiclass data where it would make a difference in terms of convergence to a better model, but it's not easy to find the regime where it really shines, so in the end I just added a paragraph with a link to the benchmark results from the PR. I checked that I can still reproduce them from the current
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
…ease-highlights-1.6
So I can't approve my own PR, but since I didn't write most of it, I give my +1 anyway 😄 Is it good for you as well? If so, please give your approval so that we can merge it and continue the release process :)
LGTM as well.
Co-authored-by: adrinjalali <adrin.jalali@gmail.com> Co-authored-by: Loïc Estève <loic.esteve@ymail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Candidates for the highlights
FrozenEstimator
Transform metadata in Pipeline
Missing value support in ExtraTreesClassifier/Regressor
fetch_file
@ogrisel do we want to showcase an example for that? If so, what file should we download?
News on array api support
News on metadata routing support
Free threading support
ping @lesteve, would you mind writing this item? You'll be more accurate than me :)
Developer API
ping @adrinjalali who kindly proposed to write something for frozen estimator, metadata in pipeline and developer API.
cc/ @scikit-learn/communication-team We plan to release 1.6.0 final this week.
cc/ @scikit-learn/core-devs Feel free to correct any inaccuracies that I may have introduced or add items that I have missed.