DOC Improve user guide on scoring parameter #30316
Conversation
:class:`model_selection.GridSearchCV`, :func:`model_selection.validation_curve` and
:class:`linear_model.LogisticRegressionCV`) take a ``scoring`` parameter that
Wanted to give an example outside of the `model_selection` module.
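For readers of this thread, a minimal sketch of the point above, assuming only public scikit-learn APIs (the `balanced_accuracy` choice is arbitrary): `LogisticRegressionCV` accepts the same `scoring` parameter as the `model_selection` tools.

```python
# Sketch: `scoring` is not limited to model_selection tools;
# LogisticRegressionCV accepts it too.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

X, y = make_classification(random_state=0)
clf = LogisticRegressionCV(scoring="balanced_accuracy").fit(X, y)
```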
Maybe @ArturoAmorQ or @StefanieSenger who looked at the related PR may be interested in taking a look?
I like these changes a lot, @lucyleeow! These add valuable information and structure this part of the user guide better.
I found some typos, otherwise it's looking very good.
doc/modules/model_evaluation.rst (Outdated)
This is not discussed on this page, but in each estimator's documentation.
Most commonly this is mean :ref:`accuracy <accuracy_score>` for classifiers and the
:ref:`coefficient of determination <r2_score>` (:math:`R^2`) for regressors.
Details for each estimator can be found in it's documentation.
Suggested change:
- Details for each estimator can be found in it's documentation.
+ Details for each estimator can be found in its documentation.
I think "each estimator" is plural, right? So should it be "their" documentation?
I had to look it up but I think it is singular: https://editorsmanual.com/articles/each-singular-or-plural/ ?
doc/modules/model_evaluation.rst (Outdated)
estimators `score` method) is used.
* :ref:`String name <scoring_string_names>`: common metrics can be passed via a string
  name.
* :ref:`Callable <scoring_callable>`: more complex metrics can be passed via a callable
Suggested change:
- * :ref:`Callable <scoring_callable>`: more complex metrics can be passed via a callable
+ * :ref:`Callable <scoring_callable>`: more complex metrics or custom metrics
+   can be passed via a callable
Hmm what about:
"more complex metrics can be passed via a custom metric callable"?
I don't think 'complex metric' and 'custom metric' have an 'or' relationship, if that makes sense?
> Hmm what about:
> "more complex metrics can be passed via a custom metric callable"?
> I don't think 'complex metric' and 'custom metric' have an 'or' relationship, if that makes sense?
I like your suggestion.
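A short illustrative sketch of the two options in the bullet list under discussion; the `SVC` estimator and the F-beta metric are arbitrary choices for illustration, not taken from the diff.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

# String name: common metrics can be passed via a string name.
cross_val_score(SVC(), X, y, scoring="accuracy")

# Callable: a more complex metric passed via a custom metric callable.
cross_val_score(SVC(), X, y, scoring=make_scorer(fbeta_score, beta=2))
```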
biclustering.

functions to measure clustering performance. For more information see the
:ref:`clustering_evaluation` section for instance clustering, and
Suggested change:
- :ref:`clustering_evaluation` section for instance clustering, and
+ :ref:`clustering_evaluation` section for clustering, and
I don't think this is a typo; I think "instance clustering" refers to clustering of single rows (instances), to differentiate it from biclustering, though I understand it's a technical term and can read as confusing (I think we can keep as is?).
I see, I wasn't aware it was a technical term before.
Thanks for the PR @lucyleeow! Here's a first batch of comments.
doc/modules/model_evaluation.rst (Outdated)
@@ -11,13 +11,16 @@ predictions:

* **Estimator score method**: Estimators have a ``score`` method providing a
  default evaluation criterion for the problem they are designed to solve.
  This is not discussed on this page, but in each estimator's documentation.
  Most commonly this is mean :ref:`accuracy <accuracy_score>` for classifiers and the
I have the feeling that "mean" here is somewhat ambiguous, does it mean the macro average over classes?
I was stumbling over this as well.
What is meant here, I believe, is that `accuracy_score()` returns the average accuracy over all samples in the test set with `return float(_average(score, weights=sample_weight, normalize=normalize))`.
The formulation "mean accuracy" is used inconsistently across our docs and I think this inconsistency is what makes it most confusing. And also, most people would take the average here for granted.
What do you think, @lucyleeow?
Yeah, good pick up, I have also noticed that we use "mean accuracy" and "accuracy" somewhat inconsistently in our docs.
My initial thought was that we use "mean" for multi-label cases, but I realise that is not the case.
From our docs, 'accuracy' is "either the fraction (default) or the count (`normalize=False`) of correct predictions", so I think by definition it's 'over all samples in the test set' (and if we are returning the count, there is no averaging here at all).
I would vote for removing 'mean' where we are referring to the accuracy score.
> I would vote for removing 'mean' where we are referring to the accuracy score.
Yes, I think it's clearer. That could also be a new PR.
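A tiny illustration of the fraction-vs-count point quoted above (the toy labels are chosen arbitrarily):

```python
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

print(accuracy_score(y_true, y_pred))                   # 0.75 -- fraction of correct predictions
print(accuracy_score(y_true, y_pred, normalize=False))  # 3 -- count of correct predictions
```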
doc/modules/model_evaluation.rst (Outdated)
>>> my_custom_loss_func(y, clf.predict(X))
0.69...
>>> score(clf, X, y)
-0.69...

.. _diy_scoring:
By doing a grep, it seems that `diy_scoring` is not referenced elsewhere in the documentation. Maybe the preamble paragraph here was meant to avoid breaking the cross-reference, but it can now be moved completely inside the dropdown, to simplify a bit.
To clarify, are you suggesting that we move the "Custom scorer objects from scratch" section into a dropdown, and then we'd rename the above "Custom scorer objects using `make_scorer`" section to just be called "Custom scorer objects"?
Edit: nevermind, I got it after reading your next comment.
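For context, a hedged reconstruction of the doctest being quoted: the toy data and `DummyClassifier` setup are assumptions based on the surrounding user guide example, chosen so the loss evaluates to log(2) ≈ 0.69.

```python
# Reconstruction (assumed context) of the quoted doctest: with
# greater_is_better=False, make_scorer negates the loss, so the scorer
# returns -0.69 where the raw loss returns 0.69.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import make_scorer

def my_custom_loss_func(y_true, y_pred):
    diff = np.abs(np.asarray(y_true) - np.asarray(y_pred)).max()
    return np.log1p(diff)

score = make_scorer(my_custom_loss_func, greater_is_better=False)
X, y = [[1], [1]], [0, 1]
clf = DummyClassifier(strategy="most_frequent").fit(X, y)
print(my_custom_loss_func(y, clf.predict(X)))  # 0.69...
print(score(clf, X, y))                        # -0.69...
```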
doc/modules/model_evaluation.rst (Outdated)
@@ -171,59 +196,61 @@ measuring a prediction error given ground truth and prediction:

the ``greater_is_better`` parameter to ``False`` (``True`` by default; see the
parameter description below).

.. _scoring_make_scorer:
I would rather not remove this dropdown, as this is already a really long piece of the documentation and it's somewhat easy to get lost here. What we could do to keep the reference working is adding a preamble paragraph and then hide the details in a dropdown, similarly to what was done for the "Implementing your own scoring object" section.
Okay, the outline of sections is now:

* :ref:`scoring_adapt_metric` (least flexible)
* :ref:`scoring_make_scorer`
  * Using `make_scorer` (more flexible)
  * From scratch (most flexible)

The last 2 sub-bullets are both in their own dropdown.
doc/modules/model_evaluation.rst (Outdated)
Implementing your own scoring object
------------------------------------
Custom scorer objects from scratch
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can generate even more flexible model scorers by constructing your own
scoring object from scratch, without using the :func:`make_scorer` factory.
I just noticed that the wording "without using the :func:`make_scorer` factory" is not factually correct, as the example shown in the "Using custom scorers in functions where n_jobs > 1" note does require `make_scorer` (line 270 in main).
Hmm yes, this is a good point. AFAICT the `make_scorer` is irrelevant here; we're really just showing that we should import the scorer.
Potentially this section has just been moved to live with the DIY custom scorer section with the dropdown re-shuffling?
I am going to remove the `make_scorer` part as it's not relevant.
Actually, this note should apply to both types of custom scorers, so I am going to put it in its own dropdown at the end?
But happy to change, just let me know.
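For readers following along, a hedged sketch of a from-scratch scorer: per the user guide, the only contract is a callable with signature `(estimator, X, y)` returning a float. The scoring logic below is an arbitrary toy example, not the PR's content.

```python
# Sketch of a "from scratch" scorer, i.e. without the make_scorer factory:
# any callable with signature (estimator, X, y) returning a float works.
import numpy as np

def custom_scorer(estimator, X, y):
    # Toy scoring logic (an assumption for illustration): plain accuracy
    # computed from the estimator's predictions.
    y_pred = estimator.predict(X)
    return float(np.mean(y_pred == np.asarray(y)))

# Per the note discussed above: for n_jobs > 1, define the scorer in an
# importable module (rather than an interactive session) and import it.
```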
Thank you @StefanieSenger and @ArturoAmorQ! I think I have addressed your comments.
Thanks for your work, @lucyleeow!
I've read through it again in the local build: it looks and reads very nice.
Just a last nitpick but otherwise LGTM :)
@@ -175,7 +175,7 @@ def cross_validate(

If `scoring` represents a single score, one can use:
I cannot comment on line 173, but now, instead of cross-linking to `<model_evaluation>`, it seems more pertinent to link to `<scoring_api_overview>`.
Thank you! Done!
Thanks @lucyleeow, merging!
Reference Issues/PRs
Follows on from #30303; while referencing the user guide, I thought it could be improved.
What does this implement/fix? Explain your changes.
* Added a list of options at the start of the "`scoring` parameter:" section - allows users to quickly see all options and click to the most relevant
* Took the `make_scorer` section out of the dropdown. I think it didn't make sense for this to be in a dropdown inside a section titled "Defining your scoring strategy from metric functions" and it is of similar 'level' to the other sections detailing callable scorers

Any other comments?
Some changes are opinionated, happy to change anything.