Skip to content

DOC Added references to the Matthews correlation coefficient function in the user guide #19958

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 6, 2024

Conversation

davidechicco
Copy link
Contributor

Reference Issues/PRs

What does this implement/fix? Explain your changes.

I am proposing to include the reference to two articles of mine about the Matthews correlation coefficient because they describessome mathematical properties of MCC, some mathematical relationships between MCC and other measures (F1 score and accuracy; and balanced accuracy, bookmaker informedness, and markedness), and some use cases about how it works and what message it generates.
I think this information might be useful for the scikit-learn users who would like to know more about MCC.
I already proposed to include one of these two references in the MCC code function, but other users suggested me to add it to the user guide instead.

I am available to provide additional information, of course.

Thanks for considering my pull request.

Any other comments?

@davidechicco davidechicco changed the title DOC Added references to the Matthews correlation coefficient function DOC Added references to the Matthews correlation coefficient function in the user guide Apr 22, 2021
@glemaitre
Copy link
Member

I was under the impression that we add references in our user guide to back-up claims or explanations. While I am not against adding references, I think that it should be more motivated by adding insightful discussions and not just a drop-off of a reference section with no direct links with the discussion.

I feel that sometimes our user guide is a kind of a metrics catalogue and might insights such as "which metrics to choose?", "what are the links between metrics?", etc. I think that references could come together with such explanations.

I see that @thomasjpfan and @jnothman were inclined to such addition so I was wondering about their thoughts.

@thomasjpfan
Copy link
Member

Most of these references are from the docstring:

References
----------

Now that I look at the git blame, [1] is for original implementation, [2] is wikipedia, and [3] and [4] are for the multiclass implementation. In this contenxt, I do not think the new reference "The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy.. " should be included here. Maybe the best place to put this reference is a new example that compares MCC with other metrics.

@davidechicco
Copy link
Contributor Author

Hi @glemaitre and @thomasjpfan
Thanks for considering my request. The reason why I am suggesting the inclusion of these two articles as references in the user guide is not their suggestion to use the MCC instead of the other cited rates (that would be out of scope). Instead, I am suggesting to include them because they contain some detailed descriptions of the mathematical properties of the Matthews correlation coefficient (in the Mathematical background sections of the two articles) that are missing in the 3 articles already cited and in the Wikipedia reference.

I think a user who wants to know more about the mathematical foundations of the MCC would be interested in reading them. That's why I suggest their inclusion.

Best,

-- Davide

@davidechicco
Copy link
Contributor Author

Hi
Is there any news regarding this request of mine? Thank you

-- Davide

@cmarmo cmarmo added the Needs Decision Requires decision label Feb 14, 2022
Copy link

github-actions bot commented Apr 15, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c59452d. Link to the linter CI: here

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that last reference would need a separate example/section to be justified. Otheriwise LGTM. I've removed that last one.

WDYT @glemaitre

@glemaitre
Copy link
Member

For me, it would be enough to reference the wikipedia page that itself reference the other paper. The additional sentence does not justify why we add 4 references.

@glemaitre glemaitre closed this May 18, 2024
@glemaitre glemaitre reopened this May 18, 2024
@glemaitre
Copy link
Member

Whoops, I did not mean to close the PR.

@adrinjalali
Copy link
Member

For me, it would be enough to reference the wikipedia page that itself reference the other paper. The additional sentence does not justify why we add 4 references.

I don't mind. Removed the extra ones. I guess you can merge this one now @glemaitre

@@ -1277,6 +1278,13 @@ function:
>>> matthews_corrcoef(y_true, y_pred)
-0.33...

.. topic:: References:

.. [WikipediaMCC2021] Wikipedia contributors. Detection error tradeoff.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm there is something wrong here since it specify the DET instead of the MCC?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Changed that, and using the Phi coefficient page directly since MCC is redirected here.

@glemaitre glemaitre merged commit 9a19995 into scikit-learn:main Jun 6, 2024
30 checks passed
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jul 2, 2024
… in the user guide (scikit-learn#19958)

Co-authored-by: adrinjalali <adrin.jalali@gmail.com>
@jeremiedbb jeremiedbb mentioned this pull request Jul 2, 2024
11 tasks
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jul 2, 2024
… in the user guide (scikit-learn#19958)

Co-authored-by: adrinjalali <adrin.jalali@gmail.com>
jeremiedbb pushed a commit that referenced this pull request Jul 2, 2024
… in the user guide (#19958)

Co-authored-by: adrinjalali <adrin.jalali@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants