FEA confusion matrix derived metrics #17265


Conversation

@haochunchang haochunchang commented May 18, 2020

Reference Issues/PRs

Take over PR #15532
Adding Fall-out, Miss rate, specificity as metrics #5516

What does this implement/fix? Explain your changes.

Implemented a function which returns fpr, tpr, fnr, tnr.

  • Modify weighted average part
  • Add tests in test_classification.py
  • Add to test_common.py

Any other comments?

As this comment mentioned, tn, fp, fn, tp can also be calculated from the confusion matrix.
Does this function provide a more flexible way of calculating the rates?
Any advice and help are much appreciated.
Co-authored by @ddhar1 @samskruthireddy
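For reference, a minimal sketch of the binary case using the existing ``sklearn.metrics.confusion_matrix`` API (the labels below are made up for illustration; this is not the PR's implementation):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels, for illustration only.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# confusion_matrix returns counts ordered as [[tn, fp], [fn, tp]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # true positive rate (recall / sensitivity)
fpr = fp / (fp + tn)  # false positive rate (fall-out)
tnr = tn / (tn + fp)  # true negative rate (specificity)
fnr = fn / (fn + tp)  # false negative rate (miss rate)
print(tpr, fpr, tnr, fnr)
```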

haochunchang commented May 20, 2020

I have added the fpr_tpr_fnr_tnr_scores function to test_common.py, but I am not sure if I placed it correctly.
I am also not sure whether the added tests in test_classification.py are sufficient.

Maybe @amueller @jnothman can review this when you have time :) ?

@haochunchang haochunchang changed the title [WIP]Confusion matrix derived metrics [MRG]Confusion matrix derived metrics May 20, 2020
@glemaitre glemaitre changed the title [MRG]Confusion matrix derived metrics FEA confusion matrix derived metrics Sep 7, 2020
@cmarmo cmarmo left a comment

Thanks @haochunchang for your pull request. Sorry for the late answer. If you could find some time to fix the conflicts, that would be very helpful. Thanks!

Comment on lines 1540 to 1605
labels : list, optional
    The set of labels to include when ``average != 'binary'``, and their
    order if ``average is None``. Labels present in the data can be
    excluded, for example to calculate a multiclass average ignoring a
    majority negative class, while labels not present in the data will
    result in 0 components in a macro average. For multilabel targets,
    labels are column indices. By default, all labels in ``y_true`` and
    ``y_pred`` are used in sorted order.

pos_label : str or int, 1 by default
    The class to report if ``average='binary'`` and the data is binary.
    If the data are multiclass or multilabel, this will be ignored;
    setting ``labels=[pos_label]`` and ``average != 'binary'`` will report
    scores for that label only.

average : string, [None (default), 'binary', 'micro', 'macro', 'samples', 'weighted']
    If ``None``, the scores for each class are returned. Otherwise, this
    determines the type of averaging performed on the data:

    ``'binary'``:
        Only report results for the class specified by ``pos_label``.
        This is applicable only if targets (``y_{true,pred}``) are binary.
    ``'micro'``:
        Calculate metrics globally by counting the total true positives,
        false negatives and false positives.
    ``'macro'``:
        Calculate metrics for each label, and find their unweighted
        mean. This does not take label imbalance into account.
    ``'weighted'``:
        Calculate metrics for each label, and find their average weighted
        by support (the number of true instances for each label). This
        alters 'macro' to account for label imbalance.
    ``'samples'``:
        Calculate metrics for each instance, and find their average (only
        meaningful for multilabel classification where this differs from
        :func:`accuracy_score`).

warn_for : tuple or set, for internal use
    This determines which warnings will be made in the case that this
    function is being used to return only one of its metrics.

sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

zero_division : "warn", 0 or 1, default="warn"
    Sets the value to return when there is a zero division:

    - tpr, fnr: when there are no positive labels
    - fpr, tnr: when there are no negative labels

    If set to "warn", this acts as 0, but warnings are also raised.

Returns
-------
tpr : float (if average is not None) or array of float, shape = [n_unique_labels]
    True positive rate.

fpr : float (if average is not None) or array of float, shape = [n_unique_labels]
    False positive rate.

tnr : float (if average is not None) or array of float, shape = [n_unique_labels]
    True negative rate.

fnr : float (if average is not None) or array of float, shape = [n_unique_labels]
    False negative rate.

Do you mind checking the scikit-learn guidelines for writing documentation and homogenizing the parameter and attribute descriptions? Thanks.
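As a side note, the ``average`` options documented in the quoted docstring can be illustrated with a small numpy-only sketch (the per-class counts below are made up; this is not the PR's code, just the usual definitions of micro/macro/weighted averaging):

```python
import numpy as np

# Hypothetical one-vs-rest counts, one entry per label (illustration only).
tp = np.array([30, 10, 5])
fn = np.array([5, 2, 10])
fp = np.array([4, 8, 3])
tn = np.array([61, 80, 82])

per_class_tpr = tp / (tp + fn)                        # average=None
macro_tpr = per_class_tpr.mean()                      # average='macro'
micro_tpr = tp.sum() / (tp.sum() + fn.sum())          # average='micro'
support = tp + fn                                     # true instances per label
weighted_tpr = np.average(per_class_tpr, weights=support)  # average='weighted'
```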

cmarmo commented Sep 29, 2020

@haochunchang I see you are online right now: do you mind fixing the description, referring to #15522? Thanks a lot! And thanks for coming back to this!

@haochunchang haochunchang force-pushed the confusion-matrix-derived-metrics branch from a4652ca to f74fc10 on October 5, 2020 14:36
@haochunchang
Hi!
I have changed some of the argument descriptions, such as naming the default values and array shapes.
If you have the time, please review them, thanks!

@vaibhavmehrotraml vaibhavmehrotraml left a comment

Reviewed the documentation and code; looks good to me. Since this is my first contribution, I cannot say with full confidence whether the documentation follows the guidelines.

warn_for=('tpr', 'fpr', 'tnr', 'fnr'), sample_weight=None, zero_division="warn"):
"""Compute TPR, FPR, TNR, FNR for each class

The TPR is the ratio ``tp / (tp + fn)`` where ``tp`` is the number of

The TPR, also called sensitivity or recall, is the ratio ...

Might be more informative

The FPR is the ratio ``fp / (tn + fp)`` where ``tn`` is the number of
true negatives and ``fp`` the number of false positives.

The TNR is the ratio ``tn / (tn + fp)`` where ``tn`` is the number of

The TNR, also called specificity or selectivity, is the ratio

Might be more informative

fp_sum = MCM[:, 0, 1]
fn_sum = MCM[:, 1, 0]
tp_sum = MCM[:, 1, 1]
pred_sum = tp_sum + MCM[:, 0, 1]

Any specific reason to not use fp_sum instead of MCM[:, 0, 1]?
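For context on the indexing discussed here, a small sketch using the existing ``sklearn.metrics.multilabel_confusion_matrix`` API (the labels are made up, and the ``errstate`` guard is only an illustration of zero-division handling, not the PR's exact code):

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Hypothetical multiclass labels, for illustration only.
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 2, 2, 1, 1, 0, 2]

# MCM has shape (n_labels, 2, 2); for each label the 2x2 block is
# [[tn, fp],
#  [fn, tp]]
MCM = multilabel_confusion_matrix(y_true, y_pred)

tn_sum = MCM[:, 0, 0]
fp_sum = MCM[:, 0, 1]
fn_sum = MCM[:, 1, 0]
tp_sum = MCM[:, 1, 1]

# Per-class rates; suppress warnings when a class has no positive
# or no negative samples (the denominators become zero).
with np.errstate(divide="ignore", invalid="ignore"):
    tpr = tp_sum / (tp_sum + fn_sum)
    fpr = fp_sum / (fp_sum + tn_sum)
    tnr = tn_sum / (tn_sum + fp_sum)
    fnr = fn_sum / (fn_sum + tp_sum)
```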

@glemaitre

closing in favor of #19556

@glemaitre glemaitre closed this Jul 29, 2021
@haochunchang haochunchang deleted the confusion-matrix-derived-metrics branch June 6, 2022 15:50