FEA confusion matrix derived metrics #17265


Conversation

@haochunchang haochunchang commented May 18, 2020

Reference Issues/PRs

Take over PR #15532
Adding Fall-out, Miss rate, specificity as metrics #5516

What does this implement/fix? Explain your changes.

Implemented a function which returns fpr, tpr, fnr, tnr.

  • Modify weighted average part
  • Add tests in test_classification.py
  • Add to test_common.py

Any other comments?

As this comment mentioned, tn, fp, fn, tp can also be calculated from the confusion matrix.
Does this function provide a more flexible way of calculating the rates?
Any advice and help are much appreciated.
Co-authored by @ddhar1 @samskruthireddy
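For reference, a minimal sketch of the binary case using the existing ``sklearn.metrics.confusion_matrix`` API (the labels below are made up for illustration; this is not the PR's implementation):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical binary labels, for illustration only.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

# confusion_matrix returns counts ordered as [[tn, fp], [fn, tp]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # true positive rate (recall / sensitivity)
fpr = fp / (fp + tn)  # false positive rate (fall-out)
tnr = tn / (tn + fp)  # true negative rate (specificity)
fnr = fn / (fn + tp)  # false negative rate (miss rate)
print(tpr, fpr, tnr, fnr)
```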

haochunchang commented May 20, 2020

I have added the fpr_tpr_fnr_tnr_scores function to test_common.py, but I am not sure if I placed it correctly.
I am also not sure whether the added tests in test_classification.py are sufficient.

Maybe @amueller @jnothman can review this when you have time :) ?

@haochunchang haochunchang changed the title [WIP]Confusion matrix derived metrics [MRG]Confusion matrix derived metrics May 20, 2020
@glemaitre glemaitre changed the title [MRG]Confusion matrix derived metrics FEA confusion matrix derived metrics Sep 7, 2020
@cmarmo cmarmo left a comment

Thanks @haochunchang for your pull request. Sorry for the late answer. If you could find some time to fix the conflicts, that would be very helpful. Thanks!

Comment on lines 1540 to 1605
labels : list, optional
    The set of labels to include when ``average != 'binary'``, and their
    order if ``average is None``. Labels present in the data can be
    excluded, for example to calculate a multiclass average ignoring a
    majority negative class, while labels not present in the data will
    result in 0 components in a macro average. For multilabel targets,
    labels are column indices. By default, all labels in ``y_true`` and
    ``y_pred`` are used in sorted order.

pos_label : str or int, 1 by default
    The class to report if ``average='binary'`` and the data is binary.
    If the data are multiclass or multilabel, this will be ignored;
    setting ``labels=[pos_label]`` and ``average != 'binary'`` will report
    scores for that label only.

average : string, [None (default), 'binary', 'micro', 'macro', 'samples', 'weighted']
    If ``None``, the scores for each class are returned. Otherwise, this
    determines the type of averaging performed on the data:

    ``'binary'``:
        Only report results for the class specified by ``pos_label``.
        This is applicable only if targets (``y_{true,pred}``) are binary.
    ``'micro'``:
        Calculate metrics globally by counting the total true positives,
        false negatives and false positives.
    ``'macro'``:
        Calculate metrics for each label, and find their unweighted
        mean. This does not take label imbalance into account.
    ``'weighted'``:
        Calculate metrics for each label, and find their average weighted
        by support (the number of true instances for each label). This
        alters 'macro' to account for label imbalance.
    ``'samples'``:
        Calculate metrics for each instance, and find their average (only
        meaningful for multilabel classification where this differs from
        :func:`accuracy_score`).

warn_for : tuple or set, for internal use
    This determines which warnings will be made in the case that this
    function is being used to return only one of its metrics.

sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.

zero_division : "warn", 0 or 1, default="warn"
    Sets the value to return when there is a zero division:

    - tpr, fnr: when there are no positive labels
    - fpr, tnr: when there are no negative labels

    If set to "warn", this acts as 0, but warnings are also raised.

Returns
-------
tpr : float (if average is not None) or array of float, shape = [n_unique_labels]
    True positive rate.

fpr : float (if average is not None) or array of float, shape = [n_unique_labels]
    False positive rate.

tnr : float (if average is not None) or array of float, shape = [n_unique_labels]
    True negative rate.

fnr : float (if average is not None) or array of float, shape = [n_unique_labels]
    False negative rate.

Do you mind checking the scikit-learn guidelines for writing documentation and homogenizing the parameter and attribute descriptions? Thanks.
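As a side note, the ``average`` options documented in the quoted docstring can be illustrated with a small numpy-only sketch (the per-class counts below are made up; this is not the PR's code, just the usual definitions of micro/macro/weighted averaging):

```python
import numpy as np

# Hypothetical one-vs-rest counts, one entry per label (illustration only).
tp = np.array([30, 10, 5])
fn = np.array([5, 2, 10])
fp = np.array([4, 8, 3])
tn = np.array([61, 80, 82])

per_class_tpr = tp / (tp + fn)                        # average=None
macro_tpr = per_class_tpr.mean()                      # average='macro'
micro_tpr = tp.sum() / (tp.sum() + fn.sum())          # average='micro'
support = tp + fn                                     # true instances per label
weighted_tpr = np.average(per_class_tpr, weights=support)  # average='weighted'
```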

cmarmo commented Sep 29, 2020

@haochunchang I see you are online right now: do you mind fixing the description, referring to #15522? Thanks a lot! And thanks for coming back to this!

@haochunchang haochunchang force-pushed the confusion-matrix-derived-metrics branch from a4652ca to f74fc10 on October 5, 2020 14:36
@haochunchang
Hi!
I have changed some of the argument descriptions, such as naming the default values and array shapes.
If you have the time, please review them, thanks!

@vaibhavmehrotraml vaibhavmehrotraml left a comment

Reviewed the documentation and code; looks good to me. Since this is my first contribution, I cannot say with full confidence whether the documentation follows the guidelines.

warn_for=('tpr', 'fpr', 'tnr', 'fnr'), sample_weight=None, zero_division="warn"):
"""Compute TPR, FPR, TNR, FNR for each class

The TPR is the ratio ``tp / (tp + fn)`` where ``tp`` is the number of

The TPR, also called sensitivity or recall, is the ratio ...

Might be more informative

The FPR is the ratio ``fp / (tn + fp)`` where ``tn`` is the number of
true negatives and ``fp`` the number of false positives.

The TNR is the ratio ``tn / (tn + fp)`` where ``tn`` is the number of

The TNR, also called specificity or selectivity, is the ratio

Might be more informative

fp_sum = MCM[:, 0, 1]
fn_sum = MCM[:, 1, 0]
tp_sum = MCM[:, 1, 1]
pred_sum = tp_sum + MCM[:, 0, 1]

Any specific reason to not use fp_sum instead of MCM[:, 0, 1]?
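For context on the indexing discussed here, a small sketch using the existing ``sklearn.metrics.multilabel_confusion_matrix`` API (the labels are made up, and the ``errstate`` guard is only an illustration of zero-division handling, not the PR's exact code):

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Hypothetical multiclass labels, for illustration only.
y_true = [0, 1, 2, 2, 1, 0, 2]
y_pred = [0, 2, 2, 1, 1, 0, 2]

# MCM has shape (n_labels, 2, 2); for each label the 2x2 block is
# [[tn, fp],
#  [fn, tp]]
MCM = multilabel_confusion_matrix(y_true, y_pred)

tn_sum = MCM[:, 0, 0]
fp_sum = MCM[:, 0, 1]
fn_sum = MCM[:, 1, 0]
tp_sum = MCM[:, 1, 1]

# Per-class rates; suppress warnings when a class has no positive
# or no negative samples (the denominators become zero).
with np.errstate(divide="ignore", invalid="ignore"):
    tpr = tp_sum / (tp_sum + fn_sum)
    fpr = fp_sum / (fp_sum + tn_sum)
    tnr = tn_sum / (tn_sum + fp_sum)
    fnr = fn_sum / (fn_sum + tp_sum)
```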

@glemaitre

closing in favor of #19556

@glemaitre glemaitre closed this Jul 29, 2021
@haochunchang haochunchang deleted the confusion-matrix-derived-metrics branch June 6, 2022 15:50