-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] Add plotting module with heatmaps for confusion matrix and grid search results #9173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
can you fix the flake8 issues? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very first round ;)
doc/modules/classes.rst
Outdated
:no-inherited-members: | ||
|
||
This module is experimental. Use at your own risk. | ||
Use of this module requires the matplotlib library. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe state version? 1.5?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified to: "Use of this module requires the matplotlib library,
version 1.5 or later (preferably 2.0)."
:toctree: generated/ | ||
:template: function.rst | ||
|
||
plot_heatmap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add functions here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added. this was WIP in previous PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't count on users to read this message. My two cents is that if you really want users to notice that the plotting module is experimental, you'd have to put it in a sub module "experimental" or "future", and only move it to the main namespace when the API is stable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unclear to me what it would mean for the API to be stable, and I really don't like forcing people to change their code later. I would probably just remove the warning here and then do standard deprecation cycles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing deprecation cycles is forcing people to change their code eventually, with the additional risk that they won't know that this code is experimental :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deprecation cycles only sometimes require users to change their code - if they are actually using the feature you're deprecating. That's not very common for most deprecations in scikit-learn. And that is only if there is actually a change.
And I'm not sure what's experimental about this code. The experiment is more having plotting inside scikit-learn. Since it's plotting and therefor user facing, I'd rather have a warning on every call then putting it in a different module.
I guess the thing we are trying to communicate is "don't build long-term projects relying on the presence of plotting in scikit-learn because we might remove it again".
generate the heatmap. | ||
""" | ||
|
||
import matplotlib.pyplot as plt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if we want actual tests of the functionality (I'm leaning no), but I think we want at least smoke-tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added smoke-tests which just run the code without asserts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm looks like the config we use for coverage doesn't have matplotlib. We should change that...
sklearn/plot/_confusion_matrix.py
Outdated
if normalize: | ||
values = values.astype('float') / values.sum(axis=1)[:, np.newaxis] | ||
|
||
print(title) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print?
sklearn/plot/_gridsearch_results.py
Outdated
from sklearn.plot import plot_heatmap | ||
|
||
|
||
def plot_gridsearch_results(cv_results, param_grid, metric='mean_test_score', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only works for 2d grid-searches, right? I think we should also support 1d, and error for anything else.
except ImportError: | ||
raise SkipTest("Not testing plot_heatmap, matplotlib not installed.") | ||
|
||
import matplotlib.pyplot as plt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm it looks like matplotlib is not installed for the service that computes coverage? hm...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea this seems to be a problem because my tests cover much more than what Codecov is showing. i wasn't able to check coverage locally as I'm using a mac and running matplotlib from terminal is giving errors which I'm not able to resolve yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have you resolved those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope these are tricky to resolve. it requires me to re-install python using a different method and then i'll have to setup my scikit-learn development environment again. since sklearn doesn't test a lot on matplotlib, i'm just using a jupyter-notebook for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should discuss this in person.
Codecov Report
@@ Coverage Diff @@
## master #9173 +/- ##
==========================================
- Coverage 96.3% 96.13% -0.18%
==========================================
Files 332 337 +5
Lines 60549 60754 +205
==========================================
+ Hits 58314 58403 +89
- Misses 2235 2351 +116
Continue to review full report at Codecov.
|
I think we should also give the user an option to plot everything as a single line graph. This will allow them to get a good plot results in case number of unique parameters are 3 or more but the total number of cases are within say 20. Adding x-axis labels will be a challenge. But we can give each combo an ID and then print a table below showing the value of each parameter for each ID on the plot. Thoughts? |
Also, could you help me understand the circleCI error (details here): Exception occurred: ./build_tools/circle/build_doc.sh returned exit code 2 Action failed: ./build_tools/circle/build_doc.sh |
@aarshayj can you merge master? This is an issue with old sphinx, I think, which should be fixed in master. |
…cikit-learn into plot_confusion_matrix
Continues PR 8082
Changes made: