-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] Add pprint for estimators - continued #11705
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
best if you ping after the 0.20 RC is out
… |
can you add a test with a pipeline inside a grid-search please? And the pipeline maybe has a ColumnTransformer? ;) |
because callable() returns True also for class objects (which we want to reprensent with their name as well anyway)
Don't worry about python 2 support. It will be dropped in the next release.
|
So due to some limitations of the current method, I completely changed the implementation and relied on pprint.PrettyPrinter instead. Here are some output examples:
The default is to indent after the name of the estimator, but it's expensive in line width. Unfortunately,
From the docs. Example of discrepancy:
will give
Here is what the output looks like on the same examples when indenting with 4 characters:
I personally like the first output. It might render quite long strings when estimators are nested but I don't think that it's a major issue, and it's very easy to read. ping @amueller |
Yes, I much prefer the first. Good idea to reuse PrettyPrinter. Will that be configurable for us wanting to hide default parameter settings, etc? |
Everything is encapsulated in I noticed that the |
I think it should be a list of tuples and am surprised that it is not.
|
I fixed it, here is the new look:
I'll put this as MRG for reviews, and if it gets approved I'll start updating the docs with the new formatting to fix the tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've realised the change is a bit weak on documentation. Not sure where to put it in the user guide, though. Or whether print_changed_only
should be illustrated in the example gallery.
Otherwise, I'm keen to see this road tested.
doc/whats_new/v0.21.rst
Outdated
@@ -191,6 +191,10 @@ Support for Python 3.4 and below has been officially dropped. | |||
Multiple modules | |||
................ | |||
|
|||
- The `__repr__()` method of all estimators (used when calling | |||
`print(estimator)`) has been entirely re-written. :issue:`11705` by | |||
:user:`Nicolas Hug <NicolasHug>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
print_changed_only
needs to be mentioned in what's new. I think it should also be mentioned somewhere in the documentation more explicitly.
doc/whats_new/v0.21.rst
Outdated
@@ -191,6 +191,10 @@ Support for Python 3.4 and below has been officially dropped. | |||
Multiple modules | |||
................ | |||
|
|||
- The `__repr__()` method of all estimators (used when calling | |||
`print(estimator)`) has been entirely re-written. :issue:`11705` by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"building on Python's pretty printing standard library."
@@ -185,12 +185,19 @@ Support for Python 3.4 and below has been officially dropped. | |||
``max_depth`` by 1 while expanding the tree if ``max_leaf_nodes`` and | |||
``max_depth`` were both specified by the user. Please note that this also | |||
affects all ensemble methods using decision trees. | |||
:pr:`12344` by :user:`Adrin Jalali <adrinjalali>`. | |||
:issue:`12344` by :user:`Adrin Jalali <adrinjalali>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrinjalali I took the liberty to fix this
I've added an example in EDIT: well, actually, I removed it. It broke the other doctests. Could have made it work be setting again |
We don't need another example of how to use config, we need somewhere in
the user guide that explicitly describes what this feature can do for the
user. An example in the gallery might be appropriate.
|
Where do I reference this example from in the doc? |
link it from the set_config docs maybe? I don't think there's a single section on that in the user guide, though... |
Isn't it automatically linked? |
to the API docs, yes. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
User guide could consider adding a section on working with scikit-learn in a terminal? But not in this PR.
Note that the example is automatically referenced at the bottom of https://41116-843222-gh.circle-artifacts.com/0/doc/modules/generated/sklearn.set_config.html |
Yay! Thanks and well done @NicolasHug!!
|
Hurray! Thanks heaps!! Sorry for not giving feedback on this. Anyhow, I love it. Any plans to add back the color work, that @amueller had originally implemented? If it could be enabled by default, it would help usability. |
@GaelVaroquaux I'd love to see that, but I figured "small steps" would be better. This is already a big improvement imho. |
@GaelVaroquaux I'd love to see that, but I figured "small steps" would be
better. This is already a big improvement imho.
+1!
|
* add pprint for estimators * strip color from length, add color option * Minor cleaning, fixes, factoring and docs * Added some basic tests * Fixed line length issue * fixed flake8 and added visual test for review * Fixed test * Fixed Python 2 issues (inspect.signature import) * Trying to fix flake8 again * Added special repr for functions * Added some other visual tests * Changed _format_function in to _format_callable because callable() returns True also for class objects (which we want to reprensent with their name as well anyway) * Consistent output in Python 2 and 3 * WIP * Now using the builtin pprint module * pep8 * Added changed_only param * Fixed printing when string would fit in less than line width * Fixed printing of steps parameter * Fixed changed_only param for short estimators * fixed pep8 * Added some more description in docstring * changed_only is now an option from set_config() * Put _pprint.py into sklearn/utils, added tests * Added doctest NORMALIZE_WHITESPACE where needed * Fixed tests * fix test-doc * fixing test that passed before.... * Fixed tests * Added test for changed_only and long lines * typo * Added authors names * Added license file * Added ellipsis based on number of elements in sequence + added increasinly aggressive repr strategies * Updated whatsnew * dont use increaingly aggressive strategy * Fixed tests * Removed LICENSE file and put license text in _pprint.py * fixed test_base * Sorted parameters dictionary for consistent output in 3.5 * Actually using OrderedDict... * Addressed comments * Added test for NaN changed parameter * Update whatsnew * Added example to set_config() * Removed example * Added example in gallery * Spelling
* add pprint for estimators * strip color from length, add color option * Minor cleaning, fixes, factoring and docs * Added some basic tests * Fixed line length issue * fixed flake8 and added visual test for review * Fixed test * Fixed Python 2 issues (inspect.signature import) * Trying to fix flake8 again * Added special repr for functions * Added some other visual tests * Changed _format_function in to _format_callable because callable() returns True also for class objects (which we want to reprensent with their name as well anyway) * Consistent output in Python 2 and 3 * WIP * Now using the builtin pprint module * pep8 * Added changed_only param * Fixed printing when string would fit in less than line width * Fixed printing of steps parameter * Fixed changed_only param for short estimators * fixed pep8 * Added some more description in docstring * changed_only is now an option from set_config() * Put _pprint.py into sklearn/utils, added tests * Added doctest NORMALIZE_WHITESPACE where needed * Fixed tests * fix test-doc * fixing test that passed before.... * Fixed tests * Added test for changed_only and long lines * typo * Added authors names * Added license file * Added ellipsis based on number of elements in sequence + added increasinly aggressive repr strategies * Updated whatsnew * dont use increaingly aggressive strategy * Fixed tests * Removed LICENSE file and put license text in _pprint.py * fixed test_base * Sorted parameters dictionary for consistent output in 3.5 * Actually using OrderedDict... * Addressed comments * Added test for NaN changed parameter * Update whatsnew * Added example to set_config() * Removed example * Added example in gallery * Spelling
…)" This reverts commit 5f34303.
…)" This reverts commit 5f34303.
* add pprint for estimators * strip color from length, add color option * Minor cleaning, fixes, factoring and docs * Added some basic tests * Fixed line length issue * fixed flake8 and added visual test for review * Fixed test * Fixed Python 2 issues (inspect.signature import) * Trying to fix flake8 again * Added special repr for functions * Added some other visual tests * Changed _format_function in to _format_callable because callable() returns True also for class objects (which we want to reprensent with their name as well anyway) * Consistent output in Python 2 and 3 * WIP * Now using the builtin pprint module * pep8 * Added changed_only param * Fixed printing when string would fit in less than line width * Fixed printing of steps parameter * Fixed changed_only param for short estimators * fixed pep8 * Added some more description in docstring * changed_only is now an option from set_config() * Put _pprint.py into sklearn/utils, added tests * Added doctest NORMALIZE_WHITESPACE where needed * Fixed tests * fix test-doc * fixing test that passed before.... * Fixed tests * Added test for changed_only and long lines * typo * Added authors names * Added license file * Added ellipsis based on number of elements in sequence + added increasinly aggressive repr strategies * Updated whatsnew * dont use increaingly aggressive strategy * Fixed tests * Removed LICENSE file and put license text in _pprint.py * fixed test_base * Sorted parameters dictionary for consistent output in 3.5 * Actually using OrderedDict... * Addressed comments * Added test for NaN changed parameter * Update whatsnew * Added example to set_config() * Removed example * Added example in gallery * Spelling
Reference Issues/PRs
This is the continuation of #9099 that aims to have better printing representations for estimators
Close #9099
Close #9039
Close #7618
Fix #6323
What does this implement/fix? Explain your changes.
Fixed some line length issuesFactored code for Estimator and Pipeline into a single methodPrettyPrinter
classAny other comments?
I marked this as WIP but a first review may be useful. I tried to make the code easier to understand but it's always a bit hard with those kind of functionalities.
Line length limit is still not perfect, see e.g. the test with
RFE(...)
that goes beyond 80 chars because the printed estimator is actually an argument.Questions
_pprint.py
be inutils
instead of at the root? (test file would follow)_pprint()
method inbase.py
and use the Formatter class in__repr__
instead?