[WIP/RFC] Test docstring parameters (with order) #9023
Conversation
See also #7793. I think there's no doubt that something like this could improve documentation quality. One big challenge is handling exceptional cases, including deprecation and *args. I had grand plans to make something like this more descriptive, outputting a diff between the parameters and the docstring, so that it effectively fixes things for you. I have some code somewhere; let me know if I should pull it out and share it. (Note that diffing docstrings is only feasible in some cases, e.g. where the docstring appears directly in the function in question.) But what I'd really like to see a test for is that fitted model attributes correspond to those in the docstring.
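The "diff" idea above could be sketched as follows. This is a hedged stand-in, not the PR's code: the real check uses numpydoc.docscrape, while this sketch parses the raw docstring, and all names here (`docstring_param_diff`, `f_bad`) are hypothetical.

```python
# Hypothetical sketch: diff the ordered parameter names in a function's
# signature against those listed in its numpydoc-style "Parameters"
# section, so a mismatch is reported as a unified diff.
import difflib
import inspect

def docstring_param_diff(func):
    """Return a unified diff between docstring and signature parameters."""
    sig_params = [name for name in inspect.signature(func).parameters
                  if name not in ('self', 'cls')]
    doc_params, in_section = [], False
    for line in (inspect.getdoc(func) or '').splitlines():
        if line.strip() == 'Parameters':
            in_section = True
        elif in_section and ' : ' in line and not line.startswith(' '):
            # rough stand-in for numpydoc parsing: "name : type" lines
            doc_params.append(line.split(' : ', 1)[0].strip())
        elif in_section and line.strip() in ('Returns', 'Notes', 'Examples'):
            break
    return '\n'.join(difflib.unified_diff(doc_params, sig_params,
                                          'docstring', 'signature',
                                          lineterm=''))

def f_bad(alpha, beta):
    """Toy function whose docstring lists parameters out of order.

    Parameters
    ----------
    beta : float
        Second parameter.
    alpha : float
        First parameter.
    """

print(docstring_param_diff(f_bad))  # non-empty: beta/alpha are swapped
```

An empty diff would mean the docstring already matches the signature, so the same routine doubles as a pass/fail check.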
+1 but maybe for another PR once this or #7793 is reviewed and merged.
@ogrisel do you think it's reasonable to fix all docstrings in this PR? Otherwise we need to do the same kind of hack as for flake8.
    raise SkipTest(
        "numpydoc is required to test the docstrings")

from numpydoc import docscrape
The codecov browser extension tells me that this line is never executed by our CI servers. It seems that we need to add the numpydoc module to at least one Python 3.6 (and probably one Python 2.7) build job in Travis to properly cover this test code (and actually run the tests).
I think we can merge a first PR with the infrastructure to do the checks, as long as it's sufficiently tested. Speaking of which, it would be great to have a couple of unit tests to check the check_parameters_match function itself.
As @jnothman said, there are probably functions that are exceptions to the behavioral rules encoded in your test function (e.g. deprecated functions).
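One way to handle the exceptional cases mentioned above (deprecation, *args) is a predicate that tells the test which functions cannot be checked mechanically. This is an illustrative sketch only; the name `should_skip` and the `__deprecated__` attribute are assumptions, not part of the PR.

```python
# Hypothetical helper: decide whether a function must be skipped by the
# docstring-parameter check because it cannot be matched one-to-one.
import inspect

def should_skip(func):
    """Return True if func's parameters cannot be checked mechanically."""
    params = inspect.signature(func).parameters.values()
    # *args / **kwargs have no single documented name to match against
    has_var = any(p.kind in (inspect.Parameter.VAR_POSITIONAL,
                             inspect.Parameter.VAR_KEYWORD)
                  for p in params)
    # illustrative marker; a real implementation would detect sklearn's
    # @deprecated decorator instead
    is_deprecated = getattr(func, '__deprecated__', False)
    return has_var or is_deprecated

def f(*args, **kwargs):
    pass

print(should_skip(f))  # True: *args/**kwargs cannot be matched 1:1
```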
numpydoc? anyone wanna review #7355 ;) [currently it coughs up lots of errors but should still work]
@@ -1199,22 +1211,22 @@ def make_sparse_spd_matrix(dim=1, alpha=0.95, norm_diag=False,
        The probability that a coefficient is zero (see notes). Larger values
        enforce more sparsity.

    norm_diag : boolean, optional (default=False)
I guess we need to respect the order exactly?
yes
    '__neg__', '__hash__')

public_modules = [
    # the list of modules users need to access for all functionality
why not all? too much to do? leave for future PRs?
you want a huge PR to review? :)
sklearn.base should be there, no? Why not use pkgutil?
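The pkgutil suggestion would look roughly like this: enumerate submodules automatically instead of maintaining a hand-written list. A minimal sketch, demonstrated on the stdlib `json` package so it runs anywhere; applying it to sklearn is an assumption, not the PR's code.

```python
# Sketch: build the public_modules list with pkgutil instead of by hand,
# filtering out underscore-prefixed (private) submodules.
import pkgutil
import json

public_modules = [
    name for _, name, _ in pkgutil.walk_packages(json.__path__,
                                                 prefix='json.')
    if not name.split('.')[-1].startswith('_')
]
print(public_modules)  # e.g. ['json.decoder', 'json.encoder', ...]
```

For sklearn one would pass `sklearn.__path__` with `prefix='sklearn.'`, though some subpackages (tests, externals) would still need explicit exclusion.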
]


def check_parameters_match(func, doc=None):
This maybe should move to utils.testing? If it deserves its own tests, I don't think it should live in a test file.
This file currently contains three things from what I can see: a function to check docstrings, tests for this function, and tests that run this function on sklearn. I feel they should live in two or three different files since they are conceptually very separate.
def test_check_parameters_match():
    check_parameters_match(f_ok)
    assert_raise_message(RuntimeError, 'Unknown section Results',
Nitpick: it would be better to have: 'Unknown section: Results'
Or even:
'Invalid section: Results'
this error message comes from numpydoc, not from me
refactoring done
@@ -22,7 +22,7 @@ matrix:
    # This environment tests that scikit-learn can be built against
    # versions of numpy, scipy with ATLAS that comes with Ubuntu Trusty 14.04
    - env: DISTRIB="ubuntu" PYTHON_VERSION="2.7" CYTHON_VERSION="0.23.4"
-      COVERAGE=true
+      COVERAGE=true TEST_DOCSTRINGS="false"
No need to set TEST_DOCSTRINGS if you don't want to test the docstrings (similar to what we do with COVERAGE)
Update with master and fix merge conflicts
Closing this in favor of #9206
Reference Issue
Fixes #7758
What does this implement/fix? Explain your changes.
This adds a unit test to check that all parameters are documented
and in the right order.
Any other comments?
For now I have only fixed the datasets module, which I think is the least controversial.
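The skip-when-unavailable pattern visible in the diff snippets above could be wired up as follows. This is a hedged sketch of the infrastructure, not the PR's actual test file; the function name `test_docstring_parameters` is illustrative.

```python
# Sketch: make the docstring test a no-op on CI jobs that do not install
# numpydoc (cf. the TEST_DOCSTRINGS environment variable in .travis.yml).
from unittest import SkipTest

try:
    from numpydoc import docscrape  # noqa: F401
    HAVE_NUMPYDOC = True
except ImportError:
    HAVE_NUMPYDOC = False

def test_docstring_parameters():
    if not HAVE_NUMPYDOC:
        raise SkipTest("numpydoc is required to test the docstrings")
    # ... iterate over public_modules and run check_parameters_match
    # on every public function and method ...
```

Jobs that set TEST_DOCSTRINGS would install numpydoc and exercise the real checks; every other job skips cheaply at import time.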