
Prototypical docscraping using numpydoc #3859


Closed
wants to merge 1 commit into from


lennart0901
Contributor

This is a prototype for adding numpydoc format docstring parsing for the property tables.

@lennart0901
Contributor Author

What do you think about a primitive extraction using the following regex:

"\n\s*%s\s*: (.+)" % first_param_in_signature

It would match typical parameter docstrings in numpydoc format and also in Google style, provided the first line contains only a short description/type.
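
A minimal sketch of the extraction proposed above, assuming the first parameter name comes from `inspect.signature` and that `self` should be skipped. The helper name `first_param_description` and the example setter are hypothetical, not part of the PR.

```python
import inspect
import re


def first_param_description(func):
    """Return the text after ``<first_param> :`` in func's docstring, or None."""
    params = list(inspect.signature(func).parameters)
    # Skip ``self`` so that plain (unbound) setter methods work too.
    if params and params[0] == "self":
        params = params[1:]
    doc = inspect.getdoc(func)
    if not params or not doc:
        return None
    # Same pattern as proposed above, with the name escaped for safety.
    pattern = r"\n\s*%s\s*: (.+)" % re.escape(params[0])
    match = re.search(pattern, doc)
    return match.group(1).strip() if match else None


def set_alpha(self, alpha):
    """
    Set the alpha value used for blending.

    Parameters
    ----------
    alpha : float
        The transparency, between 0 and 1.
    """


print(first_param_description(set_alpha))  # float
```

The pattern only captures the remainder of the line on which the parameter is declared, so anything on the indented description lines is ignored.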

@WeatherGod
Member

Ok, this is getting insane. It is bad enough that we do the whole property
aliasing thing based off of the docstrings and we all agree that it is a
bad idea. Why do we want to expand on it? Can't we flip this around and
supply the docstring with the list of accepted forms?

On Fri, Nov 28, 2014 at 1:17 PM, Lennart Fricke notifications@github.com
wrote:

What do you think about a primitive extraction using the following regex:

"\n\s*%s\s*: (.+)" % first_param_in_signature

It would match typical docstrings for parameters in numpydoc format and
also google style, if the first line only contains a short description/type



@tacaswell
Member

That requires enough introspection to know what the first parameter is? I suspect we are going to end up implementing/copying over a hacked down version of numpydoc which just does the Parameters section.

I wonder if this is an application for pyparsing (which we already have as a dependency), but that might be a quagmire.

@tacaswell
Member

@WeatherGod I would not call it expanding, just trying to not break what already exists while updating docstrings to numpydoc format.

@WeatherGod
Member

sigh... yeah, I see what you mean. So, I would think that using pyparsing
would be better as we can use it to paper over the immediate problem,
buying us time to get rid of the whole doc-scraping idea, all the while we
don't add any new dependencies for what I would hope would be a temporary
fix.

On Fri, Nov 28, 2014 at 1:41 PM, Thomas A Caswell notifications@github.com
wrote:

@WeatherGod https://github.com/WeatherGod I would not call it
expanding, just trying to not break what already exists while updating
docstrings to numpydoc format.



@lennart0901
Contributor Author

@tacaswell If you introspect the function's first parameter using inspect, you know its name and can check for it in some rst definition style, as suggested by the regex. Parsing the Parameters section does not tell you whether the parameters are in the right order.

@WeatherGod Would adding the accepted values and/or alias as a method attribute be an idea?
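
One way the method-attribute idea could look, as a rough sketch: a decorator stores the accepted values and aliases on the setter, and a table builder reads them back. The names `accepts`, `aliases`, `property_table` and the `Line` class are hypothetical and only illustrate the shape of the approach.

```python
def accepts(description, aliases=()):
    """Hypothetical decorator: store table metadata on the setter itself."""
    def decorator(func):
        func.accepts = description
        func.aliases = tuple(aliases)
        return func
    return decorator


class Line:
    @accepts("float (0.0 transparent through 1.0 opaque)")
    def set_alpha(self, alpha):
        self._alpha = alpha

    @accepts("float value in points", aliases=("lw",))
    def set_linewidth(self, width):
        self._linewidth = width


def property_table(cls):
    """Collect (property name, accepted values) rows from decorated setters."""
    rows = []
    for name, member in sorted(vars(cls).items()):
        if name.startswith("set_") and hasattr(member, "accepts"):
            rows.append((name[len("set_"):], member.accepts))
    return rows


for prop, accepted in property_table(Line):
    print("%s: %s" % (prop, accepted))
```

With this layout the table no longer depends on how the docstring is formatted, so the docstrings could move to numpydoc style without touching the table generator.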

@lennart0901
Contributor Author

For me, a remaining issue is that some setter methods accept more than one parameter.
There is no way to describe that in the tables using numpydoc format at all. See e.g. sketch_parameters.

@WeatherGod
Member

method attributes? intriguing thought. I would prefer if it was the same
list everywhere and we just have a list somewhere that can be used. This is
how those property lists are added to many of the docstrings right now. But
method attributes might be good enough for the immediate term.

On Fri, Nov 28, 2014 at 1:47 PM, Lennart Fricke notifications@github.com
wrote:

@tacaswell https://github.com/tacaswell If you introspect the function's
first parameter using inspect, you know its name and can check for it in
some rst definition style, as suggested by the regex. Parsing the
Parameters section does not tell you whether the parameters are in the
right order.

@WeatherGod https://github.com/WeatherGod Would adding the accepted
values and/or alias as a method attribute be an idea?



@lennart0901
Contributor Author

Do you mean the docstring.Substitutions thing? Or what is meant by property lists? I mostly know those created by ArtistInspector using docscraping.

@lennart0901
Contributor Author

It seems that using numpydoc is mostly ruled out. Shall I close the PR then, or do we leave it open to track the issue?

@WeatherGod
Member

Yes, the docstring.Substitutions. The property lists (probably should have
said "tables") come from there as well. It is all just string formatting of
introspected information.

I would leave this open for the moment in case a better solution can't be
found.

On Fri, Nov 28, 2014 at 2:01 PM, Lennart Fricke notifications@github.com
wrote:

It seems that using numpydoc is mostly ruled out. Shall I close the PR then,
or do we leave it open to track the issue?



@jenshnielsen
Member

On a related note, there is an issue with the scraping of set_boxstyle in patches.py, as seen in the doc builds:

/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/numpydoc/docscrape.py:120: UserWarning: Unknown section Accepts:
  warn("Unknown section %s" % key)

Happens because the accepts line is:

ACCEPTS: %(AvailableBoxstyles)s

Which inserts a complete pretty printed table as far as I can see.

It also results in the ACCEPTS being missing from http://matplotlib.org/api/patches_api.html
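
To illustrate the effect described above, here is a small sketch of what the `%(AvailableBoxstyles)s` interpolation might produce. The table contents are invented for the example, and the single-line `ACCEPTS: (.*)` pattern is an assumption about how a line-based scraper would read the result, not a copy of the project's actual scraping code.

```python
import re

# Hypothetical stand-in for the pretty-printed table of box styles.
table = "\n".join([
    "==========  ==========",
    "Class       Name",
    "==========  ==========",
    "Square      square",
    "Round       round",
    "==========  ==========",
])

# The ACCEPTS line is filled in by ordinary %-style string formatting.
doc = "Set the box style.\n\nACCEPTS: %(AvailableBoxstyles)s" % {
    "AvailableBoxstyles": table}

# A single-line pattern only sees the first row of the inserted table.
match = re.search(r"ACCEPTS: (.*)", doc)
print(match.group(1))
print(len(doc.splitlines()))
```

After interpolation the ACCEPTS entry spans several lines, so any tool that expects a one-line value sees only the table's first border row.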

@WeatherGod
Member

Just more evidence that doc-scraping is a bad idea.

On Sat, Nov 29, 2014 at 5:58 AM, Jens Hedegaard Nielsen <
notifications@github.com> wrote:

On a related note, there is an issue with the scraping of set_boxstyle in
patches.py, as seen in the doc builds:

/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/numpydoc/docscrape.py:120: UserWarning: Unknown section Accepts:
  warn("Unknown section %s" % key)

Happens because the accepts line is:

ACCEPTS: %(AvailableBoxstyles)s

Which inserts a complete pretty printed table as far as I can see.



@tacaswell
Member

In defense of docstring scraping, it does provide a single point of information.

That said, having a dictionary of docstrings that everything refers back to is probably a better path, something like

doc_string_dict = {'set_color':{'full_doc': '....', 'table_row': '...'}, ... }

We can then assign the docstrings as needed and probably write a smarter table generator.
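
A rough sketch of how such a dictionary could drive both the docstrings and the table. The entries, the `Artist` class, and the helpers `assign_docstrings` and `make_table` are hypothetical and only show the direction of the data flow.

```python
# Hypothetical entries; the real text would be maintained in one place.
doc_string_dict = {
    "set_color": {
        "full_doc": "Set the color of the artist.",
        "table_row": "any matplotlib color",
    },
    "set_alpha": {
        "full_doc": "Set the alpha value used for blending.",
        "table_row": "float (0.0 transparent through 1.0 opaque)",
    },
}


class Artist:
    def set_color(self, color):
        self._color = color

    def set_alpha(self, alpha):
        self._alpha = alpha


def assign_docstrings(cls, docs):
    """Copy ``full_doc`` onto each matching method defined on ``cls``."""
    for name, entry in docs.items():
        method = vars(cls).get(name)
        if method is not None:
            method.__doc__ = entry["full_doc"]
    return cls


def make_table(cls, docs):
    """Build one ``property: accepted values`` line per documented setter."""
    return ["%s: %s" % (name[len("set_"):], docs[name]["table_row"])
            for name in sorted(docs) if name in vars(cls)]


assign_docstrings(Artist, doc_string_dict)
print("\n".join(make_table(Artist, doc_string_dict)))
```

Here the dictionary is the source of truth and the docstrings are derived from it, which is the reverse of scraping the table rows out of the docstrings.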

@tacaswell
Member

I am going to close this in favor of going the other way: hanging machine-readable information off of functions/objects and using that as the authoritative source to construct the docstrings and the kwarg tables and to do validation, not the other way around.

Please don't be discouraged from contributing in the future; this is not a bad idea (we are doing something like this in my day job), just not the right solution in this case.


4 participants