Prototypical docscraping using numpydoc #3859

lennart0901 · 2014-11-28T15:27:58Z

This is a prototype for adding numpydoc format docstring parsing for the property tables.

lennart0901 · 2014-11-28T18:17:37Z

What do you think about a primitive extraction using the following regex:

"\n\s*%s\s*: (.+)" % first_param_in_signature

It would match typical parameter docstrings in numpydoc format, and also in Google style, provided the first line contains only a short description/type.
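A minimal sketch of that idea, assuming a modern Python 3 where `inspect.signature` is available. The helper name `first_param_description` and the `Demo` class are made up for illustration and are not part of the PR:

```python
import inspect
import re


def first_param_description(func):
    """Return the text after '<first_param> :' in func's docstring, or None."""
    doc = inspect.getdoc(func)
    if not doc:
        return None
    # Skip 'self' so the sketch also works for setter methods taken from a class.
    names = [n for n in inspect.signature(func).parameters if n != "self"]
    if not names:
        return None
    # Same pattern as proposed above, with the parameter name escaped.
    pattern = r"\n\s*%s\s*: (.+)" % re.escape(names[0])
    match = re.search(pattern, doc)
    return match.group(1).strip() if match else None


class Demo:
    def set_alpha(self, alpha):
        """
        Set the transparency.

        Parameters
        ----------
        alpha : float in [0, 1]
            Blending value.
        """

    def set_beta(self, beta):
        """Set beta.

        Args:
            beta: any positive number
        """


print(first_param_description(Demo.set_alpha))
print(first_param_description(Demo.set_beta))
```

The pattern only captures the rest of the line that names the parameter, so any continuation lines are ignored.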

WeatherGod · 2014-11-28T18:39:48Z

Ok, this is getting insane. It is bad enough that we do the whole property
aliasing thing based off of the docstrings and we all agree that it is a
bad idea. Why do we want to expand on it? Can't we flip this around and
supply the docstring with the list of accepted forms?

tacaswell · 2014-11-28T18:39:49Z

That requires enough introspection to know what the first parameter is? I suspect we are going to end up implementing/copying over a hacked down version of numpydoc which just does the Parameters section.

I wonder if this is an application for pyparsing (which we already have as a dependency), but that might be a quagmire.

tacaswell · 2014-11-28T18:41:19Z

@WeatherGod I would not call it expanding, just trying to not break what already exists while updating docstrings to numpydoc format.

WeatherGod · 2014-11-28T18:46:53Z

sigh... yeah, I see what you mean. So, I would think that using pyparsing
would be better as we can use it to paper over the immediate problem,
buying us time to get rid of the whole doc-scraping idea, all the while we
don't add any new dependencies for what I would hope would be a temporary
fix.

lennart0901 · 2014-11-28T18:47:35Z

@tacaswell If you introspect the function using inspect, you know the name of the first parameter and can check for it in some rst definition style, as suggested by the regex. Parsing the Parameters section does not tell you whether it is in the right order.

@WeatherGod Would adding the accepted values and/or alias as a method attribute be an idea?
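A rough sketch of what such method attributes could look like. The `accepts` decorator, the `aliases` attribute, the `Line` class, and `property_rows` are all hypothetical names chosen for illustration, not existing matplotlib API:

```python
def accepts(values, aliases=()):
    """Hypothetical decorator: attach accepted values and aliases to a setter."""
    def decorator(func):
        func.accepts = values
        func.aliases = tuple(aliases)
        return func
    return decorator


class Line:
    @accepts("float in [0, 1]")
    def set_alpha(self, alpha):
        self._alpha = alpha

    @accepts("any matplotlib color", aliases=("c",))
    def set_color(self, color):
        self._color = color


def property_rows(cls):
    """Collect (property, accepted values, aliases) from annotated setters."""
    rows = []
    for name in sorted(vars(cls)):
        func = vars(cls)[name]
        if name.startswith("set_") and hasattr(func, "accepts"):
            rows.append((name[4:], func.accepts, func.aliases))
    return rows


for row in property_rows(Line):
    print(row)
```

With this layout the table generator reads attributes instead of parsing docstrings, so the docstring format can change without affecting the tables.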

lennart0901 · 2014-11-28T18:53:28Z

For me, a remaining issue is that some setter methods accept more than one parameter.
There is no way to explain that for the tables using numpydoc format at all. See e.g. sketch_parameters.

WeatherGod · 2014-11-28T18:54:25Z

method attributes? intriguing thought. I would prefer if it was the same
list everywhere and we just have a list somewhere that can be used. This is
how those property lists are added to many of the docstrings right now. But
method attributes might be good enough for the immediate term.

lennart0901 · 2014-11-28T18:59:50Z

Do you mean the docstring.Substitutions thing? Or what is meant by property lists? I mostly know those created by ArtistInspector using docscraping.

lennart0901 · 2014-11-28T19:01:07Z

It seems that using numpydoc is mostly ruled out. Shall I close the PR, or do we leave it open to track the issue?

WeatherGod · 2014-11-28T20:05:56Z

Yes, the docstring.Substitutions. The property lists (probably should have
said "tables") come from there as well. It is all just string formatting of
introspected information.

I would leave this open for the moment in case a better solution can't be
found.

jenshnielsen · 2014-11-29T10:58:37Z

On a related note, there is an issue with the scraping of set_boxstyle in patches.py, as seen in the doc builds:

/home/travis/virtualenv/python2.7.8/lib/python2.7/site-packages/numpydoc/docscrape.py:120: UserWarning: Unknown section Accepts:
  warn("Unknown section %s" % key)

Happens because the accepts line is:

ACCEPTS: %(AvailableBoxstyles)s

Which inserts a complete pretty printed table as far as I can see.

It also results in the ACCEPTS being missing from http://matplotlib.org/api/patches_api.html
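To illustrate the substitution step only: the snippet below uses a made-up table and a made-up one-line scraper, so it shows how a `%(name)s` placeholder can expand into multi-line text, not the exact mechanism behind the numpydoc warning.

```python
import re

# Hypothetical stand-in for the pretty-printed table; the real content differs.
table = "\n".join([
    "==========  =========",
    "Class       Name",
    "==========  =========",
    "Square      square",
    "Round       round",
    "==========  =========",
])

doc_template = "Set the box style.\n\nACCEPTS: %(AvailableBoxstyles)s\n"
doc = doc_template % {"AvailableBoxstyles": table}

# A naive one-line scraper only sees the first row of the expanded table.
match = re.search(r"ACCEPTS:(.*)", doc)
scraped = match.group(1).strip()

print(doc)
print(repr(scraped))
```

After expansion, the ACCEPTS line is followed by rows of `=` characters, which is the kind of input a line-oriented parser can easily misread.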

WeatherGod · 2014-11-29T14:20:32Z

Just more evidence that doc-scraping is a bad idea.

tacaswell · 2014-11-29T15:37:33Z

In defense of docstring scraping, it does provide a single point of information.

That said, having a dictionary of docstrings that everything refers back to is probably a better path, something like

doc_string_dict = {'set_color':{'full_doc': '....', 'table_row': '...'}, ... }

We can then assign the docstrings as needed and probably write a smarter table generator.
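A small sketch of how that might fit together. The dictionary entries, the `Artist` stand-in class, and the helpers `assign_docs` and `make_table` are assumptions made for illustration:

```python
doc_string_dict = {
    "set_color": {
        "full_doc": "Set the color.\n\nParameters\n----------\ncolor : any matplotlib color\n",
        "table_row": "any matplotlib color",
    },
    "set_alpha": {
        "full_doc": "Set the transparency.\n\nParameters\n----------\nalpha : float in [0, 1]\n",
        "table_row": "float in [0, 1]",
    },
}


class Artist:
    """Stand-in class; the setters start out without docstrings."""

    def set_color(self, color):
        self._color = color

    def set_alpha(self, alpha):
        self._alpha = alpha


def assign_docs(cls, docs):
    """Copy each 'full_doc' entry onto the method of the same name."""
    for name, entry in docs.items():
        func = vars(cls).get(name)
        if func is not None:
            func.__doc__ = entry["full_doc"]


def make_table(docs):
    """Build a plain two-column table from the 'table_row' entries."""
    rows = [
        (name[len("set_"):], entry["table_row"])
        for name, entry in sorted(docs.items())
        if name.startswith("set_")
    ]
    width = max(len(prop) for prop, _ in rows)
    return "\n".join("%s  %s" % (prop.ljust(width), desc) for prop, desc in rows)


assign_docs(Artist, doc_string_dict)
print(make_table(doc_string_dict))
```

Both the docstrings and the table are derived from the same dictionary, so neither has to be parsed to produce the other.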

tacaswell · 2014-12-31T22:00:09Z

I am going to close this in favor of going the other way: hanging machine-readable information off of functions/objects and using that as the authoritative source to construct the docstrings and the kwarg tables and to do validation, not the other way around.

Please don't be discouraged from contributing in the future. This is not a bad idea (we are doing something like this at my day job), just not the right solution in this case.

Add very fault tolerant docscraping using numpydoc in ArtistInspector

e1e5fd3

tacaswell added the status: needs review label Nov 28, 2014

lennart0901 mentioned this pull request Nov 28, 2014

Allow both linestyle definition "accents" and dash-patterns as linestyle #3772

Merged

tacaswell added status: needs revision and removed status: needs review labels Nov 28, 2014

tacaswell closed this Dec 31, 2014

tacaswell removed the status: needs revision label Dec 31, 2014
