Skip to content

API: restructure and document numpy.random C-API #14604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Nov 19, 2019

Conversation

mattip
Copy link
Member

@mattip mattip commented Sep 27, 2019

Finish the C-API side of the new numpy.random API

@mattip
Copy link
Member Author

mattip commented Sep 27, 2019

Still a WIP. Need to decide how much of the *.pxd/*.h/*.so to expose (and commit to supporting, since exposing it means it is now publc). For instance, we currently do not build a distributions pxd file c-extension module so it cannot be used in cimport from cython, and it seems the src/aligned_malloc files do not really belong to random.

Edit: pxd -> c-extension module

@mattip
Copy link
Member Author

mattip commented Sep 27, 2019

xref gh-5312 for moving the aligned allocations into numpy itself

@mattip
Copy link
Member Author

mattip commented Oct 18, 2019

Sorry this is a bit long for a PR comment. Should it be an issue?

Line 75 says "An example ... is in the examples folder" which is part of the sources under numpy.random but not distributed in the wheel.

It contains examples of using the BitGenerator interfaces next_double and next_uint32 and the Generator from cython. To support this via cimport, we should ship a _bit_generator.pxd (which we already do) and a _generator.pxd (TBD). Can we remove the use of PyCapsule since we can directly cimport the BitGenerators?

I am not a numba expert. One of the examples is very close to the one in the exending.rst document, so I think is covered. The other in that directory seems to be a cffi example, not a numba one. It is simple enough we could put it in the extending.rst document.

TBD:

  • add a _generator.pxd file for cython
    • with or without an accompanying distributions.h file? If with then where should it live?
  • add a cffi example to extending.rst
  • change the examples to not use PyCapsule (is this possible and advisable?)

Are we answering the user stories in these examples?

One of the mailing list comments linked to the scipy.special documentation, should we try to create a page like that?

User stories - we should try to relate to these (from the mailing list thread)

Someone asked how to use random in a ufunc.

"An ideal API would allow projects like https://github.com/deepmind/torch-randomkit/tree/master/randomkit or numba to consume the code in NumPy without vendoring it."

"There are c++ applications which use boost::random, would be nice to be able to swap it for numpy.random."

"Using the existing distributions from Cython was a requested feature and an explicit goal, yes. There are users waiting for this." (mattip: but isn't this supported via Generator?)

"Numba would definitely appreciate C functions to access the random distribution implementations, and have a side-project (numba-scipy) that is making the Cython wrapped functions in SciPy visible to Numba" (mattip: I think we do this after gh-14608 from _generator.so via these cdef extern declarations, but that requires the H file.

Maybe someone who wants to write a new BitGenerator, i.e., the numpy/bitgenerator repo without needing to have the numpy code as a git submodule

Comment on lines 129 to 131
def normals_zig(Py_ssize_t n):

random_values[i] = random_gauss_zig(rng)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the _zig is no longer in the name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole block should be included via a .. literalinclude directive from the appropriate example file

@rgommers
Copy link
Member

Sorry this is a bit long for a PR comment. Should it be an issue?

Might indeed make sense to merge this quickly (there's only some doc fixes) here that improve the current state of master), and then use a new issue for the rest.

An example ... is in the examples folder

Good point. I'd suggest moving the examples into the documentation, shipping them in the tarball and including them in the rendered docs as well.

One of the mailing list comments linked to the scipy.special documentation, should we try to create a page like that?

I think so. It's no different than any other public function in the Python and C API: every function and object should be listed.

User stories - we should try to relate to these

Nice collection. Yes, those would make good examples/subsections in the user guide section on this topic.

@@ -19,7 +19,7 @@ The included BitGenerators are:
and can be advanced by an arbitrary amount. See the documentation for
:meth:`~.PCG64.advance`. PCG-64 has a period of :math:`2^{128}`. See the `PCG
author's page`_ for more details about this class of PRNG.
* MT19937 - The standard Python BitGenerator. Adds a `~mt19937.MT19937.jumped`
* MT19937 - The standard Python BitGenerator. Adds a `MT19937.jumped`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after gh-14699, I could see this bad ref that I missed in gh-14608. Not really part of this PR, but doesn't justify its own PR.

@@ -32,7 +32,7 @@ instance's methods are imported into the numpy.random namespace, see
Quick Start
-----------

By default, `~Generator` uses bits provided by `~pcg64.PCG64` which
By default, `~Generator` uses bits provided by `PCG64` which
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment elsewhere about bad refs from #14608

@mattip
Copy link
Member Author

mattip commented Oct 20, 2019

The cython code does not compile, it needs include/bitgen.h. Now we need to decide how (and if) to add these include files, or if there is a way to work around it

@mattip mattip force-pushed the random-c-api branch 4 times, most recently from 91819d8 to 766615b Compare October 23, 2019 13:26
@mattip
Copy link
Member Author

mattip commented Oct 23, 2019

Something is fishy with macos Azure runs. It is printing:

Installed /Users/runner/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/site-packages/numpy-1.18.0.dev0+20c8d7c-py3.6-macosx-10.14-x86_64.egg
Processing dependencies for numpy==1.18.0.dev0+20c8d7c
Searching for numpy==1.18.0.dev0+20c8d7c
Reading https://pypi.org/simple/numpy/
No local packages or working download links found for numpy==1.18.0.dev0+20c8d7c
Running from numpy source directory.
/Users/runner/hostedtoolcache/Python/3.6.9/x64/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'define_macros'
  warnings.warn(msg)
error: Could not find suitable distribution for Requirement.parse('numpy==1.18.0.dev0+20c8d7c')

I think this is the result of python setup.py build -j 4 build_src --verbose-cfg install, but when I run that locally on python3.6, ubuntu 18.04 it runs to completion.

Anyone with MacOS care to try to duplicate it? @tylerjereddy any ideas?

@mattip
Copy link
Member Author

mattip commented Oct 23, 2019

In 37bbe58 I added include/bitgen.h and include/distributions.h. Thoughts?

@mattip
Copy link
Member Author

mattip commented Oct 27, 2019

rebased off master to resolve conflicts after merge of #13794. Also:

  • moved public random headers to numpy/core/include/numpy/ranom
  • add first draft of random C-API public methods, rendered here

@mattip mattip changed the title WIP, API: restructure and document numpy.random C-API API: restructure and document numpy.random C-API Oct 27, 2019
@mattip
Copy link
Member Author

mattip commented Oct 27, 2019

I removed the WIP from the title. The example works, we have the new c-api documentation (link in the comment above) and examples both in documentation and in tests. Reviews would be welcome. More examples and documentation will be generated in gh-14778


.. currentmodule:: numpy.random

Typed versions of many of the `Generator` and `BitGenerator` can be accessed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. What does it mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing fragment. It should read "Typed versions of many of the methods and functions in Generator and BitGenerator can be accessed ...
Fixing

Typed versions of many of the `Generator` and `BitGenerator` can be accessed
directly from Cython: the complete list is given below.

The ``_bit_generator`` module is usable via::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "is available".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, and again below

@mattip
Copy link
Member Author

mattip commented Nov 14, 2019

Rebased to clear a merge confilict. @rkern @bashtage could you review this?

@charris charris added this to the 1.18.0 release milestone Nov 19, 2019
@charris charris merged commit d6ecf67 into numpy:master Nov 19, 2019
@charris
Copy link
Member

charris commented Nov 19, 2019

Thanks Matti. I'm not convinced that the examples are in the right place but they can probably be moved later without problems.

@charris
Copy link
Member

charris commented Nov 19, 2019

The examples might better be included somewhere in the /doc directory, the current location might be a development artifact. @rgommers Any thoughts on the examples?

@mattip
Copy link
Member Author

mattip commented Nov 19, 2019

I put the examples there so they could be tested. Pieces of them are included in the docs

@rgommers
Copy link
Member

I put the examples there so they could be tested. Pieces of them are included in the docs

That sounds like a good idea actually. Please move them to _examples though. There's a reason I added a test for unwanted public submodule API changes recently, you should never have to add to the whitelist in that test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants