API: provide examples of use of numpy random API via user stories #14778
Comments
No. In the context of the discussion this statement was drawn from, this was referring to the functions in |
@rkern the way the code is currently structured in #14604 we do not ship |
Coming from #14951, I think the C functions are a lot of the value add of a low-level interface. While some are pretty trivial, many are not. I don't think it adds much to maintenance since these functions are all used by Did you change the export of the distributions in #14604? I couldn't figure out where I might find them. |
I removed unused functions and renamed some others more consistently (mainly
The functions are exported from the _generator c-extension shared object, so the issue is how best to access them from C/Cython/CFFI. xref #14954 (the first commit there shows there is a problem on windows) |
The Windows .pyd files only export a single symbol, So that leaves Cython. I've always compiled the C source directly into Cython modules. Is there a reasonable alternative? Most examples that I found searching for Cython dll assume that you have the source of the DLL code, and don't actually rely on the DLL when the Cython extension module is built. |
The numpy code is only a click away. Compiling the code directly, rather than calling into a DLL, has the added advantage that it not only avoids runtime overhead but also allows for compile-time optimizations. |
We need to expose the functions in the DLL/SO so that numba can use them. They have made an unsupported hack into the |
The functions in |
I'm only reacting to @bashtage's comment: "The Windows .pyd files only export a single symbol, PyInit_generator. If this is fixable, then the numba examples will work well." |
The export on windows was a bug that is fixed in gh-14954 |
I missed that change in the dll export. Seems like a good solution.
|
Since the last open PR was merged, this is probably the best issue for questions/comments. So here are some:
|
Next raw is the next number using the natural number of bits the bit generator uses. It can be 32, 53, 63 or 64 in the range of generators I've considered. It is used to directly test the correctness against a reference implementation. |
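A minimal sketch of what a raw-output comparison looks like from Python, assuming the `random_raw` method of the bit generator classes is the Python-level counterpart of next raw. The seed and sample sizes are arbitrary, and the second generator stands in for a reference implementation:

```python
import numpy as np

# Two bit generators seeded identically must produce the same raw stream,
# which is what a comparison against a reference implementation relies on.
a = np.random.PCG64(12345)
b = np.random.PCG64(12345)
raw_a = a.random_raw(5)
raw_b = b.random_raw(5)
same_stream = bool(np.array_equal(raw_a, raw_b))

# MT19937 natively produces 32 bits per draw, so its raw values fit in
# 32 bits even though they are returned in a 64-bit unsigned array.
mt_raw = np.random.MT19937(12345).random_raw(1000)
mt_fits_32_bits = bool(mt_raw.max() < 2**32)

print(raw_a.dtype, same_stream, mt_fits_32_bits)
```

In a real correctness test, `raw_b` would come from the reference C implementation rather than from a second NumPy object.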
Most distributions don't have _standard versions. For example, there isn't a standard lognormal, or a standard chisquare, etc. Some of the standard functions are misnomers imo. For example, standard_t should really be just t, or perhaps Student's t. It isn't actually standard in any sense that is different from the definition of a t. Ditto for standard cauchy. |
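To make the naming point concrete, here is a small sketch using the Python-level `Generator` methods. It assumes that `normal` is computed as `loc + scale * z` from the same underlying draws as `standard_normal`, which is my understanding of the implementation rather than a documented guarantee. The seed and parameters are arbitrary:

```python
import numpy as np

seed = 2019  # arbitrary

# standard_normal fixes loc=0 and scale=1; normal exposes both parameters.
z = np.random.Generator(np.random.PCG64(seed)).standard_normal(4)
x = np.random.Generator(np.random.PCG64(seed)).normal(loc=3.0, scale=2.0, size=4)

# Assumption: normal() is a shifted and scaled standard_normal() draw.
normal_is_shifted_scaled = bool(np.allclose(x, 3.0 + 2.0 * z))

# standard_t and standard_cauchy take no loc/scale arguments, so the
# "standard" prefix does not distinguish them from a more general variant.
rng = np.random.Generator(np.random.PCG64(seed))
t = rng.standard_t(df=5, size=4)
c = rng.standard_cauchy(size=4)

print(normal_is_shifted_scaled, t.shape, c.shape)
```

Here "standard" is meaningful for the normal, where a parametrized counterpart exists, but not for the t or the Cauchy.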
CFFI is pretty easy to use if you want a single function. The examples extending to numba are pretty simple. Matti's example is pretty complete, which requires parsing the header file and expanding or replacing macros. |
Yes, that's what I was commenting on. Whether something "is standard" is in the eye of the beholder I'd say. This is basically a default
Now would be the time to rename them. Once we ship 1.18.0 it becomes a lot harder. |
I'm not sure I understand this comment. At the moment there are ~50 lines of boilerplate. Why couldn't we simply put that in a function so not every user has to copy-paste that every time she wants to use |
Most of the I think I named |
And I have no explanation for |
Right indeed. For the names that match Python it is what it is, I missed that. 1.18.0 only matters for the new names of functions that don't have a Python equivalent. |
For Cython, we could somehow extend the |
The top half of that document is Cython-specific, the bottom half builds on it to document the available C functions. Perhaps it could be divided into two pages |
We should decide which of the points @rgommers raised are 1.18 blockers and break them out into separate issues to be resolved in the next couple of days. |
This is standard stuff for CFFI. All the C wrapper libraries have a boilerplate problem: ctypes must define |
Should the C functions correct this? They could become random_student_t (or students_t) and random_cauchy. |
Sorry, on the phone. It should have said: CFFI is pretty easy if you only want to use one or a few functions. See which is no longer in master (and isn't quite right, but easy to fix). There is no need for tons of boilerplate. TBH, I'm not sure why @mattip got rid of this simple example, which is basic but follows the style of pretty much every CFFI example on the internet. Of course, NumPy could provide the components needed for CFFI interfacing, but this seems to me like committing to support something that may not be worth the overhead, since downstream might want to do things differently. |
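For the "one or a few functions" case, a minimal sketch of calling a bit generator's C function directly is below. It uses the `ctypes` interface of the bit generator rather than CFFI, purely so that it runs without an extra dependency; the `cffi` attribute is used the same way as far as I know. `draw_doubles` is a hypothetical helper, not part of NumPy:

```python
import numpy as np

# Hypothetical helper (not part of NumPy): draw n doubles by calling the
# bit generator's C next_double function through its ctypes interface.
def draw_doubles(bit_generator, n):
    iface = bit_generator.ctypes     # namedtuple of function pointers and state
    next_double = iface.next_double  # C signature: double (*)(void *state)
    state = iface.state              # void * handle to the generator state
    # bit_generator stays referenced for the whole loop, so the state
    # pointer remains valid while the C function is called.
    return [next_double(state) for _ in range(n)]

first = draw_doubles(np.random.PCG64(42), 3)
second = draw_doubles(np.random.PCG64(42), 3)
print(first == second)
```

This covers only the bit generator functions. Reaching the distribution functions this way still needs their declarations, which is the boilerplate under discussion.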
As for 1.18, I think only naming is essential, since changing it later is harder. I don't think it really is essential to rename some of the functions, but my vote would be standard_t -> student_t. The one feature that I would really like, and I think is needed to complete the story, is to have something like npymath.lib for the C functions, perhaps npyrandom.lib. This would allow C/Cython to link to these functions without needing to get the C source from somewhere. For me, this would allow what will be a slimmed-down version of |
Must be a glitch in the way you are looking at the repo. The file is there, it is referenced in the docs but not tested since it is complicated to run.
The example I provided shows how to use all the functions available in
Then we would add |
Never mind, that is wrong. Changing the names as suggested in gh-15007 |
The Generator.standard_exponential method has a |
Done in #15007 (pending review and passing tests) |
The reference page: Cython API for random implies that it is possible to use a long list of functions "random_..." from Cython and points to Extending for examples of using these functions. However those examples only use the methods associated to bitgen_t. I am unable to cimport any of those functions.
Perhaps "_generator.pxy" is required but is not present. I am able to generate random numbers from bitgen_t.next_double, following the Extending example above. Would it be possible to clarify the documentation by including an example? Or have I misunderstood and those functions should be private, in which case could this be clarified?
NumPy/Python version information: 1.18.1 / 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] |
It seems the Cython use case was indeed overlooked. There is an example using CFFI, but none for Cython. |
…API (gh-15463)
xref gh-14778
As pointed out in the comment by @jamesthomasgriffin, we did not include a pxd file to expose the distribution functions documented in the random c-api. This PR adds a c_distributions.pxd file that exposes them.
Squashed commits:
* BUG: add missing c_distributions.pxd to enable cython use of random C-API
* ENH, TST: add npyrandom library like npymath, test cython use of it
* BUG: actually prefix f-string with f
* MAINT: fixes from review, add _bit_generato_bit_generator.pxd
* STY: fixes from review
* BLD: don't use nprandom library for mtrand legacy build
* TST: WindowsPath cannot be used in subprocess's list2cmdline
* MAINT, API: move _bit_generator to bit_generator
* DOC: add release note about moving bit_generator
* DOC, MAINT: fixes from review
* MAINT: redo dtype determination from review
Gathered from gh-14604, gh-14517 and the discussions.
"To summarize I need to draw random ints of a given C type from continually changing ranges, either one-by-one or small batch-by-small batch."
Someone asked how to use random in a ufunc.
"An ideal API would allow projects like https://github.com/deepmind/torch-randomkit/tree/master/randomkit or numba to consume the code in NumPy without vendoring it."
"There are c++ applications which use boost::random, would be nice to be able to swap it for numpy.random."
"Using the existing distributions from Cython was a requested feature and an explicit goal, yes. There are users waiting for this." (mattip: but isn't this supported via Generator?)
"Numba would definitely appreciate C functions to access the random distribution implementations, and
have a side-project (numba-scipy) that is making the Cython-wrapped functions in SciPy visible to Numba" (mattip: I think we do this after API: rearrange the cython files in numpy.random #14608 from _generator.so via these cdef extern declarations, but that requires the H file.)
Someone who wants to write a new BitGenerator, i.e., the numpy/bitgenerator repo without needing to have the numpy code as a git submodule
Some of these are already handled by gh-14604, but I am putting them here for completeness.
Edit: turned into a checklist
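As a rough illustration of the first user story (random ints of a given C type from continually changing ranges), here is a sketch at the Python level only, using `Generator.integers` with a fixed `dtype`. It does not address the C or Cython side of that story, and the seed and bounds are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed

# One-by-one: the upper bound changes on every call, the C type is fixed via dtype.
upper_bounds = (10, 1000, 3)
one_by_one = [rng.integers(0, high, dtype=np.int32) for high in upper_bounds]

# Small batch: per-element bounds are broadcast, so each draw has its own range.
lows = np.array([0, 100, -5], dtype=np.int32)
highs = np.array([10, 200, 5], dtype=np.int32)
batch = rng.integers(lows, highs, dtype=np.int32)

print(one_by_one, batch, batch.dtype)
```

The per-call overhead of the one-by-one form is what makes direct access to the C functions attractive for this use case.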