-
-
Notifications
You must be signed in to change notification settings - Fork 5.3k
ENH: add stats.qmc module with quasi Monte Carlo functionality #10844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Note: the scope of this PR is under discussion in gh-9695. |
e246967
to
27efb1c
Compare
@rgommers First proposal is here. I still have a doc issue but I am using the same docstring as in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some initial comments. Scope discussion would be good to finish first before reviewing in more detail.
@tylerjereddy Hey I integrated your suggestions, thanks! Could you have another look? |
@Balandat I finished to implement the changes we discussed.
I like the interface and I think this can be ready soon 😃. Let me know what you think. |
This looks great. FWIW I have a local commit that moves Sobol to use the new 21201-dimensional direction numbers from https://web.maths.unsw.edu.au/~fkuo/sobol/. The c-code generated from putting the numbers in as literals is obscenely large/inefficient, so what I did was to package and save the arrays as an .npz data file and load these in upon importing the Sobol generator. This works locally with fixing the path, the outstanding thing here is to make sure this is portable and the data file gets packaged with the extension and the path is resolved appropriately. Then the tests need to be modified to check fo the new direction numbers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good, couple of comments inline. There is a doc failure that needs to be fixed.
I addressed the comments and fixed the doc (to be confirmed with tests). I plan to complete the docstrings with examples as with Halton. Speaking of the doc. Is the current doc enough or should we add a bigger section somewhere about DoE, an introduction like for optimization? Do we want figures of DoE for instance? A section in https://docs.scipy.org/doc/scipy-1.4.1/reference/tutorial/stats.html ? I have quite some material in case we need something a bit more theoretical (I have some chapters about DoE in my PhD thesis and articles I wrote, so license-wise fine). cc @josef-pkt (I think you're the PoC for this part of the doc) |
There is still a docstring warning about I found another block used in rv_continuous. Maybe more up to date: """
seed : {None, int, `~np.random.RandomState`, `~np.random.Generator`}, optional
This parameter defines the object to use for drawing random variates.
If `seed` is `None` the `~np.random.RandomState` singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used, seeded
with seed.
If `seed` is already a ``RandomState`` or ``Generator`` instance,
then that object is used.
Default is None.
""" [EDIT] fixed with this docstring |
bd0a08b
to
52cae01
Compare
@Balandat I added quite some doc trying to mimic the style of scipy. IMO, we are good for a first version and this is ready for a review for integration 😃 . |
3047b2a
to
359369c
Compare
Did one more review pass through, resolved some comments that were addressed, and added |
Merged now, thank you @tupui and @Balandat, this is a really nice addition to SciPy! And thanks also to @mdhaber and @tylerjereddy for reviewing, and @ArtOwen, @fjhickernell, @rkern, @bashtage and others for contributing their expertise. This will be a highlight of the 1.7.0 release. |
Thanks for the patience on this one all! |
@tupui could you help with the following:
|
Thanks everyone for all the good work and your guidance during this process! I learned a lot during the process and I am grateful for this. I would also like to acknowledge Sergei Kucherenko who's been of great help via email. For information, @Balandat and I will present this at the 13th International Conference on Monte Carlo Methods and Applications (MCM 2021). This for sure will help to spread the word about this new module. |
Sure I will open à PR with all that! I like the idea of the tutorial. I guess this would be another section of the existing stats tutorial. |
Yay! Don't think I've had >600 conversation items on a PR before, thanks eveyone for the thorough discussion! |
Summary: Account for the new capabilities of the pytorch SobolEngine (up to 21201 dims) from scipy/scipy#10844 Differential Revision: D25661131 fbshipit-source-id: de53b82c1a5a8d4bcba21cd72bbd1ccb0eec0164
Awesome. Yes indeed. At some point we may want to break this into separate pages for subsections, given that it is already very long. But for now I would just add to it.
I know it was a painful delivery. We only get to add whole new submodules very rarely though:) |
Summary: Performs the update that was suggested in pytorch#41489 Adjust the functionality to largely match that pf the scipy companion PR scipy/scipy#10844, including - a new `draw_base2` method - include zero as the first point in the (unscrambled) Sobol sequence The scipy PR is also quite opinionated if the `draw` method doesn't get called with a base 2 number (for which the resulting sequence has nice properties, see the scipy PR for a comprehensive discussion of this). Note that this update is a **breaking change** in the sense that sequences generated with the same parameters after as before will not be identical! They will have the same (better, arguably) distributional properties, but calling the engine with the same seed will result in different numbers in the sequence. Pull Request resolved: pytorch#49710 Test Plan: ``` from torch.quasirandom import SobolEngine sobol = SobolEngine(3) sobol.draw(4) sobol = SobolEngine(4, scramble=True) sobol.draw(5) sobol = SobolEngine(4, scramble=True) sobol.draw_base2(2) ``` Reviewed By: malfet Differential Revision: D25657233 Pulled By: Balandat fbshipit-source-id: ce8ccafe90fc968ed811634b5f37c2eb208af985
Challenge accepted, next up: A full module. |
Thank you!
Best,
Fred
Fred J. Hickernell
Vice Provost for Research, research.iit.edu <https://research.iit.edu/>
www.iit.edu/research/about/magazine <https://www.iit.edu/research/about/magazine>
Professor, Department of Applied Mathematics
Center for Interdisciplinary Scientific Computation (CISC) cos.iit.edu/cisc/ <https://cos.iit.edu/cisc/>
Illinois Institute of Technology
IIT Tower, Room 7C9-2, 10 W 35th St, Chicago, IL 60616
hickernell@iit.edu <mailto:hickernell@iit.edu>, +1 312 567 3470 <tel:+1%20312%20567%203470>
Working remotely. Please contact me by email or phone.
… On Jan 30, 2021, at 10:51 AM, Max Balandat ***@***.***> wrote:
I know it was a painful delivery. We only get to add whole new submodules very rarely though:)
Challenge accepted, next up: A full module.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#10844 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGHS6VBPX4DIUXZLPRQJBTS4Q2HLANCNFSM4IYKCGMA>.
|
@rgommers I updated the description of the issue. Added some comments in the release notes (mostly the same text). And I created the 2 follow up PR I had. I will work on the tutorial now. |
Summary: Pull Request resolved: #672 Account for the new capabilities of the pytorch SobolEngine (up to 21201 dims) from scipy/scipy#10844 Reviewed By: stevemandala Differential Revision: D25661131 fbshipit-source-id: 9cddb3470f953dbd8eab6f12c8e13d516ad8ef37
Summary: Performs the update that was suggested in #41489 Adjust the functionality to largely match that pf the scipy companion PR scipy/scipy#10844, including - a new `draw_base2` method - include zero as the first point in the (unscrambled) Sobol sequence The scipy PR is also quite opinionated if the `draw` method doesn't get called with a base 2 number (for which the resulting sequence has nice properties, see the scipy PR for a comprehensive discussion of this). Note that this update is a **breaking change** in the sense that sequences generated with the same parameters after as before will not be identical! They will have the same (better, arguably) distributional properties, but calling the engine with the same seed will result in different numbers in the sequence. Pull Request resolved: #49710 Test Plan: ``` from torch.quasirandom import SobolEngine sobol = SobolEngine(3) sobol.draw(4) sobol = SobolEngine(4, scramble=True) sobol.draw(5) sobol = SobolEngine(4, scramble=True) sobol.draw_base2(2) ``` Reviewed By: malfet Differential Revision: D25657233 Pulled By: Balandat fbshipit-source-id: 9df50a14631092b176cc692b6024aa62a639ef61
Outlined in recent discussion in SALib#414 and in scipy/scipy#10844 (comment)
Add
scipy.stats.qmc
.Reference issue
Closes #9695
What does this implement/fix?
Provide a set of functions to create and assess quasi-Monte Carlo Design of Experiments.
It provides a generic class
scipy.stats.qmc.QMCEngine
which defines a QMC engine/sampler. An engine is state aware: it can be continued, advanced and reseted. 4 base samplers are available:scipy.stats.qmc.Sobol
the well known Sobol' low discrepancy sequence. Several warnings have been added to guide the user into properly using this sampler. The sequence is scrambled by default.scipy.stats.qmc.Halton
: Halton low discrepancy sequence. The sequence is scrambled by default.scipy.stats.qmc.LatinHypercube
: plain LHS design.scipy.stats.qmc.OrthogonalLatinHypercube
: Orthogonal version of LHS which is more uniform than plain LHS.And 2 special samplers are available:
scipy.stats.qmc.MultinomialQMC
: sampling from a multinomial distribution using any of the basescipy.stats.qmc.QMCEngine
.scipy.stats.qmc.MultivariateNormalQMC
: sampling from a multivariate Normal using any of the basescipy.stats.qmc.QMCEngine
.The module also provide the following helpers:
scipy.stats.qmc.discrepancy
: assess the quality of a set of points in terms of space coverage.scipy.stats.qmc.update_discrepancy
: can be used in an optimization loop to construct a good set of points.scipy.stats.qmc.scale
: easily scale a set of points from (to) the unit interval to (from) a given range.The module quality has been assessed by specialists from the domain (see the long discussion bellow for more details). Following are some convergence plots of Sobol' and Halton.