Commit 042aa88

gh-108322: Optimize statistics.NormalDist.samples() (gh-108324)
1 parent 09343db commit 042aa88

3 files changed: +14 -5 lines changed
Doc/library/statistics.rst (+5)

@@ -828,6 +828,11 @@ of applications in statistics.
 number generator. This is useful for creating reproducible results,
 even in a multi-threading context.
 
+.. versionchanged:: 3.13
+
+Switched to a faster algorithm. To reproduce samples from previous
+versions, use :func:`random.seed` and :func:`random.gauss`.
+
 .. method:: NormalDist.pdf(x)
 
 Using a `probability density function (pdf)
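The `versionchanged` note above points to `random.seed` and `random.gauss` for reproducing older output. A minimal sketch of what that could look like follows. The helper names `legacy_style_samples` and `legacy_style_samples_isolated` are hypothetical and not part of the `statistics` module. The second helper mirrors the lines this commit removes from `samples()`; whether either one matches a given older release bit for bit is an assumption that should be checked against that release.

```python
import random


def legacy_style_samples(mu, sigma, n, seed):
    """Hypothetical helper: draw n values with random.gauss after seeding.

    Follows the documentation note literally. Note that random.seed()
    resets the shared, module-level generator, which affects other callers.
    """
    random.seed(seed)
    return [random.gauss(mu, sigma) for _ in range(n)]


def legacy_style_samples_isolated(mu, sigma, n, seed):
    """Hypothetical helper: same draws from a private generator.

    Mirrors the removed code path (random.Random(seed).gauss) and leaves
    the module-level generator untouched.
    """
    gauss = random.Random(seed).gauss
    return [gauss(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    print(legacy_style_samples(0.0, 1.0, 5, seed=42))
    print(legacy_style_samples_isolated(0.0, 1.0, 5, seed=42))
```

The isolated variant is usually the safer choice in multi-threaded code, because it does not reseed state shared with other threads.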

Lib/statistics.py (+7 -5)

@@ -1135,7 +1135,7 @@ def linear_regression(x, y, /, *, proportional=False):
     >>> noise = NormalDist().samples(5, seed=42)
     >>> y = [3 * x[i] + 2 + noise[i] for i in range(5)]
     >>> linear_regression(x, y) #doctest: +ELLIPSIS
-    LinearRegression(slope=3.09078914170..., intercept=1.75684970486...)
+    LinearRegression(slope=3.17495..., intercept=1.00925...)
 
     If *proportional* is true, the independent variable *x* and the
     dependent variable *y* are assumed to be directly proportional.
@@ -1148,7 +1148,7 @@ def linear_regression(x, y, /, *, proportional=False):
 
     >>> y = [3 * x[i] + noise[i] for i in range(5)]
     >>> linear_regression(x, y, proportional=True) #doctest: +ELLIPSIS
-    LinearRegression(slope=3.02447542484..., intercept=0.0)
+    LinearRegression(slope=2.90475..., intercept=0.0)
 
     """
     n = len(x)
@@ -1279,9 +1279,11 @@ def from_samples(cls, data):
 
     def samples(self, n, *, seed=None):
         "Generate *n* samples for a given mean and standard deviation."
-        gauss = random.gauss if seed is None else random.Random(seed).gauss
-        mu, sigma = self._mu, self._sigma
-        return [gauss(mu, sigma) for _ in repeat(None, n)]
+        rnd = random.random if seed is None else random.Random(seed).random
+        inv_cdf = _normal_dist_inv_cdf
+        mu = self._mu
+        sigma = self._sigma
+        return [inv_cdf(rnd(), mu, sigma) for _ in repeat(None, n)]
 
     def pdf(self, x):
         "Probability density function. P(x <= X < x+dx) / dx"
@@ -0,0 +1,2 @@
+Speed-up NormalDist.samples() by using the inverse CDF method instead of
+calling random.gauss().
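The inverse CDF method named in this entry maps a uniform draw p in (0, 1) to the value x where the normal CDF equals p. A minimal sketch using only public APIs follows. The function name `inverse_cdf_samples` is hypothetical. The sketch calls `NormalDist.inv_cdf` rather than the private `_normal_dist_inv_cdf` helper used in the commit, and it redraws when `p == 0.0`, a guard the commit does not have. It is therefore not guaranteed to reproduce `NormalDist.samples()` output exactly.

```python
import random
from statistics import NormalDist


def inverse_cdf_samples(mu, sigma, n, seed=None):
    """Hypothetical helper: draw n normal variates by inverse transform sampling."""
    # Same generator selection as the commit: shared generator when seed is
    # None, otherwise a private generator seeded for reproducibility.
    rnd = random.random if seed is None else random.Random(seed).random
    dist = NormalDist(mu, sigma)
    out = []
    while len(out) < n:
        p = rnd()  # uniform on [0.0, 1.0)
        if p == 0.0:
            # The public inv_cdf() rejects p <= 0.0, so redraw in that rare case.
            continue
        out.append(dist.inv_cdf(p))
    return out


if __name__ == "__main__":
    print(inverse_cdf_samples(100.0, 15.0, 5, seed=42))
```

Each sample costs one uniform draw plus one inverse CDF evaluation, which is the trade the commit makes in place of calling `random.gauss()`.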
