rhettinger
Contributor

Use the new math.sumprod() function to compute the weighted average.

Compared to fsum(map(mul, data, weights)), the sumprod(data, weights) code is simpler, faster, and more accurate. It is faster because we don't need a succession of calls to mul. It is more accurate because all of the intermediate products are computed losslessly rather than being rounded immediately by mul. Also, we no longer need the count() wrapper for this path because sumprod() already incorporates a test to make sure the inputs are the same length.
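To make the comparison concrete, here is a minimal sketch of a weighted mean built on sumprod(). The helper name weighted_mean and its error messages are hypothetical, and this is not the code in the statistics module. The fallback branch for interpreters without math.sumprod() is also an assumption, added only so that the sketch runs on older versions.

```python
from math import fsum
from operator import mul

try:
    from math import sumprod  # available in Python 3.12 and later
except ImportError:
    sumprod = None  # older interpreter: fall back to the previous approach


def weighted_mean(data, weights):
    """Hypothetical helper; not the statistics module's actual code."""
    data = list(data)
    weights = list(weights)
    if sumprod is not None:
        # sumprod() raises ValueError itself when the inputs differ in length.
        num = sumprod(data, weights)
    else:
        # map() silently stops at the shorter input, so check explicitly.
        if len(data) != len(weights):
            raise ValueError("data and weights must be the same length")
        # Each product is rounded by mul before fsum() adds it up.
        num = fsum(map(mul, data, weights))
    den = fsum(weights)
    if not den:
        raise ValueError("sum of weights must be non-zero")
    return num / den
```

For example, weighted_mean([1.0, 2.0, 3.0], [1.0, 1.0, 2.0]) computes (1 + 2 + 6) / 4. Passing inputs of different lengths raises ValueError on either branch.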

Baseline timing

% ./python.exe -m timeit -r11 -s 'from random import expovariate as r' -s 'from statistics import fmean' -s 'n=100' -s 'data = [r() for i in range(n)]' -s 'weights = [r() for i in range(n)]' 'fmean(data, weights)'
50000 loops, best of 11: 4.96 usec per loop

Improved timing

% ./python.exe -m timeit -r11 -s 'from random import expovariate as r' -s 'from statistics import fmean' -s 'n=100' -s 'data = [r() for i in range(n)]' -s 'weights = [r() for i in range(n)]' 'fmean(data, weights)'
100000 loops, best of 11: 2.06 usec per loop
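A rough way to compare the two numerator computations in isolation is sketched below with the timeit module. It times only fsum(map(mul, ...)) against sumprod(...), not fmean() itself, so its numbers are not comparable to the figures above. The function name time_numerators and its default parameters are hypothetical, and the sumprod entry is skipped on interpreters that lack math.sumprod().

```python
from math import fsum
from operator import mul
from random import expovariate
from timeit import repeat

try:
    from math import sumprod  # available in Python 3.12 and later
except ImportError:
    sumprod = None


def time_numerators(n=100, number=10_000, repeats=5):
    """Return the best time per call, in microseconds, for each approach."""
    data = [expovariate(1.0) for _ in range(n)]
    weights = [expovariate(1.0) for _ in range(n)]

    candidates = {"fsum(map(mul))": lambda: fsum(map(mul, data, weights))}
    if sumprod is not None:
        candidates["sumprod"] = lambda: sumprod(data, weights)

    results = {}
    for name, func in candidates.items():
        # repeat() returns total seconds for `number` calls, once per repeat.
        best = min(repeat(func, number=number, repeat=repeats))
        results[name] = best / number * 1e6
    return results


if __name__ == "__main__":
    for name, usec in time_numerators().items():
        print(f"{name}: {usec:.2f} usec per call")
```

Taking the minimum over the repeats mirrors the "best of 11" figure that the timeit command line reports.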

rhettinger added the labels performance (Performance or resource usage), skip issue, skip news, and 3.12 (only security fixes) on Mar 12, 2023
rhettinger merged commit 6cd7572 into python:main on Mar 12, 2023
warsaw pushed a commit to warsaw/cpython that referenced this pull request Apr 11, 2023