What happens is that the mean is computed on each axis in turn (a mean of means). When no nans are involved, this gives theoretically the same result. In practice, we lose some precision, but that was deemed acceptable so far. However, when nans are involved, the result is significantly wrong.
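To make the problem concrete, here is a minimal standalone sketch using plain numpy rather than larray (the array values are hypothetical, chosen so the true mean matches the 2.6666... figure discussed in this issue):

```python
import numpy as np

# Hypothetical 2x2 array with one nan: the mean of the three
# non-nan values is (1 + 2 + 5) / 3 = 2.666...
arr = np.array([[1.0, 2.0],
                [5.0, np.nan]])

# Correct result: a single nan-aware mean over all cells.
true_mean = np.nanmean(arr)              # 2.666...

# Buggy "mean of means": reduce one axis at a time.
# Row means are [1.5, 5.0], whose mean is 3.25 -- significantly wrong.
row_means = np.nanmean(arr, axis=1)
mean_of_means = np.nanmean(row_means)    # 3.25
```

The row containing the nan contributes only one value, but the mean-of-means step still weights that row as heavily as the complete one, which is where the error comes from.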
As a workaround until larray 0.35 is released, I have recommended using:
```python
>>> # TODO: do not use this function anymore when larray 0.35 will be available
... def nd_mean(array, axes_or_groups):
...     """Compute the mean of array over axes_or_groups.
...
...     This function is temporarily necessary because larray versions up to
...     (and including) 0.34.x behave badly when computing the means on groups
...     over several dimensions when some values are nans.
...     See https://github.com/larray-project/larray/issues/1118
...     """
...     return array.sum(*axes_or_groups) / (~isnan(array)).sum(*axes_or_groups)
>>> nd_mean(arr, ("a0,a1 >> a01", "b0,b1 >> b01"))
2.6666666666666665
```
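The same sum-over-counts idea can be checked with plain numpy (a sketch under the assumption that the nan-skipping sum, here `np.nansum`, is what makes the division come out right; `nd_mean_np` and the data are hypothetical):

```python
import numpy as np

def nd_mean_np(array, axes):
    """Numpy sketch of the workaround: sum the non-nan values and divide
    by the count of non-nan cells, reducing all axes in one step."""
    value_sums = np.nansum(array, axis=axes)
    counts = (~np.isnan(array)).sum(axis=axes)
    return value_sums / counts

# Hypothetical data: three non-nan values summing to 8.
arr = np.array([[1.0, 2.0],
                [5.0, np.nan]])
result = nd_mean_np(arr, (0, 1))    # 2.6666666666666665
```

Because both axes are reduced in a single step, each non-nan cell carries the same weight, so the nan no longer skews the result.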
Here is a version which does not output a warning on all-nan slices. I am unsure it is a good idea to do this by default, though, so I added a `warn_all_nan_slices` argument, defaulting to True. I fear our users will always want it to be False, but would then miss helpful warnings when there is an actual problem with their data, so I am unsure whether having the argument makes sense at all. What I do know is that having the argument default to False is not worth it, because nobody would use it unless they had already been bitten by a bad case of this.
```python
# TODO: do not use this function anymore when larray 0.35 will be available
def nd_mean(array, axes_or_groups, warn_all_nan_slices=True):
    """Compute the mean of array over axes_or_groups.

    This function is temporarily necessary because larray versions up to
    (and including) 0.34.x behave badly when computing the means on groups
    over several dimensions when some values are nans.
    See https://github.com/larray-project/larray/issues/1118
    """
    value_sums = array.sum(*axes_or_groups)
    counts = (~isnan(array)).sum(*axes_or_groups)
    if warn_all_nan_slices:
        return value_sums / counts
    else:
        return where(counts > 0, value_sums.divnot0(counts), nan)
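The two branches can be illustrated with a small numpy sketch (hypothetical data with one all-nan slice; `np.divide(..., where=...)` stands in for larray's `divnot0`):

```python
import numpy as np

# Hypothetical data: the first row is an all-nan slice.
arr = np.array([[np.nan, np.nan],
                [1.0, 3.0]])
value_sums = np.nansum(arr, axis=1)        # [0.0, 4.0]
counts = (~np.isnan(arr)).sum(axis=1)      # [0, 2]

# warn_all_nan_slices=True branch: plain division, so the 0/0 cell
# emits a RuntimeWarning before producing nan.
with np.errstate(invalid="ignore"):        # silenced here for the demo
    noisy = value_sums / counts            # [nan, 2.0]

# warn_all_nan_slices=False branch: divide only where counts > 0,
# then set the all-nan cells to nan explicitly, with no warning.
quiet = np.where(counts > 0,
                 np.divide(value_sums, counts,
                           out=np.zeros_like(value_sums),
                           where=counts > 0),
                 np.nan)                   # [nan, 2.0]
```

Both branches return the same values; the only difference is whether the all-nan slice triggers a warning on the way to its nan.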
While this should be 2.6666..., what actually happens is that it computes the mean along one axis first and then the mean of those intermediate means, instead of a single mean over all selected cells.