Closed
Description
The percentile function does not return the expected values. It seems to be getting the linear distance between two points the wrong way around and returns a value closer to the lower number.
For example:
array([ 110531353, 167471747, 167471747, 183000406,
200000000, 759174457, 921094606, 931142911,
1300000000, 1341797102, 1380317195, 1380317195,
1500000000, 1500000000, 1500000000, 1830004057,
1932444073, 2000000000, 2000000000, 2345976525, 2500000000,
2745006085, 2847019692, 3000000000, 3000000000, 3000000000,
3000000000, 3312761268, 3500000000, 3588824707, 4000000000,
4140951585, 5000000000, 6600000000, 7100000000, 7717299940,
8445515490, 8972061767, 9662220364, 11000000000, 11042537559,
12422854754, 13000000000, 13000000000, 13803171949, 15000000000,
17000000000, 20000000000, 22085075118, 24845709508, 32025070994,
34000000000, 35000000000, 36000000000, 36000000000, 39453076027,
40000000000, 46930784627, 82819031694, 110425375592], dtype=int64)
>>> np.percentile(a, 75)
14102378961.75
>>> np.percentile(a, 50)
3794412353.5
>>> np.percentile(a, 25)
1747503042.75
Expected values:
75 = 14700792987.25
50 = 3794412353.5
25 = 1582501014.25
Reading the documentation, the linear interpolation method itself seems correct, but there may be an issue with the fraction it uses.
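To narrow down where the two sets of numbers diverge, here is a minimal sketch that interpolates by hand under two position conventions. The helper `interpolate_at` is a hypothetical name, not part of NumPy. The first convention places the percentile at 0-based position `(n - 1) * p`, which reproduces the `np.percentile` output above. The second uses the 1-based position `(n + 1) * p`, which is an assumption about how the expected values were produced; it does reproduce them exactly for this array. The sketch does not handle positions that fall outside the array.

```python
import numpy as np

a = np.array([
    110531353, 167471747, 167471747, 183000406,
    200000000, 759174457, 921094606, 931142911,
    1300000000, 1341797102, 1380317195, 1380317195,
    1500000000, 1500000000, 1500000000, 1830004057,
    1932444073, 2000000000, 2000000000, 2345976525, 2500000000,
    2745006085, 2847019692, 3000000000, 3000000000, 3000000000,
    3000000000, 3312761268, 3500000000, 3588824707, 4000000000,
    4140951585, 5000000000, 6600000000, 7100000000, 7717299940,
    8445515490, 8972061767, 9662220364, 11000000000, 11042537559,
    12422854754, 13000000000, 13000000000, 13803171949, 15000000000,
    17000000000, 20000000000, 22085075118, 24845709508, 32025070994,
    34000000000, 35000000000, 36000000000, 36000000000, 39453076027,
    40000000000, 46930784627, 82819031694, 110425375592], dtype=np.int64)


def interpolate_at(sorted_values, pos):
    """Linearly interpolate at a fractional 0-based position.

    Hypothetical helper for illustration; assumes 0 <= pos <= len - 1.
    """
    lo = int(np.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


s = np.sort(a)
n = len(s)  # 60 values

for q in (25, 50, 75):
    p = q / 100
    # Convention 1: 0-based position (n - 1) * p
    n_minus_one = interpolate_at(s, (n - 1) * p)
    # Convention 2 (assumed source of the expected values):
    # 1-based position (n + 1) * p, shifted to 0-based
    n_plus_one = interpolate_at(s, (n + 1) * p - 1)
    print(q, np.percentile(a, q), n_minus_one, n_plus_one)
```

Both conventions pick the same pair of neighbouring values for the 25th and 75th percentiles here; only the interpolation fraction differs (0.75 versus 0.25, and 0.25 versus 0.75), which would explain why the result looks "the wrong way around". The median agrees under both. If this reading is right, the difference is one of definition rather than an arithmetic bug. Newer NumPy releases (1.22 and later, to my knowledge) accept a `method` argument on `np.percentile`, and `method="weibull"` should correspond to the `(n + 1) * p` convention.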