Percentile Linear returns incorrect value.  #7875

@mikep2016

`np.percentile` does not return the values I expect. With linear interpolation it seems to measure the distance between the two neighboring points the wrong way around, so it returns a value closer to the lower number.

For example:

array([   110531353,    167471747,    167471747,    183000406,
          200000000,    759174457,    921094606,    931142911,
         1300000000,   1341797102,   1380317195,   1380317195,
         1500000000,   1500000000,   1500000000,   1830004057,
         1932444073,   2000000000,   2000000000,   2345976525,
         2500000000,   2745006085,   2847019692,   3000000000,
         3000000000,   3000000000,   3000000000,   3312761268,
         3500000000,   3588824707,   4000000000,   4140951585,
         5000000000,   6600000000,   7100000000,   7717299940,
         8445515490,   8972061767,   9662220364,  11000000000,
        11042537559,  12422854754,  13000000000,  13000000000,
        13803171949,  15000000000,  17000000000,  20000000000,
        22085075118,  24845709508,  32025070994,  34000000000,
        35000000000,  36000000000,  36000000000,  39453076027,
        40000000000,  46930784627,  82819031694, 110425375592], dtype=int64)

>>> np.percentile(a, 75)
14102378961.75
>>> np.percentile(a, 50)
3794412353.5
>>> np.percentile(a, 25)
1747503042.75

expected values:

75 = 14700792987.25
50 = 3794412353.5
25 = 1582501014.25

Reading the documentation, the linear method itself seems correct, but there may be an issue with the fraction it uses.
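To make the fraction question concrete, here is a minimal pure-Python sketch of the arithmetic. It assumes the documented linear method places the percentile at zero-based position `q/100 * (n - 1)` in the sorted data. The helper name `linear_percentile` and its `exclusive` flag are hypothetical and not part of NumPy. The `exclusive=True` branch uses one-based position `q/100 * (n + 1)` instead, which is only a guess at the convention behind the expected values. With the 60 values above, the two conventions pick the same pair of neighbors but different fractions (0.75 versus 0.25 at q=25, and the reverse at q=75).

```python
import math

# The 60 values from the report, as plain Python ints.
a = [
    110531353, 167471747, 167471747, 183000406, 200000000, 759174457,
    921094606, 931142911, 1300000000, 1341797102, 1380317195, 1380317195,
    1500000000, 1500000000, 1500000000, 1830004057, 1932444073, 2000000000,
    2000000000, 2345976525, 2500000000, 2745006085, 2847019692, 3000000000,
    3000000000, 3000000000, 3000000000, 3312761268, 3500000000, 3588824707,
    4000000000, 4140951585, 5000000000, 6600000000, 7100000000, 7717299940,
    8445515490, 8972061767, 9662220364, 11000000000, 11042537559, 12422854754,
    13000000000, 13000000000, 13803171949, 15000000000, 17000000000, 20000000000,
    22085075118, 24845709508, 32025070994, 34000000000, 35000000000, 36000000000,
    36000000000, 39453076027, 40000000000, 46930784627, 82819031694, 110425375592,
]


def linear_percentile(values, q, exclusive=False):
    """Linearly interpolate the q-th percentile (hypothetical helper).

    exclusive=False: zero-based position q/100 * (n - 1).
    exclusive=True:  one-based position q/100 * (n + 1), an assumed
                     alternative convention, shifted to zero-based here.
    """
    data = sorted(values)
    n = len(data)
    if exclusive:
        pos = q / 100 * (n + 1) - 1
    else:
        pos = q / 100 * (n - 1)
    pos = min(max(pos, 0), n - 1)   # clamp to the valid index range
    lo = math.floor(pos)            # index of the lower neighbor
    hi = min(lo + 1, n - 1)         # index of the upper neighbor
    frac = pos - lo                 # fractional distance from lo toward hi
    return data[lo] + frac * (data[hi] - data[lo])


for q in (75, 50, 25):
    print(q, linear_percentile(a, q), linear_percentile(a, q, exclusive=True))
# 75 14102378961.75 14700792987.25
# 50 3794412353.5 3794412353.5
# 25 1747503042.75 1582501014.25
```

Under these assumptions the first column reproduces the `np.percentile` results and the second reproduces the expected values, so the gap looks like a difference in position convention rather than a reversed fraction.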
