No label or value to be assigned to outliers in Radius Neighbors Classifier and Regressor #9629

webber26232 · 2017-08-26T04:30:14Z

Description

In the Radius Neighbors models, some outlier samples may have no neighbors within the given radius, so the model cannot predict a label or value for them. Currently, RadiusNeighborsRegressor returns NaN for outliers when predict(X) is called. RadiusNeighborsClassifier has a parameter, outlier_label, which accepts an integer or None. If None is given, a ValueError is raised whenever an outlier is detected. If an integer is given, it is used as the label for outliers.
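
To make the failure mode concrete, here is a minimal pure-Python sketch of radius-based regression on 1-D data. It is a toy illustration, not scikit-learn's implementation: the function name `radius_neighbors_predict` and the data are made up for this example. It mirrors the behavior described above by returning NaN when no training point falls within the radius.

```python
import math

def radius_neighbors_predict(train_X, train_y, query, radius):
    """Toy 1-D radius-neighbors regression (hypothetical helper).

    Returns the mean target of all training points within `radius`
    of `query`, or NaN when the query has no neighbors at all.
    """
    neighbors = [y for x, y in zip(train_X, train_y) if abs(x - query) <= radius]
    if not neighbors:
        # No neighbor inside the radius: the query is an outlier.
        return math.nan
    return sum(neighbors) / len(neighbors)

train_X = [0.0, 1.0, 2.0]
train_y = [10.0, 20.0, 30.0]

# Query near the training data: neighbors are x=1.0 and x=2.0.
inlier = radius_neighbors_predict(train_X, train_y, query=1.2, radius=1.0)

# Query far from everything: no neighbors, so the prediction is NaN.
outlier = radius_neighbors_predict(train_X, train_y, query=50.0, radius=1.0)
```

In this toy setup, covering the far query would require a radius of at least 48, at which point every training point becomes a neighbor of every nearby query as well.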

Can we assign random labels or random values to those outliers? Currently, the only ways to handle outliers are to remove them or to set a large radius. I don't think removing samples from the test set is an ideal solution. However, if some outliers are extremely far from all other samples, then covering all of them requires a dramatically larger radius. The number of noisy neighbors then becomes very large, which badly affects the prediction results.

Therefore, by assigning random values or labels to outliers, we can choose a radius that keeps inlier predictions accurate enough while limiting the influence of the randomly labeled outliers.

I will run some experiments on this.

Steps/Code to Reproduce

Expected Results

Actual Results

Versions

jnothman · 2017-08-27T14:03:25Z

I might note that setting outlier_label to some int not known from the dataset (e.g. -1 for true classes 0, 1, ...) allows a post-processor to then modify that label to something randomised. So it's not like the current implementation disallows that usage pattern; it merely does not provide it.
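
A rough sketch of that post-processing step, in plain Python. The helper `randomize_outliers`, the sentinel constant, and the sample predictions are hypothetical; in practice the predictions would come from something like a RadiusNeighborsClassifier fitted with outlier_label=-1, assuming -1 is not a real class.

```python
import random

OUTLIER = -1  # sentinel assumed not to collide with any true class label

def randomize_outliers(predictions, classes, sentinel=OUTLIER, seed=None):
    """Replace sentinel predictions with a uniformly random known class.

    Hypothetical helper: non-sentinel predictions pass through unchanged.
    """
    rng = random.Random(seed)
    return [rng.choice(classes) if p == sentinel else p for p in predictions]

# Stand-in for the output of predict() with outlier_label=-1.
predictions = [0, 1, OUTLIER, 2, OUTLIER]
filled = randomize_outliers(predictions, classes=[0, 1, 2], seed=0)
```

Sampling uniformly is only one choice; drawing from the training class frequencies instead would be a small change to the `rng.choice` call.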

webber26232 · 2017-08-27T21:09:48Z

@jnothman Oh yes, you are right.
I just implemented predict_proba() for the Radius Neighbors Classifier, and I think I can implement this feature as well.

TomDLT · 2019-08-07T17:32:49Z

Fixed in #9597

TomDLT closed this as completed Aug 7, 2019
