FEA Implement categorical feature support in IterativeImputer
#31479
Fixes #31219
Our implementation automatically detects categorical columns based on dtype, fits a RandomForestClassifier to predict the missing category labels, and then inverse-transforms those predictions back into the original categories. For example, given an X with one numerical feature and one categorical feature, where the categorical feature has missing values (NaN), _iterative.py uses the RandomForestClassifier to predict those missing values and fills them into the data.
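To illustrate the idea outside of IterativeImputer, here is a minimal standalone sketch of the three steps described above (dtype-based detection, classification with a random forest, inverse transform). It is not the code from this PR: the helper name `impute_categorical_column`, the toy data, and the use of `LabelEncoder` are assumptions made for illustration, and the sketch does a single pass rather than the round-robin iterations that IterativeImputer performs.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder


def impute_categorical_column(df, target, random_state=0):
    """Hypothetical helper: fill NaNs in one categorical column.

    Assumes every other column is numeric and has no missing values.
    """
    out = df.copy()
    missing = out[target].isna()
    if not missing.any() or missing.all():
        return out  # nothing to impute, or nothing to learn from

    features = out.drop(columns=[target])

    # Encode the observed categories as integer class labels.
    encoder = LabelEncoder()
    y_observed = encoder.fit_transform(out.loc[~missing, target])

    # Fit on rows where the category is known, predict the rest.
    clf = RandomForestClassifier(n_estimators=50, random_state=random_state)
    clf.fit(features[~missing], y_observed)
    predicted = clf.predict(features[missing])

    # Map the predicted integer labels back to the original categories.
    out.loc[missing, target] = encoder.inverse_transform(predicted)
    return out


# Toy data (illustrative only): one numerical and one categorical feature.
df = pd.DataFrame(
    {
        "size": [1.0, 1.1, 0.9, 5.0, 5.2, 4.9, 1.05, 5.1],
        "label": ["small", "small", "small", "large", "large", "large", np.nan, np.nan],
    }
)

# Detect categorical columns from the dtype (anything non-numeric here).
categorical = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]

imputed = df
for column in categorical:
    imputed = impute_categorical_column(imputed, column)

print(imputed)
```

In this sketch the two missing labels are predicted from the `size` column alone; a real integration would also need to handle missing values in the predictor columns and repeat the fit over several imputation rounds.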
Co-authored-by: Fabioprata23 <fabio.prata@tecnico.ulisboa.pt>