[MRG+1] MissingIndicator transformer #8075
Merged: amueller merged 57 commits into scikit-learn:master from maniteja123:imputer_missing_values on Jul 16, 2018
Commits
6ff7800
Initial commit for missing values indicator
maniteja123 98e28a7
Change documentation, remove axis and add simple test
maniteja123 38c58e2
Add documentation and tests
maniteja123 781d07d
Add sparse option functionality
maniteja123 ec6d69a
Modify tests
maniteja123 605d189
Add comprehensive tests
maniteja123 ca8af65
Common tests
maniteja123 07c0fce
fix astype usage
maniteja123 f02d78a
pep fixes
maniteja123 2379edb
Implement fit_transform
maniteja123 552a2cb
modify doc [ci skip]
maniteja123 0d980e4
fix failing tests
maniteja123 a1a6982
Change default to np.NaN
maniteja123 91b0122
Error when transform has features with missing values while not durin…
maniteja123 4137ed3
Doc and test changes
maniteja123 fb3d55a
Documentation changes and remove duplicate code
maniteja123 500aa65
fix tests
maniteja123 3e7c4c1
fix estimator common tests
maniteja123 426d179
fix sparse array tests
maniteja123 9b0e9be
fix sparse array tests
maniteja123 c45ad5f
fix sparse array tests
maniteja123 cc23f13
fix sparse array tests
maniteja123 f50d649
address comments and exception tests
maniteja123 70e06f7
Move MissingIndicator to impute.py
maniteja123 37f19a3
fix flake8 comments
maniteja123 313a71b
docstring changes
maniteja123 1a064ca
Merge remote-tracking branch 'origin/master' into maniteja123-imputer…
glemaitre 8c956c7
FIX add change in estimator checks
glemaitre feddbcb
FIX error during solving conflicts
glemaitre 49ef207
EHN address code reviews
glemaitre 712b2f4
PEP8
glemaitre 5c495e1
DOC address comments documentation
glemaitre 1efcd82
TST parametrize error test
glemaitre 50bc29c
reverse useless change
glemaitre e3abbc6
PEP8
glemaitre 492967c
TST parametrize test and split tests
glemaitre 7df0d14
FIX typo in tests
glemaitre 12103ad
FIX change default type to bool
glemaitre 007c6e3
EHN add a not regarding the default dtype
glemaitre a29128c
Merge branch 'master' into imputer_missing_values
jnothman 74679e6
Insert missing comma
jnothman 754c4e3
update
glemaitre b895f7c
FIX raise error with inconsistent dtype X and missing_values
glemaitre da633f5
Merge remote-tracking branch 'origin/master' into maniteja123-imputer…
glemaitre ea6f8e8
Merge remote-tracking branch 'glemaitre/is/11390' into maniteja123-im…
glemaitre 44dbc91
solve issue with NaN as string
glemaitre 34fb9a3
address jeremy comments
glemaitre 3abc695
address andy comments
glemaitre 05226fd
PEP8
glemaitre 7695551
DOC fix doc parameter
glemaitre 8c199ba
Merge remote-tracking branch 'origin/master' into maniteja123-imputer…
glemaitre 17a0caa
Merge remote-tracking branch 'glemaitre/is/11390' into maniteja123-im…
glemaitre d4ca8a8
EXA show an example using MissingIndicator
glemaitre 52d1c02
Update plot_missing_values.py
glemaitre 76558e5
DOC fix
glemaitre 51c0aa4
Merge branch 'imputer_missing_values' of github.com:maniteja123/sciki…
glemaitre 82d766d
Merge remote-tracking branch 'origin/master' into maniteja123-imputer…
glemaitre
What does "number" mean, and why is np.nan not a number? Maybe just move np.nan to the end?
"number" means real number. It's just to fit this on one line. I think by definition nan is not a number :)
but the dtype is also important, isn't it? I find "float or int" more natural than "number or np.nan" but I don't have a strong opinion.
I agree that "float or int" is better than number, but I think it's important to keep np.nan visible since it should be a common value for missing_values. Maybe something like: int, float, string or None (default=np.nan)?
right now this is consistent with SimpleImputer and ChainedImputer in fact.
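For readers following the thread, here is a minimal sketch of why np.nan gets called out separately from "number" in the missing_values description. It is not the scikit-learn implementation: `missing_mask` is a hypothetical helper written in plain Python, and it uses `math.nan` in place of np.nan. It only illustrates that a NaN sentinel cannot be matched with `==`, while an int, string, or None sentinel can.

```python
import math


def missing_mask(rows, missing_value=math.nan):
    """Return a nested list of booleans marking entries equal to the sentinel.

    Hypothetical helper for illustration only; not scikit-learn's code.
    """
    def is_missing(value):
        # NaN never compares equal to itself, so it needs a dedicated check.
        if isinstance(missing_value, float) and math.isnan(missing_value):
            return isinstance(value, float) and math.isnan(value)
        # Any other sentinel (int, float, string, None) uses plain equality.
        return value == missing_value

    return [[is_missing(value) for value in row] for row in rows]


X_nan = [[1.0, math.nan], [math.nan, 4.0]]
X_int = [[1, -1], [-1, 4]]

mask_nan = missing_mask(X_nan)                    # NaN as the sentinel
mask_int = missing_mask(X_int, missing_value=-1)  # integer sentinel
```

Both calls mark the same two positions as missing. The NaN branch is the reason a docstring that says only "number" could mislead: the default sentinel behaves differently from every other numeric value.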