-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
MNT Ignore pandas 2.2 warnings including Pyarrow dependency warnings #28305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MNT Ignore pandas 2.2 warnings including Pyarrow dependency warnings #28305
Conversation
This is spreading warning ignore code even more in different places. I like that in #28258 they're all moving to the same place. |
Yes completely agreed on yet another place for handling warnings, I was editing my message at the same time 😉. IMO we spent already way too much time for something that should have been easy (ignoring a warning). I am kind of hoping that the pandas situation improves in the short-term, either they remove the newline in the warning which makes it easy to ignore with |
Yes, but I really don't want to add more tech-debt here for warnings. |
wanna push to my branch instead? we're doing the same thing it seems 😁 and I'm -1 on ignoring the warning this way. |
Personally I would avoid running locally with warnings turned into errors. It could lead to contributors (including us) finding some combination of packages versions that create warnings and us to chase not so important things (as we are doing right now IMO 😉) ... I would prioritise unlocking the situation to get to up-to-date lock-files with a green CI and that issues can be solved separately and things can be cleaned-up soonish
|
Everytime we do a "quick fix" on the CI, next time it takes longer, and the bus factor goes lower. Our CI is already quite complicated, and I'm trying to make it a bit easier for all of us to understand it and replicate it locally. |
I don't know about this, I am cleaning lock-files regularly for temporary work-arounds, maybe I am the only one that cares but I still do it 😉 ... |
Yeah, I agree that this is definitely something we want! |
In the meantime, CI is green 🎉. I'd try to look at #28258 to see what can be back-ported, it seems at least some tweaks in examples to avoid pandas 2.2 warnings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 on this as a permanent solution, but happy to merge this, and try to fix #28258 tomorrow and merge that one. And I don't think the pandas issue is going away soon.
@@ -80,7 +80,10 @@ | |||
|
|||
docstring_test_dependencies = ["sphinx", "numpydoc"] | |||
|
|||
default_package_constraints = {} | |||
default_package_constraints = { | |||
# XXX: Temporary work-around for pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
link to an issue for what's happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added with reference to Pytest PR. IMO this change makes sense, we need to tweak slightly our tests to take it into account.
X["weather"] = ( | ||
X["weather"] | ||
.astype(object) | ||
.replace(to_replace="heavy_rain", value="rain") | ||
.astype("category") | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or
X["weather"] = ( | |
X["weather"] | |
.astype(object) | |
.replace(to_replace="heavy_rain", value="rain") | |
.astype("category") | |
) | |
X.loc[X["weather"] == "heavy_rain", "col"] = "rain" | |
X["weather"] = X["weather"].astype("category").cat.remove_unused_categories() |
but don't know which one's better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like mine, but I haven't found anything better after looking for 20 minutes on how to merge two categorical values ...
I find the flow is more readable somehow than your suggestion. I agree it is not super clear why you need .astype(object)
and then .astype('category')
. If you want to know: avoid yet another pandas warning, replace
is not supposed to change categories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I found this which maybe is a bit simpler?
X["weather"] = X["weather"].cat.remove_categories("heavy_rain").fillna("rain")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there are already missing values, this removes them though. So I don't think we'd want to encourage the pattern in our examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm good point actually ... I reverted this commit and went with my first proposal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reverts commit 55932b5.
FYI, I have some local fixes for Pytest 8, and I will open a PR with them and the lock-file update, when this PR gets merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge this, and then we need to figure out why the debian run on my PR fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go with this version because it is green.
Thanks @lesteve |
…cikit-learn#28305) Co-authored-by: adrinjalali <adrin.jalali@gmail.com>
…cikit-learn#28305) Co-authored-by: adrinjalali <adrin.jalali@gmail.com>
…28305) Co-authored-by: adrinjalali <adrin.jalali@gmail.com>
Maybe slightly simpler alternative to #28258. I think I found a way to ignore the warning in
sklearn/conftest.py
. This is a bit hacky (yet another place for warning handling on top ofsetup.cfg
and CItest_script.sh
), but this may be a reasonable trade-off in the short to medium term until the pandas Pyarrow dependency is a bit more clear.