-
-
Notifications
You must be signed in to change notification settings - Fork 26k
FIX use pyarrow types in pyarrow.filter() for older pyarrow versions #31605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX use pyarrow types in pyarrow.filter() for older pyarrow versions #31605
Conversation
I added The Windows |
So it seems that not all the CIs get into the coverage report for codecov. I have found in the Edit: I found the coverage is set to true in |
To sum up: it seems that the codecov errors come from Proposing solution: change min version of pyarrow to 14.0.0. Edit: I tried that (not pushed) but I am not sure anymore, since the changes I did in Edit: 🤦 Now I got it! I should have added pyarrow to the conda_dependencies part too, not just to the version restrictions. |
All passes except for the Windows CI failing which this is unrelated and coming from main. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix.
import pyarrow | ||
|
||
pyarrow_version = parse_version(pyarrow.__version__) | ||
if pyarrow_version < parse_version("17.0.0"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what happens if pyarrow < 12 is installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried this.
import pyarrow | ||
|
||
if not isinstance(key, pyarrow.BooleanArray): | ||
key = pyarrow.array(key, type=pyarrow.bool_()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import pyarrow | |
if not isinstance(key, pyarrow.BooleanArray): | |
key = pyarrow.array(key, type=pyarrow.bool_()) | |
try: | |
pa = sys.modules["pyarrow"] | |
except KeyError: # pragma: no cover | |
raise ImportError("pyarrow must be installed to use pyarrow data objects") | |
if not isinstance(key, pa.BooleanArray): | |
key = pa.array(key, type=pyarrow.bool_()) |
Question for the 2. reviewer: Does it actually make a big enough difference (compared to import pyarrow
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a second reviewer, but I can reason you through what I did and why:
When we are in line import pyarrow
(currently 140) we already know that the user has pyarrow installed.
This is because this line can only be executed conditioned on that PYARROW_VERSION_BELOW_17
returns True
. The condition check one line above makes fixes.py
to be run and in it, we already have a try-except block evaluating import pyarrow
(and the version, which doesn't matter here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case you can just do pa = sys.modules["pyarrow"]
without the try except.
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing, @lorentzenchr!
I've commited your changes with the clearer TODO commands.
import pyarrow | ||
|
||
pyarrow_version = parse_version(pyarrow.__version__) | ||
if pyarrow_version < parse_version("17.0.0"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tried this.
import pyarrow | ||
|
||
if not isinstance(key, pyarrow.BooleanArray): | ||
key = pyarrow.array(key, type=pyarrow.bool_()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a second reviewer, but I can reason you through what I did and why:
When we are in line import pyarrow
(currently 140) we already know that the user has pyarrow installed.
This is because this line can only be executed conditioned on that PYARROW_VERSION_BELOW_17
returns True
. The condition check one line above makes fixes.py
to be run and in it, we already have a try-except block evaluating import pyarrow
(and the version, which doesn't matter here).
Reference Issues/PRs
closes #31604
What does this implement/fix? Explain your changes.
This fixes a bug that appears for older
pyarrow
versions, when thefilter
method is used with other indexes than pyarrow types.See the issue description for more details.