Skip to content

Conversation

whyvineet
Copy link
Contributor

Copy link

@Alvaro-Kothe Alvaro-Kothe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solves the boolean case, but it will have the same problems for the NA, nan and NaT and perhaps other edge cases.

I think that its best for one of the pandas core members decide on how non-string level and Index.name should be handled.

@@ -2084,7 +2084,7 @@ def _validate_index_level(self, level) -> None:
verification must be done like in MultiIndex.

"""
if isinstance(level, int):
if type(level) is int:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a simple edge case where the Index.name is also an int

Suggested change
if type(level) is int:
if type(level) is int:
if isinstance(self.name, int) and level == self.name:
return

raise KeyError(
f"Requested level ({level}) does not match index name ({self.name})"
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is a better verification that validates what is expected from the documentation.

Suggested change
elif isinstance(level, str) and isinstance(self.name, str) and level != self.name:
raise KeyError(
f"Requested level ({level}) does not match index name ({self.name})"
)

@@ -2084,7 +2084,10 @@ def _validate_index_level(self, level) -> None:
verification must be done like in MultiIndex.

"""
if isinstance(level, int):
if type(level) is int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lib.is_integer

@whyvineet
Copy link
Contributor Author

@jbrockmendel Can you explain why these checks are constantly failing? Am I overlooking something?

@@ -16,6 +17,7 @@
import pandas._libs.pandas_parser # isort: skip # type: ignore[reportUnusedImport]
import pandas._libs.pandas_datetime # noqa: F401 # isort: skip # type: ignore[reportUnusedImport]
from pandas._libs.interval import Interval
from pandas._libs.missing import NA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems this change is unrelated to the rest of the PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, actually, I was trying something and forgot to undo this change. It's fixed in the latest commit.

if isinstance(level, int):
# Explicitly raise for missing/null values to match pandas convention
# Also reject all NA-like values (np.nan, pd.NA, pd.NaT, etc.)
if isna(level):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if index.name is a na-like scalar?

)

# Reject booleans unless the index name is actually a boolean and matches
if isinstance(level, bool):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isnt this handled in the existing elif level != self.name check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense... I'll keep the same.

@jbrockmendel
Copy link
Member

@jbrockmendel Can you explain why these checks are constantly failing? Am I overlooking something?

It's really common to have level=name=None, which is getting caught by the isna(level) check you added.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Index.get_level_values() does not handle boolean, pd.NA, np.nan, or pd.NaT level correctly
3 participants