Skip to content

Alignment of n-dimensional indexes with partially excluded dims #10293

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

benbovy
Copy link
Member

@benbovy benbovy commented May 7, 2025

Support checking exact alignment of indexes that use multiple dimensions in the cases where some of those dimensions are included in alignment while others are excluded.

Add the exclude_dims keyword argument to Index.equals() (still support old signature with FutureWarning).

Also fixed a bug: indexes associated with scalar coordinates were ignored during alignment.

TODO:

  • add exclude_dims kwarg to CoordinateTransform.equals() as well
  • for join='exact' copy or assign the index of each original object in the new aligned objects instead of assigning the aligned index (when copying this may have an impact on performance).

Support checking exact alignment of indexes that use multiple dimensions
in the cases where some of those dimensions are included in alignment
while others are excluded.

Added `exclude_dims` keyword argument to `Index.equals()` (and still
support old signature with future warning).

Also fixed bug: indexes associated with scalar coordinates were ignored
during alignment.

Added tests as well.
Comment on lines +352 to +360
@overload
def equals(self, other: Index) -> bool: ...

@overload
def equals(
self, other: Index, *, exclude_dims: frozenset[Hashable] | None = None
) -> bool: ...

def equals(self, other: Index, **kwargs) -> bool:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added those overloads so Mypy doesn't fail for 3rd-party Xarray indexes. But maybe we want Mypy to fail and thereby encourage maintainers updating their code?

Comment on lines +369 to +378
exclude_dims : frozenset of hashable, optional
All the dimensions that are excluded from alignment, or None by default
(i.e., when this method is not called in the context of alignment).
For a n-dimensional index it allows ignoring any relevant dimension found
in ``exclude_dims`` when comparing this index with the other index.
For a 1-dimensional index it can be always safely ignored as this
method is not called when all of the index's dimensions are also excluded
from alignment
(note: the index's dimensions correspond to the union of the dimensions
of all coordinate variables associated with this index).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably be improved a bit.

@benbovy benbovy requested a review from dcherian May 7, 2025 12:18
@benbovy benbovy changed the title Alignment of n-dimensional indexes with partial excluded dimensions Alignment of n-dimensional indexes with partially excluded dims May 7, 2025
@benbovy benbovy mentioned this pull request May 7, 2025
6 tasks
Comment on lines +370 to +378
All the dimensions that are excluded from alignment, or None by default
(i.e., when this method is not called in the context of alignment).
For a n-dimensional index it allows ignoring any relevant dimension found
in ``exclude_dims`` when comparing this index with the other index.
For a 1-dimensional index it can be always safely ignored as this
method is not called when all of the index's dimensions are also excluded
from alignment
(note: the index's dimensions correspond to the union of the dimensions
of all coordinate variables associated with this index).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
All the dimensions that are excluded from alignment, or None by default
(i.e., when this method is not called in the context of alignment).
For a n-dimensional index it allows ignoring any relevant dimension found
in ``exclude_dims`` when comparing this index with the other index.
For a 1-dimensional index it can be always safely ignored as this
method is not called when all of the index's dimensions are also excluded
from alignment
(note: the index's dimensions correspond to the union of the dimensions
of all coordinate variables associated with this index).
Dimensions excluded from checking. It is None by default,
(i.e., when this method is not called in the context of alignment).
For a n-dimensional index this option allows an Index to optionally,
ignore any dimension in ``exclude_dims`` when comparing
``self`` with ``other``. For a 1-dimensional index this kwarg can be safely ignored ,
as this method is not called when all of the index's dimensions are also excluded
from alignment (note: the index's dimensions correspond to the union of the dimensions
of all coordinate variables associated with this index).

f"the signature ``{index_cls_name}.equals(self, other)`` is deprecated. "
f"Please update it to "
f"``{index_cls_name}.equals(self, other, *, exclude_dims=None)`` "
"or kindly ask the maintainers doing it. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"or kindly ask the maintainers doing it. "
"or kindly ask the maintainers of ``{index_cls_name}`` to do it. "

otherwise they might come to us ;)

@@ -216,30 +216,37 @@ def _normalize_indexes(

normalized_indexes = {}
Copy link
Contributor

@dcherian dcherian May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment describing what "normalize" means here please? it would also be good to describe the purpose of each loop with a block comment. I can't figure out what's going on here

@@ -863,7 +882,7 @@ def sel(

return IndexSelResult({self.dim: indexer})

def equals(self, other: Index):
def equals(self, other: Index, *, exclude_dims: frozenset[Hashable] | None = None):
Copy link
Contributor

@dcherian dcherian May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def equals(self, other: Index, *, exclude_dims: frozenset[Hashable] | None = None):
def equals(self, other: Index, *, exclude: frozenset[Hashable] | None = None):

align calls this exclude. Shall we keep that name for consistency?

exclude_dims = idx_all_dims & self.exclude_dims
if exclude_dims == idx_all_dims:
continue
elif exclude_dims and self.join != "exact":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble deciding if any of the other cases make any sense. Seems OK to remove anything that requires actual reindexing, which leaves "override" and I'm not sure about that. Let's have the error message recommend the user to open an issue if they feel it should work.

Copy link
Contributor

@dcherian dcherian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems OK to me, but is definitely hard to reason about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support alignment with partial overlap between index vs. excluded dimensions
2 participants