Skip to content

Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. #159197

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: gh/laithsakka/249/base
Choose a base branch
from

Conversation

laithsakka
Copy link
Contributor

@laithsakka laithsakka commented Jul 26, 2025

Stack from ghstack (oldest at bottom):

This might cause some new DDEs on call sites that do not use is_contiguous_or_false() or sym_is_contiguous()
but want to find those call sites to handle this properly by calling is_contiguous_or_false() and not is_contiguous() explitly when appropriate.
I had to fix one issue after removing the implicit size oblivious reasoning. here is context

we defined in this #157472 sym_is_contiguous to be the function computing contiguity for dynamic shapes in c++. It returns a symbolic expression that represents contiguity and guaranteed not to throw a DDE.

when people call is_contiguous we do sym_is_contiguous().guard_bool()
when people call is_contiguous_or_false we do sym_is_contiguous().guard_or_false()

one issue not handled well was this path

c10::SymBool TensorImpl::sym_is_contiguous_custom(
    at::MemoryFormat memory_format) const {
  if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
    return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
        this, memory_format);
  }

  return sym_is_contiguous_default(memory_format);
}

namely if we call sym_is_contiguous_custom but we have matches_python_custom(SizesStridesPolicy::CustomStrides) return true , then we used to call is_contiguous(this, memory_format);

This used to go through the load_pyobj_interpreter and end up calling the python is_contiguous call which used implicit size oblivious reasoning.
once we removed that implicit size oblivious reasoning, the right thing we want is to call
return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format);
otherwise we would get DDE even if the caller is doing sym_is_contiguous.

so I had to define it for pyinterpreter, and then I had to override it for nested tensors.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Jul 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159197

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit c42ba31 with merge base 316c188 (image):

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

laithsakka added a commit that referenced this pull request Jul 26, 2025
@laithsakka laithsakka marked this pull request as draft July 26, 2025 05:23
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 27, 2025
@pytorch-bot pytorch-bot bot added the release notes: fx release notes category label Jul 27, 2025
@laithsakka laithsakka added the keep-going Don't stop on first failure, keep running tests until the end label Jul 28, 2025
This might cause some new DDEs on call sites that do not use is_contiguous_or_false
but want to find those call sites to handle this properly by calling  is_contiguous_or_false and not is_contiguous

[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
laithsakka added a commit that referenced this pull request Jul 29, 2025
is_nested_int,
)

maybe_guard_or_false = guard_or_false if false_if_dde else guard_size_oblivious
maybe_guard_or_true = guard_or_true if false_if_dde else guard_size_oblivious
def eval_eager(x):
Copy link
Contributor Author

@laithsakka laithsakka Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we just do

if false_if_dde:
   return torch.ops.aten.sym_is_contiguous.default(input).guard_or_false here?

make debugging harder, but make impl simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make debugging harder? It early exits the false case and turns all the following guard_or_false/true below to just a simple "if", doesn't it?
(Same answer for is_channels_last_contiguous_2d)

[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
@laithsakka laithsakka changed the title remove default gso from normal contiguity checks remove default guard size oblivuous from normal contiguity checks and add aten.sym_is_contiguous. Jul 29, 2025
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
@laithsakka laithsakka changed the title remove default guard size oblivuous from normal contiguity checks and add aten.sym_is_contiguous. remove default guard size oblivuous from normal contiguity python checks and add aten.sym_is_contiguous. Jul 29, 2025
@laithsakka laithsakka changed the title remove default guard size oblivuous from normal contiguity python checks and add aten.sym_is_contiguous. Remove default guard size oblivious from normal contiguity python checks and add aten.sym_is_contiguous. Jul 29, 2025
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Jul 29, 2025
@laithsakka laithsakka marked this pull request as ready for review July 29, 2025 23:13
@@ -313,7 +313,7 @@ void TensorImpl::throw_data_ptr_access_error() const {
c10::SymBool TensorImpl::sym_is_contiguous_custom(
at::MemoryFormat memory_format) const {
if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
Copy link
Contributor Author

@laithsakka laithsakka Jul 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling this causes DDE. we need to delegate to sym_is_contiguous and define it in python for those that overrides it

@laithsakka laithsakka changed the title Remove default guard size oblivious from normal contiguity python checks and add aten.sym_is_contiguous. Remove guard_size_oblivious from default contiguity check python check, and add aten.sym_is_contiguous. Jul 29, 2025
@laithsakka laithsakka requested a review from zou3519 July 30, 2025 00:45
@laithsakka laithsakka changed the title Remove guard_size_oblivious from default contiguity check python check, and add aten.sym_is_contiguous. Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. Jul 30, 2025
Comment on lines +5499 to +5504
- func: sym_is_contiguous(Tensor self, MemoryFormat memory_format=contiguous_format) -> SymBool
variants: function
device_check: NoCheck
device_guard: False
tags: core
manual_cpp_binding: True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator seems fine to me. But:

  1. Why are there changes to pyinterpreter?
  2. Do you have more context over the how this solves the problem? It seems like there should have been a design somewhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok some context before this PR i will add that to the summary.

we defined in this PR sym_is_contiguous to be the function computing contiguity for dynamic shapes. It returns a symbolic expression that represents contiguity and guaranteed not to throw a DDE.

when people call is_contiguous we do sym_is_contiguous().guard_bool()
when people call is_contiguous_or_false we do sym_is_contiguous().guard_or_false()
one issue not handled well was this path

c10::SymBool TensorImpl::sym_is_contiguous_custom(
    at::MemoryFormat memory_format) const {
  if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
    return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
        this, memory_format);
  }

  return sym_is_contiguous_default(memory_format);
}

namely if we call sym_is_contiguous_custom but we have matches_python_custom(SizesStridesPolicy::CustomStrides) return true , then we used to call is_contiguous(this, memory_format);

This used to go through the load_pyobj_interpreter and end up calling the python is_contiguous call which used implicit size oblivious reasoning.
once we removed that implicit size oblivious reasoning, the right thing we want is to call
return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format);
otherwise we would get DDE even if the caller is doing sym_is_contiguous.

so I had to define it for pyinterpreter, and then I had to override it for nested tensors.
make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laithsakka I think you made the summary better but maybe it could be better yet - maybe liberally steal from the doc I shared w/ you.
@zou3519 Does Laith's summary update address your concerns? This looks generally fine to me but I don't want to approve without checking for your input first.

@@ -100,6 +100,7 @@
"sym_size",
"sym_stride",
"sym_storage_offset",
"sym_is_contiguous",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep in alphabetical order

@@ -209,6 +209,7 @@
"aten::subtract_.Tensor",
"aten::svd.U",
"aten::sym_size.int",
"aten::sym_is_contiguous",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep in alpha order

@@ -1527,7 +1527,6 @@ def __call__(
frame_state: dict[str, Union[int, FrameStateSizeEntry]],
) -> ConvertFrameReturn:
assert frame_state is not None

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary change

@@ -406,7 +406,7 @@ def is_channels_last_contiguous_or_false_3d(a: Tensor) -> bool:


# similar to is_contiguous_for_memory_format but return false on data dependency.
def contiguous_for_memory_format_or_false( # type: ignore[return]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about BC? I assume not since _prims_common starts with an underscore - but worth pointing out...

@@ -82,6 +82,8 @@ struct ConcretePyInterpreterVTable final

bool is_contiguous(const c10::TensorImpl* self, at::MemoryFormat)
const override;
c10::SymBool sym_is_contiguous(const c10::TensorImpl* self, at::MemoryFormat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move down to be with the other sym_ functions?

@register_dispatch_func([torch.ops.aten.is_contiguous])
@register_dispatch_func(
[torch.ops.aten.is_contiguous, torch.ops.aten.sym_is_contiguous]
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why this wasn't just an infinite recursive call - but I think what's happening is that when the dispatcher is calling this it temporarily removes this from the dispatch options so the recursive call will then call .default instead of .maskedtensor.

(nit: I'm sure it doesn't matter but below instead of reconstructing the args doing return func(*args, **kwargs) would be faster...)

@@ -239,9 +239,20 @@ def __repr__(self): # type: ignore[override]
grad_fn_str = (
f", requires_grad={self.requires_grad}" if self.requires_grad else ""
)

def is_contiguous_or_false():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would make this a _is_contiguous_or_false member. As it is it has to allocate a new lambda everytime it hits this...


# If created from narrow() check for lengths
if inp.lengths() is not None:
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a quick comment why having lengths implies !contiguous

is_nested_int,
)

maybe_guard_or_false = guard_or_false if false_if_dde else guard_size_oblivious
maybe_guard_or_true = guard_or_true if false_if_dde else guard_size_oblivious
def eval_eager(x):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make debugging harder? It early exits the false case and turns all the following guard_or_false/true below to just a simple "if", doesn't it?
(Same answer for is_channels_last_contiguous_2d)

Comment on lines +5499 to +5504
- func: sym_is_contiguous(Tensor self, MemoryFormat memory_format=contiguous_format) -> SymBool
variants: function
device_check: NoCheck
device_guard: False
tags: core
manual_cpp_binding: True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laithsakka I think you made the summary better but maybe it could be better yet - maybe liberally steal from the doc I shared w/ you.
@zou3519 Does Laith's summary update address your concerns? This looks generally fine to me but I don't want to approve without checking for your input first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/inductor keep-going Don't stop on first failure, keep running tests until the end module: dynamo release notes: fx release notes category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants