Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. #159197

laithsakka · 2025-07-26T00:19:01Z

Stack from ghstack (oldest at bottom):

[WIP] incomplete view unabcked fix to by pass vllm issue #159626
-> Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. #159197

This might cause new DDEs at call sites that use neither is_contiguous_or_false() nor sym_is_contiguous().
We want to find those call sites and handle them properly by explicitly calling is_contiguous_or_false() instead of is_contiguous() where appropriate.
I had to fix one issue after removing the implicit size-oblivious reasoning. Here is the context.

In #157472 we defined sym_is_contiguous as the C++ function that computes contiguity for dynamic shapes. It returns a symbolic expression that represents contiguity and is guaranteed not to throw a DDE.

when people call is_contiguous we do sym_is_contiguous().guard_bool()
when people call is_contiguous_or_false we do sym_is_contiguous().guard_or_false()
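
The relationship between the two entry points can be illustrated with a small toy model. This is only a sketch: `ToySymBool` and `DataDependentError` are hypothetical stand-ins invented for this example, not the real `c10::SymBool` or PyTorch's DDE exception. An unknown (data-dependent) answer is modeled as `None`.

```python
from typing import Optional


class DataDependentError(RuntimeError):
    """Hypothetical stand-in for the DDE raised when a guard cannot be decided."""


class ToySymBool:
    """Hypothetical model of a symbolic boolean: True, False, or unknown (None)."""

    def __init__(self, value: Optional[bool]) -> None:
        self.value = value

    def guard_bool(self) -> bool:
        # Must produce a concrete answer, so it fails when the value is unknown.
        if self.value is None:
            raise DataDependentError("cannot decide contiguity")
        return self.value

    def guard_or_false(self) -> bool:
        # Falls back to False instead of raising when the value is unknown.
        return False if self.value is None else self.value


def is_contiguous(sym: ToySymBool) -> bool:
    # Mirrors "sym_is_contiguous().guard_bool()" from the description.
    return sym.guard_bool()


def is_contiguous_or_false(sym: ToySymBool) -> bool:
    # Mirrors "sym_is_contiguous().guard_or_false()" from the description.
    return sym.guard_or_false()
```

In this model, only `is_contiguous` can raise; computing the symbolic value itself never does, which is the property the description attributes to sym_is_contiguous.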

One path that was not handled well is:

c10::SymBool TensorImpl::sym_is_contiguous_custom(
    at::MemoryFormat memory_format) const {
  if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
    return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
        this, memory_format);
  }

  return sym_is_contiguous_default(memory_format);
}

Namely, if we call sym_is_contiguous_custom and matches_python_custom(SizesStridesPolicy::CustomStrides) returns true, we used to call is_contiguous(this, memory_format).

That call went through load_pyobj_interpreter and ended up in the Python is_contiguous, which used implicit size-oblivious reasoning.
Now that the implicit size-oblivious reasoning is removed, the right thing to call is
return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format);
otherwise we would get a DDE even when the caller is using sym_is_contiguous.

So I had to define it for the pyinterpreter, and then override it for nested tensors.
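
The difference between the old and new delegation can be sketched with a toy model. Everything here is hypothetical and only illustrates the control flow described above: `ToyPyInterpreter` stands in for the Python-side hooks, and `None` models an answer that cannot be decided without a guard.

```python
from typing import Optional


class DataDependentError(RuntimeError):
    """Hypothetical stand-in for a DDE."""


class ToyPyInterpreter:
    """Hypothetical stand-in for the Python hooks of a tensor with custom strides."""

    def __init__(self, known: Optional[bool]) -> None:
        self.known = known  # None models an undecidable (data-dependent) answer

    def is_contiguous(self) -> bool:
        # Eager hook: must return a plain bool, so it raises when undecidable.
        if self.known is None:
            raise DataDependentError("is_contiguous needs a concrete answer")
        return self.known

    def sym_is_contiguous(self) -> Optional[bool]:
        # Symbolic hook: hands the (possibly unknown) answer back to the caller.
        return self.known


def sym_is_contiguous_custom_old(interp: ToyPyInterpreter) -> Optional[bool]:
    # Old path: delegates to the eager hook and can raise inside a "sym" call.
    return interp.is_contiguous()


def sym_is_contiguous_custom_new(interp: ToyPyInterpreter) -> Optional[bool]:
    # New path: delegates to the symbolic hook, so the caller picks the guard.
    return interp.sym_is_contiguous()
```

With an undecidable answer, the old path raises even though the caller asked for a symbolic result, while the new path returns the unknown value and leaves the guarding decision to the caller.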

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela

[ghstack-poisoned]

pytorch-bot · 2025-07-26T00:19:04Z


✅ You can merge normally! (1 Unrelated Failure)

As of commit c42ba31 with merge base 316c188:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / linux-jammy-py3_9-clang9-xla / test (xla, 1, 1, linux.12xlarge, unstable) (gh) (#158876)
sccache: error: couldn't connect to server

This comment was automatically generated by Dr. CI and updates every 15 minutes.


laithsakka · 2025-07-29T04:41:53Z

torch/_prims_common/__init__.py

        is_nested_int,
    )

-    maybe_guard_or_false = guard_or_false if false_if_dde else guard_size_oblivious
-    maybe_guard_or_true = guard_or_true if false_if_dde else guard_size_oblivious
+    def eval_eager(x):


Shall we just do

if false_if_dde: return torch.ops.aten.sym_is_contiguous.default(input).guard_or_false() here?

It makes debugging harder, but makes the implementation simpler.

Does it make debugging harder? It early exits the false case and turns all the following guard_or_false/true below to just a simple "if", doesn't it?
(Same answer for is_channels_last_contiguous_2d)
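
For readers following this thread, the trade-off concerns how a guard policy is threaded through a row-major contiguity check. The following is a minimal sketch under stated assumptions, not the code in torch/_prims_common: the function name and the `or_false` / `or_true` parameters are hypothetical, and plain `bool` is used as the default policy so the example runs on concrete integers.

```python
from typing import Callable, Sequence


def is_contiguous_sketch(
    sizes: Sequence[int],
    strides: Sequence[int],
    or_false: Callable[[bool], bool] = bool,
    or_true: Callable[[bool], bool] = bool,
) -> bool:
    """Row-major contiguity check with a pluggable policy per comparison."""
    numel = 1
    for s in sizes:
        numel *= s
    # Tensors with fewer than two elements are trivially contiguous.
    if or_false(numel < 2):
        return True
    expected_stride = 1
    # Walk dimensions from innermost to outermost.
    for size, stride in reversed(list(zip(sizes, strides))):
        if or_false(size == 1):
            continue  # size-1 dimensions place no constraint on the stride
        if or_true(stride != expected_stride):
            return False
        expected_stride *= size
    return True
```

The early-exit alternative proposed above would replace every `or_false` / `or_true` call with a plain `if` and handle the data-dependent case once, at the top of the function.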


laithsakka · 2025-07-29T23:13:53Z

c10/core/TensorImpl.cpp

@@ -313,7 +313,7 @@ void TensorImpl::throw_data_ptr_access_error() const {
 c10::SymBool TensorImpl::sym_is_contiguous_custom(
    at::MemoryFormat memory_format) const {
  if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
-    return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(


Calling this causes a DDE. We need to delegate to sym_is_contiguous and define it in Python for the subclasses that override it.

zou3519 · 2025-07-31T14:53:42Z

aten/src/ATen/native/native_functions.yaml

+- func: sym_is_contiguous(Tensor self, MemoryFormat memory_format=contiguous_format) -> SymBool
+  variants: function
+  device_check: NoCheck
+  device_guard: False
+  tags: core
+  manual_cpp_binding: True


The operator seems fine to me. But:

Why are there changes to pyinterpreter?

Do you have more context over the how this solves the problem? It seems like there should have been a design somewhere

OK, here is some context from before this PR; I will add it to the summary.

In #157472 we defined sym_is_contiguous as the function that computes contiguity for dynamic shapes. It returns a symbolic expression that represents contiguity and is guaranteed not to throw a DDE.

When people call is_contiguous, we do sym_is_contiguous().guard_bool().
When people call is_contiguous_or_false, we do sym_is_contiguous().guard_or_false().
One path that was not handled well is:

c10::SymBool TensorImpl::sym_is_contiguous_custom(
    at::MemoryFormat memory_format) const {
  if (C10_UNLIKELY(matches_python_custom(SizesStridesPolicy::CustomStrides))) {
    return pyobj_slot_.load_pyobj_interpreter()->is_contiguous(
        this, memory_format);
  }

  return sym_is_contiguous_default(memory_format);
}

Namely, if we call sym_is_contiguous_custom and matches_python_custom(SizesStridesPolicy::CustomStrides) returns true, we used to call is_contiguous(this, memory_format).

That call went through load_pyobj_interpreter and ended up in the Python is_contiguous, which used implicit size-oblivious reasoning.
Now that the implicit size-oblivious reasoning is removed, the right thing to call is
return pyobj_slot_.load_pyobj_interpreter()->sym_is_contiguous(this, memory_format);
otherwise we would get a DDE even when the caller is using sym_is_contiguous.

So I had to define it for the pyinterpreter, and then override it for nested tensors.
Does that make sense?

@laithsakka I think you made the summary better but maybe it could be better yet - maybe liberally steal from the doc I shared w/ you.
@zou3519 Does Laith's summary update address your concerns? This looks generally fine to me but I don't want to approve without checking for your input first.

aorenste · 2025-08-12T18:48:28Z

tools/autograd/gen_python_functions.py

@@ -100,6 +100,7 @@
    "sym_size",
    "sym_stride",
    "sym_storage_offset",
+    "sym_is_contiguous",


please keep in alphabetical order

aorenste · 2025-08-12T18:50:25Z

test/functorch/test_vmap_registrations.py

@@ -209,6 +209,7 @@
    "aten::subtract_.Tensor",
    "aten::svd.U",
    "aten::sym_size.int",
+    "aten::sym_is_contiguous",


please keep in alpha order

aorenste · 2025-08-12T18:51:07Z

torch/_dynamo/convert_frame.py

@@ -1527,7 +1527,6 @@ def __call__(
        frame_state: dict[str, Union[int, FrameStateSizeEntry]],
    ) -> ConvertFrameReturn:
        assert frame_state is not None
-


unnecessary change

aorenste · 2025-08-12T18:52:40Z

torch/_prims_common/__init__.py

@@ -406,7 +406,7 @@ def is_channels_last_contiguous_or_false_3d(a: Tensor) -> bool:


 # similar to is_contiguous_for_memory_format but return false on data dependency.
-def contiguous_for_memory_format_or_false(  # type: ignore[return]


Do we need to worry about BC? I assume not since _prims_common starts with an underscore - but worth pointing out...

aorenste · 2025-08-12T18:54:05Z

torch/csrc/PyInterpreter.cpp

@@ -82,6 +82,8 @@ struct ConcretePyInterpreterVTable final

  bool is_contiguous(const c10::TensorImpl* self, at::MemoryFormat)
      const override;
+  c10::SymBool sym_is_contiguous(const c10::TensorImpl* self, at::MemoryFormat)


Maybe move down to be with the other sym_ functions?

aorenste · 2025-08-12T19:26:14Z

torch/masked/maskedtensor/_ops_refs.py

-@register_dispatch_func([torch.ops.aten.is_contiguous])
+@register_dispatch_func(
+    [torch.ops.aten.is_contiguous, torch.ops.aten.sym_is_contiguous]
+)


I was wondering why this wasn't just an infinite recursive call - but I think what's happening is that when the dispatcher is calling this it temporarily removes this from the dispatch options so the recursive call will then call .default instead of .maskedtensor.

(nit: I'm sure it doesn't matter but below instead of reconstructing the args doing return func(*args, **kwargs) would be faster...)

aorenste · 2025-08-12T19:51:57Z

torch/nested/_internal/nested_tensor.py

@@ -239,9 +239,20 @@ def __repr__(self):  # type: ignore[override]
        grad_fn_str = (
            f", requires_grad={self.requires_grad}" if self.requires_grad else ""
        )
+
+        def is_contiguous_or_false():


nit: I would make this a _is_contiguous_or_false member. As it is, it has to allocate a new lambda every time it hits this...
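
The allocation cost mentioned in this nit can be seen in a small standalone example. The class and method names below are hypothetical and unrelated to the nested tensor code; the sketch only shows that a nested `def` produces a fresh function object on each call, while a function defined on the class is created once.

```python
class WithNestedHelper:
    def describe(self):
        # A new function object is created each time describe() runs.
        def helper():
            return True

        return helper


class WithMethodHelper:
    def _helper(self):
        return True

    def describe(self):
        # The underlying function lives on the class and is created only once.
        return type(self)._helper
```

Note that accessing `self._helper` still builds a bound-method wrapper on each access, so the saving is the closure creation rather than all allocation.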

aorenste · 2025-08-12T19:56:52Z

torch/nested/_internal/ops.py

+
+    # If created from narrow() check for lengths
+    if inp.lengths() is not None:
+        return False


Maybe a quick comment why having lengths implies !contiguous
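
One plausible reading of why lengths rule out contiguity, offered here as an assumption rather than the author's stated reason: when a jagged layout stores only offsets, row i spans offsets[i]:offsets[i+1] and rows sit back to back; once separate lengths are present, a row may stop short of the next offset and leave a gap in the buffer. The helper below is hypothetical and works on plain Python lists.

```python
from typing import Optional, Sequence


def rows_are_packed(
    offsets: Sequence[int], lengths: Optional[Sequence[int]] = None
) -> bool:
    """Return True if consecutive rows cover the buffer with no gaps."""
    if lengths is None:
        # Row i spans offsets[i]:offsets[i + 1], so rows are back to back.
        return True
    # With explicit lengths, row i spans offsets[i]:offsets[i] + lengths[i].
    return all(
        offsets[i] + lengths[i] == offsets[i + 1] for i in range(len(lengths))
    )
```

The diff above is more conservative than this sketch: it returns False whenever lengths exist, without checking whether the rows happen to be packed.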

aorenste · 2025-08-12T20:02:23Z

torch/_prims_common/__init__.py

        is_nested_int,
    )

-    maybe_guard_or_false = guard_or_false if false_if_dde else guard_size_oblivious
-    maybe_guard_or_true = guard_or_true if false_if_dde else guard_size_oblivious
+    def eval_eager(x):


Does it make debugging harder? It early exits the false case and turns all the following guard_or_false/true below to just a simple "if", doesn't it?
(Same answer for is_channels_last_contiguous_2d)

aorenste · 2025-08-12T20:06:04Z

aten/src/ATen/native/native_functions.yaml

+- func: sym_is_contiguous(Tensor self, MemoryFormat memory_format=contiguous_format) -> SymBool
+  variants: function
+  device_check: NoCheck
+  device_guard: False
+  tags: core
+  manual_cpp_binding: True


@laithsakka I think you made the summary better but maybe it could be better yet - maybe liberally steal from the doc I shared w/ you.
@zou3519 Does Laith's summary update address your concerns? This looks generally fine to me but I don't want to approve without checking for your input first.

Update

7560e7d

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 26, 2025

remove default gso from normal contiguity checks

e6f6706

ghstack-source-id: c127523 Pull-Request: #159197

laithsakka requested review from bobrenjc93, pianpwk and ColinPeppler July 26, 2025 00:20

laithsakka marked this pull request as draft July 26, 2025 05:23

Update

16ed811

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 27, 2025

remove default gso from normal contiguity checks

2f0045e

ghstack-source-id: 1944f65 Pull-Request: #159197

pytorch-bot bot added the release notes: fx release notes category label Jul 27, 2025

laithsakka added the keep-going Don't stop on first failure, keep running tests until the end label Jul 28, 2025

Update on "remove default gso from normal contiguity checks"

e179f3b

This might cause some new DDEs on call sites that do not use is_contiguous_or_false but want to find those call sites to handle this properly by calling is_contiguous_or_false and not is_contiguous [ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

949d2e7

ghstack-source-id: 6c9ccfd Pull-Request: #159197

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

c3dfa64

ghstack-source-id: 6c9ccfd Pull-Request: #159197

laithsakka commented Jul 29, 2025


Update

9f43dac

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

eaa4146

ghstack-source-id: 5a576e4 Pull-Request: #159197

laithsakka changed the title ~~remove default gso from normal contiguity checks~~ remove default guard size oblivuous from normal contiguity checks and add aten.sym_is_contiguous. Jul 29, 2025

Update

ef7728b

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

2a34f47

ghstack-source-id: 0ce96bf Pull-Request: #159197

Update

72f0bd9

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

0a58860

ghstack-source-id: ce29072 Pull-Request: #159197

pytorch-bot bot added ciflow/inductor module: dynamo labels Jul 29, 2025

Update

afc1833

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

2e67f01

ghstack-source-id: ed945bf Pull-Request: #159197

Update

2271a34

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

02c444c

ghstack-source-id: 23c5d2c Pull-Request: #159197

laithsakka changed the title ~~remove default guard size oblivuous from normal contiguity checks and add aten.sym_is_contiguous.~~ remove default guard size oblivuous from normal contiguity python checks and add aten.sym_is_contiguous. Jul 29, 2025

laithsakka changed the title ~~remove default guard size oblivuous from normal contiguity python checks and add aten.sym_is_contiguous.~~ Remove default guard size oblivious from normal contiguity python checks and add aten.sym_is_contiguous. Jul 29, 2025

Update

ddc60e7

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

2dd852c

ghstack-source-id: cd16d5d Pull-Request: #159197

Update

c42ba31

[ghstack-poisoned]

laithsakka added a commit that referenced this pull request Jul 29, 2025

remove default gso from normal contiguity checks

2055f5e

ghstack-source-id: 1e0842b Pull-Request: #159197

laithsakka marked this pull request as ready for review July 29, 2025 23:13

laithsakka requested review from albanD and soulitzer as code owners July 29, 2025 23:13

laithsakka commented Jul 29, 2025


laithsakka changed the title ~~Remove default guard size oblivious from normal contiguity python checks and add aten.sym_is_contiguous.~~ Remove guard_size_oblivious from default contiguity check python check, and add aten.sym_is_contiguous. Jul 29, 2025

laithsakka requested a review from zou3519 July 30, 2025 00:45

laithsakka changed the title ~~Remove guard_size_oblivious from default contiguity check python check, and add aten.sym_is_contiguous.~~ Remove guard_size_oblivious from default contiguity python check, and add aten.sym_is_contiguous. Jul 30, 2025

zou3519 reviewed Jul 31, 2025


laithsakka requested a review from zou3519 July 31, 2025 19:17

laithsakka mentioned this pull request Aug 1, 2025

[WIP] incomplete view unabcked fix to by pass vllm issue #159626

Draft

laithsakka requested review from ezyang and aorenste August 1, 2025 16:03

laithsakka requested a review from jansel August 9, 2025 17:28

aorenste reviewed Aug 12, 2025
