
Fallback to contiguous layout in convolution lowering on stride mismatch #159462 #159593


Open · wants to merge 2 commits into main

Conversation

kanavgoyal898

@kanavgoyal898 kanavgoyal898 commented Jul 31, 2025

Fixes #159462

Fallback to .contiguous() Layout When require_stride_order Fails in Convolution Lowering

This PR fixes a stride validation error in the Inductor backend that occurs when a permute() is followed by a Conv1d layer. The issue arises because permute() produces a tensor with a non-standard memory layout, which breaks the stride checks Inductor performs during kernel compilation.
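To see why permute() trips the layout check, the strides can be computed by hand. This is a pure-Python sketch of the row-major stride arithmetic that real tensors carry internally; it uses the same shapes as the repro below:

```python
def contiguous_strides(shape):
    """Row-major strides: each dim's stride is the product of the sizes after it."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)

shape = (32, 100, 1)                 # the repro input below
strides = contiguous_strides(shape)  # (100, 1, 1)

# permute(0, 2, 1) reorders sizes and strides together, without copying data.
dims = (0, 2, 1)
p_shape = tuple(shape[d] for d in dims)      # (32, 1, 100)
p_strides = tuple(strides[d] for d in dims)  # (100, 1, 1)

# A contiguous tensor of the permuted shape would instead have strides
# (100, 100, 1) -- exactly the "stride 1==100 at dim=1" mismatch that
# Inductor reports later in this thread.
print(p_strides, contiguous_strides(p_shape))
```

The permuted view is thus a legal tensor, but its strides no longer match what the convolution lowering expects for that shape.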

In the aten.convolution lowering function, this patch wraps the require_stride_order(...) call in a try/except. If the stride-order check fails, it falls back to require_contiguous(...), which resolves the mismatch by ensuring a compatible memory layout.

Code:

import torch
import torch.nn as nn

import warnings
warnings.filterwarnings("ignore", message=".*TF32.*deprecated.*")
warnings.filterwarnings("ignore", message=".*Please use the new API settings.*")

class ConvModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 64, kernel_size=3, padding=1)

    def forward(self, x):
        x = x.permute(0, 2, 1)
        return self.conv(x)


model = ConvModel()
x = torch.randn(32, 100, 1, dtype=torch.float32)

def run_test(model, input, backend):
    try:
        model = torch.compile(model, backend=backend)
        output = model(*input)
        print(f"succeed on {backend}")
    except Exception as e:
        print(f"failed on {backend}", str(e))
        

run_test(model, [x], "eager")
run_test(model, [x], "aot_eager")
run_test(model, [x], "inductor")

Output:

succeed on eager
succeed on aot_eager
succeed on inductor

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben


pytorch-bot bot commented Jul 31, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159593

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 New Failure

As of commit 8810e26 with merge base a991e28 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.


This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


@RohitRathore1 RohitRathore1 left a comment


For me it's still failing. I built on top of your fix, which is this commit 850db0c......

python3
Python 3.13.5 | packaged by conda-forge | (main, Jun 16 2025, 08:27:50) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.9.0a0+git850db0c
>>> exit
(myenv) rohitrathore1@hpe10:~$ python3 test.py
succeed on eager
succeed on aot_eager
failed on inductor expected size 64==64, stride 1==100 at dim=1; expected size 100==100, stride 64==1 at dim=2
Error in op: torch.ops.aten.convolution.default
This error most often comes from a incorrect fake (aka meta) kernel for a custom op.
Use torch.library.opcheck to test your custom op.
See https://pytorch.org/docs/stable/library.html#torch.library.opcheck

@kanavgoyal898
Author

kanavgoyal898 commented Aug 1, 2025

@RohitRathore1
Thanks for checking! I just re-ran the test on my local machine using the exact same commit (850db0cb) and it passes on all three backends, including inductor. I've attached the full output below for reference.

I can't rule out a platform difference, but since the repro uses only standard PyTorch modules on CPU and passes on macOS with a clean build, it doesn't immediately appear to be platform-specific.

(venv) kanavgoyal@MacBook-Pro pytorch % python 
Python 3.13.0 (main, Oct  7 2024, 05:02:14) [Clang 16.0.0 (clang-1600.0.26.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
... import torch.nn as nn
... 
... import warnings
... warnings.filterwarnings("ignore", message=".*TF32.*deprecated.*")
... warnings.filterwarnings("ignore", message=".*Please use the new API settings.*")
... 
... class ConvModel(nn.Module):  
...     def __init__(self):  
...         super().__init__()  
...         self.conv = nn.Conv1d(1, 64, kernel_size=3, padding=1)  
...       
...     def forward(self, x):  
...         x = x.permute(0, 2, 1)
...         return self.conv(x)
... 
... 
... model = ConvModel()
... x = torch.randn(32, 100, 1, dtype=torch.float32)
... 
... def run_test(model, input, backend):
...     try:
...         model = torch.compile(model, backend=backend)
...         output = model(*input)
...         print(f"succeed on {backend}")
...     except Exception as e:
...         print(f"failed on {backend}", str(e))
...         
... 
... run_test(model, [x], "eager")
... run_test(model, [x], "aot_eager")
... run_test(model, [x], "inductor")
... 
/Users/kanavgoyal/Downloads/pytorch/torch/_dynamo/guards.py:787: RuntimeWarning: Guards may run slower on Python 3.13.0. Consider upgrading to Python 3.13.1+.
  warnings.warn(
succeed on eager
/Users/kanavgoyal/Downloads/pytorch/torch/_dynamo/guards.py:787: RuntimeWarning: Guards may run slower on Python 3.13.0. Consider upgrading to Python 3.13.1+.
  warnings.warn(
succeed on aot_eager
/Users/kanavgoyal/Downloads/pytorch/torch/_dynamo/guards.py:787: RuntimeWarning: Guards may run slower on Python 3.13.0. Consider upgrading to Python 3.13.1+.
  warnings.warn(
succeed on inductor
>>> torch.__version__
'2.9.0a0+git850db0c'
>>> torch.utils.collect_env.get_pretty_env_info()
'PyTorch version: 2.9.0a0+git850db0c\nIs debug build: False\nCUDA used to build PyTorch: None\nROCM used to build PyTorch: N/A\n\nOS: macOS 15.5 (arm64)\nGCC version: Could not collect\nClang version: 17.0.0 (clang-1700.0.13.5)\nCMake version: version 4.0.3\nLibc version: N/A\n\nPython version: 3.13.0 (main, Oct  7 2024, 05:02:14) [Clang 16.0.0 (clang-1600.0.26.3)] (64-bit runtime)\nPython platform: macOS-15.5-arm64-arm-64bit-Mach-O\nIs CUDA available: False\nCUDA runtime version: No CUDA\nCUDA_MODULE_LOADING set to: N/A\nGPU models and configuration: No CUDA\nNvidia driver version: No CUDA\ncuDNN version: No CUDA\nIs XPU available: False\nHIP runtime version: N/A\nMIOpen runtime version: N/A\nIs XNNPACK available: True\n\nCPU:\nApple M3 Pro\n\nVersions of relevant libraries:\n[pip3] numpy==2.3.2\n[pip3] optree==0.17.0\n[pip3] torch==2.9.0a0+git3967dbe\n[pip3] torch==2.9.0a0+git3967dbe\n[pip3] torch==2.9.0a0+git850db0c\n[conda] Could not collect'
>>> exit()

cc @eellison @soulitzer


pytorch-bot bot commented Aug 1, 2025

❌ 🤖 pytorchbot command failed:

@pytorchbot: error: argument command: invalid choice: 'can' (choose from 'merge', 'revert', 'rebase', 'label', 'drci', 'cherry-pick')

usage: @pytorchbot [-h] {merge,revert,rebase,label,drci,cherry-pick} ...

Try @pytorchbot --help for more info.

@janeyx99 janeyx99 requested a review from eellison August 4, 2025 23:08
@janeyx99 janeyx99 added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Aug 4, 2025
Contributor

@eellison eellison left a comment


We can't do try ... except here. We should figure out what the actual underlying issue is. Sorry, this may not be a good first issue.

Labels
module: inductor · open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Development

Successfully merging this pull request may close these issues.

Inductor Fails on Conv1D After Permute with Stride Mismatch Error
5 participants