[NVGPU] Fix nvdsl examples #156830

castigli · 2025-09-04T08:52:37Z

This PR fixes the nvdsl examples, which had drifted out of sync because they are not tested in CI.

The fixed bugs were related to the following PRs:

- move to nanobind: [mlir python] Port Python core code to nanobind (#118583)
- split GPU module initialization: [mlir][gpu] Change GPU modules to globals (#135478)

There is one remaining bug that I think #153134 introduced. When running Ch4 and Ch5, the nvvm.prefetch tensormap intrinsic leads to the following error on sm_90a:

LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.prefetch.tensormap
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: mlir-opt before.mlir --gpu-module-to-binary
1.	Running pass 'Function Pass Manager' on module 'LLVMDialectModule'.
2.	Running pass 'NVPTX DAG->DAG Pattern Instruction Selection' on function '@gemm_multistage_kernel'
...

Perhaps @Wolfram70 or @grypp could help me out with the last bug?
Could a temporary solution be to revert to inline PTX?

github-actions · 2025-09-04T08:52:56Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

Wolfram70 · 2025-09-04T18:51:13Z

Thanks for bringing this to our attention!

I looked into this a bit, and it does look like this crash is occurring in the --gpu-module-to-binary pass.
For Ch4.py, dumping the .mlir file and extracting the LLVM IR generated during --gpu-module-to-binary gives extracted-llvmir.txt. Running llc -mtriple=nvptx64 -mcpu=sm_90a -mattr=+ptx80 on it reproduces the crash exactly, so it seems to be an issue during codegen. I am not sure why it occurs in this specific case.
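For anyone who wants to script the reproduction, here is a minimal Python sketch of the llc step described above. It is not part of the PR. The helper names (`llc_command`, `failed_intrinsic`, `reproduce`) are hypothetical, the `-o -` argument is an addition so that llc writes to stdout instead of creating a file, and it assumes an `llc` binary built with the NVPTX target is on `PATH`.

```python
import shutil
import subprocess

# Prefix of the fatal message shown in the crash log above.
SELECTION_ERROR = "LLVM ERROR: Cannot select"


def llc_command(ir_path, llc="llc"):
    """Build the llc invocation quoted in the comment (hypothetical helper)."""
    return [
        llc,
        "-mtriple=nvptx64",
        "-mcpu=sm_90a",
        "-mattr=+ptx80",
        str(ir_path),
        "-o", "-",  # assumption: send output to stdout rather than a file
    ]


def failed_intrinsic(stderr):
    """Return the intrinsic named in a 'Cannot select' line, or None.

    None is also returned when the error line names no intrinsic.
    """
    for line in stderr.splitlines():
        if line.startswith(SELECTION_ERROR):
            _, _, tail = line.partition("intrinsic ")
            return tail.strip().lstrip("%") or None
    return None


def reproduce(ir_path, llc="llc"):
    """Run llc on the extracted IR and report the unselectable intrinsic."""
    if shutil.which(llc) is None:
        raise FileNotFoundError(f"{llc} not found on PATH")
    proc = subprocess.run(llc_command(ir_path, llc),
                          capture_output=True, text=True)
    return failed_intrinsic(proc.stderr)
```

Calling `reproduce("extracted-llvmir.txt")` on an affected build should return the intrinsic name from the error line. Keeping the command construction and the log parsing as separate functions lets the parsing be checked against a saved log without an llc binary.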

@durga4github @abhilash1910 Do you have any idea why this might be happening?

abhilash1910 · 2025-09-05T03:55:59Z

Taking a look at the codegen. Thanks for highlighting. The IR does not look incorrect at first glance, though.
Edit: Fix is in progress.

fix nvdsl (e904099)

castigli changed the title ~~Fix nvdsl examples~~ [NVGPU] Fix nvdsl examples Sep 4, 2025
