Skip to content

Conversation

m-clare
Copy link
Contributor

@m-clare m-clare commented Jul 31, 2025

This PR adds back in UBSAN within CI, with jobs against ubuntu-latest and macos-latest and corresponding suppression files for each architecture. (fixes #24209)

The suppression files are grouped by check type, and include suggestions for fixing behavior at a later date.

Most suppression entries are file specific, apart from the random module in arm64. random has many Indirect call of a function through a function pointer of the wrong type errors, and will require further investigation.

@charris charris changed the title BUG: Add architecture specific suppression files for UBSAN BUG: Add UBSAN to CI checks Jul 31, 2025
@charris
Copy link
Member

charris commented Jul 31, 2025

Looks good at first view. I can't think of a better place to add the suppression files.

@ngoldbaum
Copy link
Member

ngoldbaum commented Aug 1, 2025

This is awesome! I think this is a good idea and I don't see any issues in the new github actions configuration. After this is merged we should probably open a tracking issue for any followups for any suppressions we think are real bugs that should be looked at closer.

One idea: maybe we can consolidate the two UBSAN jobs and one ASAN job we'd have if this PR is merged into two jobs that both simultaneously run ASAN and UBSAN. It looks like meson supports this: https://mesonbuild.com/Builtin-options.html#base-options

IMO given the different suppressions you had to write on linux and macos, I think it is worth testing both but that does mean adding at least one new CI job.

I asked Ralf to look at this if he has time since he's good at spotting issues in CI configuration.

@m-clare
Copy link
Contributor Author

m-clare commented Aug 2, 2025

@ngoldbaum went ahead and consolidated job clang_ubsan and clang_asan into clang_asan_ubsan. Can back out into separate jobs if requested again.

Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saw a few issues

@m-clare m-clare requested a review from ngoldbaum August 4, 2025 19:33
@ngoldbaum
Copy link
Member

ngoldbaum commented Aug 4, 2025

OK, I think this is looking good now.

My only reservation is that the new CI job takes 36 minutes to complete. Compare that with the less than 10 minutes the MacOS sanitizer job needs.

I’m not sure if a slow CI job that is mostly redundant with the Mac sanitizer job is worth adding. It does test code that doesn’t run on macs, but I’m not sure how much of that there is.

Before @mclare does anything else here I’d appreciate it if one or more of @seberg @charris or @rgommers could weigh in about how to handle the new CI job.

@charris
Copy link
Member

charris commented Aug 4, 2025

I'd be happy to run these sorts of lengthy tests on a weekly basis if there were a good way to insure someone notices if there is a problem. Not sure how well the tests identify the bad code. Running on PRs does have the advantage of narrowing down the problematic code. Apart from that, it would be nice to know precisely what this offers in the way of code coverage and completeness that TSAN doesn't. They are both clang based, correct?

@ngoldbaum
Copy link
Member

ngoldbaum commented Aug 5, 2025

Apart from that, it would be nice to know precisely what this offers in the way of code coverage and completeness that TSAN doesn't. They are both clang based, correct?

TSAN, ASAN, and UBSAN are all sanitizers in the clang and gcc compiler sanitizer suites. Each suite of sanitizers implements different checks. TSAN is for thread safety checks, ASAN is for memory safety, and UBSAN is for C undefined behavior checks.

All three are useful to run on PRs because they have disjoint sets of checks.

What's not clear on this PR is if it's worth adding checks using clang on x86_64 ubuntu as well as on ARM64 MacOS. Since we're doing both ASAN and UBSAN on both sets of checks, the only reason for running both is for checking OS-specific code for memory safety and undefined behavior.

@ngoldbaum
Copy link
Member

See the clang docs for mire details. There are lots more sanitizers that we haven't even tried to use yet.

@rgommers rgommers changed the title BUG: Add UBSAN to CI checks CI: Add UBSAN CI jobs for macOS arm64 and Linux x86-64 Aug 5, 2025
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, largely LGTM, thanks @m-clare.

I’m not sure if a slow CI job that is mostly redundant with the Mac sanitizer job is worth adding. It does test code that doesn’t run on macs, but I’m not sure how much of that there is.

The suppressions files for the two platforms look different enough that I think it's worth having both.

I'd be happy to run these sorts of lengthy tests on a weekly basis if there were a good way to insure someone notices if there is a problem.

There isn't really. I think we should run this on every PR, at least for a while until we have a better sense of how much it catches. For this job, building takes longer than testing, we can optimize that fairly easily with ccache. We have room to do so, our GHA caches are under the limit and ccache will help much more here than on that macOS job it's now used in: https://github.com/numpy/numpy/actions/caches.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thanks!

I'll leave the merge to @ngoldbaum. I suggest going ahead now and dealing with caching if the runtime of this job becomes annoying.

Copy link
Member

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright let's bring this in. @m-clare might you be interested in opening a followup issue that summarizes any followup needed to fix UB that we want to fix and is suppressed here? Otherwise I'll add it to my backlog of tasks...

Thanks so much for pushing on this, preventing new UB from showing up in NumPy's C/C++ code under testing will be extremely valuable.

@ngoldbaum ngoldbaum merged commit 8a153fa into numpy:main Aug 6, 2025
77 checks passed
@m-clare
Copy link
Contributor Author

m-clare commented Aug 6, 2025

@ngoldbaum yeah I can work on that tonight, I'll set up an issue with a checklist and mirror some of the language in the suppressions themselves.

@m-clare
Copy link
Contributor Author

m-clare commented Aug 7, 2025

@ngoldbaum see here: #29524

@ngoldbaum
Copy link
Member

Thanks so much! You've left things in a position where people can work on these issues one at a time, which is perfect for sprints and other drive-by contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: undefined behavior detected by ubsan in sanitizer CI runs
4 participants