Skip to content

Conversation

lysnikolaou
Copy link
Member

Closes #25565.


adjust_offsets(&start, &end, len1);
if (end - start < len2) {
if (end - start < static_cast<npy_int64>(len2)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat flyby, but could one write this as end < start + len2 and avoid the distracting static_cast?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that the signedness of start + len2 is implementation-specific and depends on the size of size_t. For implementations where it's a 64-bit integer, the resulting type is also unsigned, so we end up with signed-unsigned comparison again.

If the unsigned type has conversion rank greater than or equal to the rank of the signed type, then the operand with the signed type is implicitly converted to the unsigned type.

  1. the ranks of all signed integer types are different and increase with their precision: rank of signed char < rank of short < rank of int < rank of long int < rank of long long int
  2. the ranks of all signed integer types equal the ranks of the corresponding unsigned integer types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, maybe explicit is better, then...

@ngoldbaum
Copy link
Member

Looks like there's a heap buffer overflow in rstrip. Also woohoo that's the first time I've seen the compiler sanitizer job catch a bug like that.

@mhvk
Copy link
Contributor

mhvk commented Jan 11, 2024

The sanitizer actually did a really nice job on my fft via gufunc PR (#25536). Though it still took me quite a while to figure out where I was writing a byte too many! But very nice that it found it, since everything else just passed fine.

@lysnikolaou lysnikolaou force-pushed the num_codepoints-use-size_t branch 3 times, most recently from 339c6e5 to c347ca5 Compare January 12, 2024 10:59
@lysnikolaou lysnikolaou force-pushed the num_codepoints-use-size_t branch from c347ca5 to 67d932d Compare January 12, 2024 11:14
@lysnikolaou
Copy link
Member Author

The test failures are unrelated as far as I can tell.

@mhvk
Copy link
Contributor

mhvk commented Jan 12, 2024

Rerunning the pypy job just in case. Is it failing also for other PRs?

@lysnikolaou
Copy link
Member Author

I don't see any other failures, no, but I think it was a very early segfault in pypy code before the build even began. Could you maybe rerun the OpenSUSE Netlib BLAS/LAPACK one?

@mhvk
Copy link
Contributor

mhvk commented Jan 12, 2024

Ah, sorry, missed that there was another failure. Let's see what a rerun does.

@lysnikolaou
Copy link
Member Author

Yup, they're both okay now.

@ngoldbaum
Copy link
Member

There seems to be a new 60 minute build time limit we're hitting in azure, it's unrelated to this PR.

@ngoldbaum
Copy link
Member

Thanks for this!

@ngoldbaum ngoldbaum merged commit 05d3e81 into numpy:main Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MAINT: Return size_t from num_codepoints in Buffer class in string ufuncs
3 participants