Skip to content

BUG: sinc: fix underflow for float16 #27784

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 3, 2024
Merged

BUG: sinc: fix underflow for float16 #27784

merged 12 commits into from
Dec 3, 2024

Conversation

lucascolley
Copy link
Contributor

While reviewing data-apis/array-api-extra#20, @jakirkham suggested these small improvements to the implementation of sinc.

If the change from 1.0e-20 to np.finfo(x.dtype).smallest_normal is classed as breaking, we can leave that out.

@lucascolley
Copy link
Contributor Author

Ah, the NumPy version accepts integer input, so we can't just use finfo. Maybe easiest to just leave the fill value as is.

@jakirkham
Copy link
Contributor

Seems like we should promote integers to float first. The end result from sinc will be a float

@lucascolley lucascolley force-pushed the sinc branch 2 times, most recently from b5938d6 to ff413a2 Compare November 18, 2024 12:35
Copy link
Contributor

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Lucas! 🙏

FWIW this seems reasonable to me 🙂

@lucascolley
Copy link
Contributor Author

Multiplying by np.pi promotes lower precision dtypes to np.float64 anyway, so I removed the type promotion and now use the smallest normal of np.float64 for the fill value.

Copy link
Contributor

@jakevdp jakevdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth adding some new tests, (particularly for np.sinc(np.float16(0))) but the new implementation looks good to me!

Co-authored-by: jakirkham <jakirkham@gmail.com>
@lucascolley lucascolley force-pushed the sinc branch 2 times, most recently from fc87f1e to 1cc3faa Compare November 22, 2024 00:12
@lucascolley lucascolley changed the title MAINT: sinc: minor improvements BUG: sinc: fix underflow for float16 Nov 22, 2024
Co-authored-by: Jake Vanderplas <jakevdp@gmail.com>
Co-authored-by: jakirkham <jakirkham@gmail.com>
@charris charris changed the title BUG: sinc: fix underflow for float16 BUG: sinc: fix underflow for float16 Nov 22, 2024
Co-authored-by: jakirkham <jakirkham@gmail.com>
@lucascolley
Copy link
Contributor Author

finally green 😁

@jakirkham
Copy link
Contributor

Thanks Lucas and Jake! 🙏

FWIW LGTM

@seberg
Copy link
Member

seberg commented Nov 23, 2024

Thanks everyone!

If the change from 1.0e-20 to np.finfo(x.dtype).smallest_normal is classed as breaking

Can you say what this change changes? Either way, it seems to me like this should get a release note, and if that can briefly explain that change, then it will be easy to decide.

(I suspect it's fine. There was also an old PR that I never looked at enough, that moved sinc to C.)

@lucascolley
Copy link
Contributor Author

lucascolley commented Nov 23, 2024

If the change from 1.0e-20 to np.finfo(x.dtype).smallest_normal is classed as breaking

Can you say what this change changes? Either way, it seems to me like this should get a release note, and if that can briefly explain that change, then it will be easy to decide.

For float16, it fixes an underflow bug. For higher precision dtypes, it just makes the calculation more precise (better approximation to $0$). I don't think it should need a release note, unless we need to take into account people relying on less precise results?

@seberg
Copy link
Member

seberg commented Nov 23, 2024

OK, fair. It doesn't actually improve any precision anyway, it really just avoids the invalid value for exact 0./0., and the float16 result was obviously wrong before.

@lucascolley
Copy link
Contributor Author

It doesn't actually improve any precision anyway

Ah true, I suppose 1e20 is enough to get exactly 1.0 for higher precision.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, but I think we need to do one more iteration unfortunately...

Additionally, to the above below comments/thoughts, can you please include longdouble in your test, since I am pretty sure it will fail on some architectures (not sure we have that in CI, though).

@@ -3827,7 +3827,8 @@ def sinc(x):

"""
x = np.asanyarray(x)
y = pi * where(x == 0, 1.0e-20, x)
x = x.astype(np.float64) if np.isdtype(x.dtype, ('bool', 'integral')) else x
Copy link
Member

@seberg seberg Nov 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest just writing:

x = pi * x

Maybe we can even continue with (or where if you like):

y = sin(x)
x[x == 0] = 1  # or where, or copyto
return y / x

That way, we only guess at 1, which seems like a smaller guess? Although, I didn't quite see through whether spelling it may be slower/require more memory.

I was briefly considering using np.nextafter(..., 1) for the filling value, which fixes the issue of custom dtypes like ml_dtypes.bfloat16.
(But, we should likely np.finfo(...).smallest_normal, even without custom dtypes, it is currently slow and actually currently not defined for all longdoubles; although, both should probably be changed.)

Now, it turns out that bfloat16 currently returns float64 in the where (I am not quite sure why, I thought I fixed that!?).
But, I think it still fixes the problem, although it will change the result for ml_dtypes.bfloat64 to be float32 rather than float64 I believe (which may be a good thing, though).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was silly of course, we would have to copy things twice...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to think about it, maybe we can just fix it up later, although smallest_normal has some issues that make it very slow to use right now (and it may indeed fail).
finfo().eps works for all NumPy dtypes it seems, so maybe that is the better choice (if we use it for NumPy dtypes only anyway).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seberg, sorry, I'm not sure what the conclusion is here as to how this PR should change - do you have a concrete suggestion?

Copy link
Member

@seberg seberg Nov 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would lean towards fixing the multiplication to remove the isdtype. Then use dtype.kind == "f" (assume that means finfo works) and go with finfo(dtype).eps because that seems to work as well, and (whether a thing that should be fixed or not) is currently vastly faster and won't have problems with longdouble on double-double architectures.

But, you need to sanity check that thought of course and maybe you have a better idea.

EDIT: If dtype != "f" you would just keep guessing at the old value for compatibility probably).

@lucascolley
Copy link
Contributor Author

looks like we might not need to use dtype=="f" @seberg ?

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Not sure that the 1e-20 for "not a float" is great, but unfortunately, maybe better than nothing.

Let's give it a shot. If anyone has worries about that fallback, I'll be happy to follow up.

@seberg
Copy link
Member

seberg commented Dec 3, 2024

I'll ignore that internal compiler error in circleCI... thanks again @lucascolley and everyone!

@seberg seberg merged commit 1356272 into numpy:main Dec 3, 2024
67 of 69 checks passed
@lucascolley lucascolley deleted the sinc branch December 3, 2024 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants