Skip to content

Collection of branchless algorithms from Hacker's Delight. #5202

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 28, 2025

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented Jun 21, 2025

As well as some related improvements.

While we're there, we also normalize the shape of overflow checks in Math.xExact methods.

@sjrd sjrd requested a review from gzm0 June 21, 2025 15:13
@sjrd sjrd force-pushed the hackers-delight-branchless-magic branch from dfd5b59 to 2f23075 Compare June 24, 2025 11:33
@sjrd sjrd force-pushed the hackers-delight-branchless-magic branch from 2f23075 to 0d0b69f Compare July 9, 2025 13:34
Copy link
Contributor

@gzm0 gzm0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple of minor suggestions. Mostly on comments.

@@ -765,6 +765,10 @@ object RuntimeLong {

@inline
def clz(a: RuntimeLong): Int = {
/* Warning to the next adventurer to come here: the best branchless
* algorithm I found was worse than the naive implementation here.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add how it was worse? Performance wise? Or number of operations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance-wise. I added that to the comment.

@@ -765,6 +765,10 @@ object RuntimeLong {

@inline
def clz(a: RuntimeLong): Int = {
/* Warning to the next adventurer to come here: the best branchless
* algorithm I found was worse than the naive implementation here.
* The algorithm was `val hiz = nlz(hi); hiz + ((hiz << 26 >> 31) & nlz(lo))`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* The algorithm was `val hiz = nlz(hi); hiz + ((hiz << 26 >> 31) & nlz(lo))`.
* The algorithm was `val hiz = clz(hi); hiz + ((hiz << 26 >> 31) & clz(lo))`.

Seems we always abbreviate that way? Only the JDK calls it numberOfLeadingZeros.

@inline def signum(i: scala.Int): scala.Int =
if (i == 0) 0 else if (i < 0) -1 else 1
@inline def signum(i: scala.Int): scala.Int = {
// Hacker's Delight, Section 2-7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Hacker's Delight, Section 2-7
// Hacker's Delight, Section 2-8

? (I'm looking at second edition: https://doc.lagout.org/security/Hackers%20Delight.pdf)

if (hi < 0) -1
else if (hi == 0 && i.toInt == 0) 0
else 1
/* Hacker's Delight, Section 2-7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Hacker's Delight, Section 2-7
/* Hacker's Delight, Section 2-8

else throw new ArithmeticException("Long overflow")
if (((a ^ b) & (res ^ a)) < 0L)
longOverflow()
res
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand how the overflow checks for addition / subtraction up to here in the file relate to Hacker's Delight. But I understand that they essentially do the same as before, just with bitwise operations. (so no additional comments required IMO, but maybe remove the hacker's delight comments?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I elaborated the comments to point more specifically to the formulas we use. But you're right, it's basically applying De Morgan to the negation of our previous formulas.

@inline
def toIntExact(a: scala.Long): scala.Int = {
val res = a.toInt
if (res.toLong != a)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment why this is better than checking against min / max value?

I realize the previous code had branches, but it seems they could easily be avoided (or do we not have a non short circuiting boolean and?).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment. The long comparisons require 2 int comparisons each. With the new code, we only need 1 int comparison.

class MathTestOnJDK11 {

@noinline
private def hideFromOptimizer(x: Int): Int = x

@Test def multiplyExactLongInt(): Unit = {
for (n <- Seq(Long.MinValue, -1L, 0L, 1L, Long.MaxValue)) {
val nInt =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, duplicate the loop and have the lhs / rhs cases separate?

As well as some related improvements.

While we're there, we also normalize the shape of overflow checks
in `Math.xExact` methods.
@sjrd sjrd force-pushed the hackers-delight-branchless-magic branch from 0d0b69f to b11b0ec Compare July 28, 2025 10:07
Copy link
Member Author

@sjrd sjrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased to address a benign conflict. Otherwise I only changed comments (and the test) to address the review.

@@ -765,6 +765,10 @@ object RuntimeLong {

@inline
def clz(a: RuntimeLong): Int = {
/* Warning to the next adventurer to come here: the best branchless
* algorithm I found was worse than the naive implementation here.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance-wise. I added that to the comment.

@inline
def toIntExact(a: scala.Long): scala.Int = {
val res = a.toInt
if (res.toLong != a)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment. The long comparisons require 2 int comparisons each. With the new code, we only need 1 int comparison.

else throw new ArithmeticException("Long overflow")
if (((a ^ b) & (res ^ a)) < 0L)
longOverflow()
res
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I elaborated the comments to point more specifically to the formulas we use. But you're right, it's basically applying De Morgan to the negation of our previous formulas.

@sjrd sjrd enabled auto-merge July 28, 2025 10:22
@sjrd sjrd merged commit 07ae33a into scala-js:main Jul 28, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants