Skip to content

WiP Optimize RuntimeLong division and remainder. #5190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented Jun 4, 2025

TODO: Check the case where the divisor is >= 2^62.
TODO: Write down the proof.

sjrd added 3 commits June 3, 2025 09:47
Hacker's Delight offers branchless formulas for double-word
additions and subtraction, without access to the machine's carry
bit.

Although the new formula contains more elementary instructions,
removal of the branch is significant. When one operand is
constant, folding reduces to 3 bitwise operations to compute the
carry, which is very fast. When both operands are variable, then
the carry is often 50/50 unpredictable, which means the branch
is unpredictable, and removing it is worth the full 5 bitwise
operations anyway.

Negation does not need a special code path anymore. Regular
folding of the `0L - b` formula yields optimal, branchless code
anyway. We remove `RuntimeLong.neg` and its code paths in the
optimizer and emitter. While we're there, we also remove
`RuntimeLong.not`, since the regular code paths for `-1L ^ b`
fold in the same way.

This change reduces execution time of the `sha512` benchmark by a
whopping 25-30%.
TODO: Check the case where the divisor is >= 2^62.
TODO: Write down the proof.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant