Optimize `str_casecmp` length check using pointer end
#14163
+28
−3
This change refactors the final length check in `str_casecmp` to use existing pointers instead of recalculating the string lengths.

Currently, after the comparison loop finishes, `str_casecmp` uses `RSTRING_LEN` to check the lengths of `str1` and `str2`. This requires an additional calculation.

This PR replaces the `RSTRING_LEN` calls with a check against the `p1` and `p2` pointers and their respective end pointers (`p1end` and `p2end`). Since these pointers are already advanced during the comparison loop, this approach avoids redundant length calculations and slightly improves performance.

The new logic is:

- If both pointers have reached their ends (`p1 == p1end && p2 == p2end`), the strings are equal in length, returning `0`
- If `p1` has reached its end but `p2` has not (`p1 == p1end`), `str1` is shorter, returning `-1`
- Otherwise, `p2` has reached its end but `p1` has not, so `str2` is shorter, returning `1`
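The tail check described above can be sketched in standalone C as follows. This is an illustrative sketch, not the code from `string.c`: the function name `casecmp_sketch` and its `(pointer, length)` parameters are hypothetical, plain `int` return values stand in for Ruby's boxed integers, and `tolower` from `<ctype.h>` stands in for whatever case folding the real function uses.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for str_casecmp's ASCII path; names and
 * signature are assumptions made for illustration only. */
static int
casecmp_sketch(const char *s1, size_t len1, const char *s2, size_t len2)
{
    const char *p1 = s1, *p1end = s1 + len1;
    const char *p2 = s2, *p2end = s2 + len2;

    /* Compare byte by byte, ignoring ASCII case, until one side runs out. */
    while (p1 < p1end && p2 < p2end) {
        if (*p1 != *p2) {
            int c1 = tolower((unsigned char)*p1);
            int c2 = tolower((unsigned char)*p2);
            if (c1 != c2) return c1 < c2 ? -1 : 1;
        }
        p1++;
        p2++;
    }

    /* Tail check using the already-advanced pointers instead of lengths. */
    if (p1 == p1end && p2 == p2end) return 0;  /* both exhausted: equal length */
    if (p1 == p1end) return -1;                /* only the first string ended */
    return 1;                                  /* only the second string ended */
}
```

Because the loop exits only when at least one pointer has reached its end, the three branches cover every case, and the final `return 1` needs no explicit condition.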
I expect a 3-5% performance increase for `str_casecmp`, based on the average of 5 benchmark runs on my Apple M1.
Benchmarks against master branch:
casecmp-1, casecmp-10, casecmp-100, casecmp-1000, casecmp-1000vs10, casecmp-nonascii1, casecmp-nonascii10, casecmp-nonascii100, casecmp-nonascii1000, casecmp-nonascii1000vs10

Benchmarks against current branch:
casecmp-1, casecmp-10, casecmp-100, casecmp-1000, casecmp-1000vs10, casecmp-nonascii1, casecmp-nonascii10, casecmp-nonascii100, casecmp-nonascii1000, casecmp-nonascii1000vs10