[go1.20] Repeat go1.19 strings.Repeat for go1.20 #1324
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Overriding
strings.Repeat
with go1.19's implementation because it is significantly faster than the go1.20 implementation and JS's String.replace gives different results.The CI will still fail for go1.20 but now it will fail faster. This is part of #1270
Full Explanation
Problem with go1.20 repeat
In the go1.20 implementation of
strings.Repeat
the function was changed to use chunks with a limited size because:Unfortunately, that change makes the GopherJS run some tests significantly slower. One run of
encoding/pem TestCVE202224675
took 6224.85s (about 1.7 hours) to finish! That test checks the fix for CVE-2022-24675 which deals with a "stack overflow via a large amount of PEM data". To get that stack overflow, the test starts withthat creates a 120MB string. With the change to go1.20's
strings.Repeat
limiting the chunk size to 8KB, meaning the inner loop will loop about 15,000 times.In the go1.19's implementation of
strings.Repeat
there is no limit so the string is doubled in the loop (exponential growth). This means the inner loop will loop only 24 times. The number of concatenations and loops makes a huge difference in JS, where as the chunk size doesn't. One run ofTestCVE202224675
took 24.68s. That's means the go1.19 implementation is 250 times faster than (takes 0.4% the amount of time as) the go1.20 implementation in GopherJS.Problem with JS repeat
Since we had to roll back the go1.20 implementation of
strings.Repeat
to go1.19 anyway, I thought I'd see if the JSString.repeat
would be faster. I triedThis does appear to be slightly faster, but not significantly faster, than go1.19. One run of
TestCVE202224675
took 19.42s, meaning it is 1.27 times faster than (take 80% the amount of time as) the go1.19 implementation.Unfortunately, the JS's implementation returns slightly different results. A few tests fail using JS's implementation. For example
hash/adler32 TestGolden
fails 8 of it's scenarios, specifically those usingstrings.Repeat
. One failed scenario creates the input withThe resulting string from JS has 11,097 characters (not 5,549 like expected). The bytes in the string are
[195 191 195 191 ... 195 191 195 191 56]
. The repeating phrase195 191
is because JS converts"\xFF"
into the UTF-8 representation of "ÿ" ("\xC3\xBF"). Go treats strings as if they are pre-escaped for UTF-8 and simply repeats the bytes as they are, so the result is 5,549 characters with the byes[255 255 ... 255 255 56]
.I didn't check if there was a way to get JS to repeat without doing the additional UTF-8 escaping. I took this as an indication that we should simply roll back the go1.20 implementation to the go1.19 implementation to start with. Further investigation into a JS implementation can be done later.