-
Notifications
You must be signed in to change notification settings - Fork 570
compiler/natives/src/reflect: Optimize Swapper by swapping JS array elements directly (bypass reflection). #785
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
db946f2
to
55e1502
Compare
I was thinking of adding the following native override to sort as well, mind if I add it to your PR? // +build js
package sort
import "github.com/gopherjs/gopherjs/js"
// Strings sorts a slice of strings in increasing order.
func Strings(a []string) {
js.InternalObject(a).Get("$array").Call("sort")
} |
Nevermind, that breaks a bunch of non-sort tests. I'll look into it and make another PR later. |
Looks 👍to me @lologarithm! @bign8 - I don't think that will work because in Go world strings are UTF-8 and in Javascript world they are UTF-16. But string sorting will benefit generally from this change by the looks of things. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me :-)
@shurcooL wdyt?
One thing about this bothers me though. The 'slice.go' file in the stdlib imports reflect. With these changes the reflect import is no longer needed. However, because the import is still in the AST during the compile, we transpile the reflect package as well. This adds ~150kb of extra code that isn't needed. I wonder if there is a way to determine if any dependencies are no longer needed after applying the natives. Maybe search the package AST for any remaining references to each import in the stdlib? Might increase compilation time a little but could result in reduced "binary size". Any thoughts? |
Hi @lologarithm, Thanks a lot for sending this PR. This is an incredible performance improvement! We'll definitely want to merge this. But there are some minor issues and opportunities to improve that I'd like to share. There is one issue of correctness. The implementation, as is, breaks the
That currently doesn't happen. This program did not panic in my testing: package main
import "sort"
func main() {
sort.Slice("hello", nil)
println("still here")
} I think we can make this PR even better (and it resolves the point you've mentioned above about The absolute majority of the performance improvement comes from the improved implementation of We can factor out the diff --git a/compiler/natives/src/reflect/swapper.go b/compiler/natives/src/reflect/swapper.go
index b163fab..4d4af03 100644
--- a/compiler/natives/src/reflect/swapper.go
+++ b/compiler/natives/src/reflect/swapper.go
@@ -2,6 +2,8 @@
package reflect
+import "github.com/gopherjs/gopherjs/js"
+
func Swapper(slice interface{}) func(i, j int) {
v := ValueOf(slice)
if v.Kind() != Slice {
@@ -18,12 +20,10 @@ func Swapper(slice interface{}) func(i, j int) {
}
}
}
- tmp := New(v.Type().Elem()).Elem()
+ s := js.InternalObject(slice).Get("$array")
return func(i, j int) {
- v1 := v.Index(i)
- v2 := v.Index(j)
- tmp.Set(v1)
- v1.Set(v2)
- v2.Set(tmp)
+ tmp := s.Index(i)
+ s.SetIndex(i, s.Index(j))
+ s.SetIndex(j, tmp)
}
} That way, Here's the benchmark comparison of current vs this PR as is:
Here's the benchmark comparison of current vs my proposed change:
@lologarithm What do you think? |
I like it! Should I open a new PR or just pull those changes in here and remove the changes to slice.go? |
It'd be fine to modify this existing PR. Just push as new commits (it'll all be squashed when merging) so the PR history is preserved. I've realized that we'll likely need to add a bound check to the swapper func, similar to one in Go stdlib: if uint(i) >= uint(s.Len) || uint(j) >= uint(s.Len) {
panic("reflect: slice index out of range")
} Lemme see how much it affects performance... |
You have to be careful not to call
But after factoring out that call (the passed slice length doesn't change), it's not expensive at all:
This was the final version I tested: // +build js
package reflect
import "github.com/gopherjs/gopherjs/js"
func Swapper(slice interface{}) func(i, j int) {
v := ValueOf(slice)
if v.Kind() != Slice {
panic(&ValueError{Method: "Swapper", Kind: v.Kind()})
}
// Fast path for slices of size 0 and 1. Nothing to swap.
vLen := uint(v.Len())
switch vLen {
case 0:
return func(i, j int) { panic("reflect: slice index out of range") }
case 1:
return func(i, j int) {
if i != 0 || j != 0 {
panic("reflect: slice index out of range")
}
}
}
s := js.InternalObject(slice).Get("$array")
return func(i, j int) {
if uint(i) >= vLen || uint(j) >= vLen {
panic("reflect: slice index out of range")
}
tmp := s.Index(i)
s.SetIndex(i, s.Index(j))
s.SetIndex(j, tmp)
}
} |
I benched around 1-5% slowdown which makes this still quite a net positive |
here is what i was benchmarking: // +build js
package reflect
import "github.com/gopherjs/gopherjs/js"
func Swapper(slice interface{}) func(i, j int) {
v := ValueOf(slice)
if v.Kind() != Slice {
panic(&ValueError{Method: "Swapper", Kind: v.Kind()})
}
// Fast path for slices of size 0 and 1. Nothing to swap.
switch v.Len() {
case 0:
return func(i, j int) { panic("reflect: slice index out of range") }
case 1:
return func(i, j int) {
if i != 0 || j != 0 {
panic("reflect: slice index out of range")
}
}
}
s := js.InternalObject(slice).Get("$array")
l := uint(js.InternalObject(slice).Get("$length").Int())
return func(i, j int) {
if uint(i) >= l || uint(j) >= l {
panic("reflect: slice index out of range")
}
tmp := s.Index(i)
s.SetIndex(i, s.Index(j))
s.SetIndex(j, tmp)
}
} |
I think yours is better -- one less call to size. ill update EDIT: On another thought -- is the reflect "Len" function better than directly accessing the internals here? It seems slower as i am trying to bench (JS is so finicky with performance benchmarking) EDIT EDIT: Ok, nevermind, its about the same and .Len() is probably better overall here. I will use that. |
b6e556e
to
ffbd327
Compare
rebased to squash all the commits down |
I wish you hadn't, it'd be nice to be able to access the history of the PR (unfortunately, unlike Gerrit, GitHub PRs don't track anything past force pushes). But it's okay. |
ohh, thats what you meant by squash during merge. well, we didn't lose much other than adding and deleting the slice.go file (I am used to a lot of open source projects that ask you to squash to single commit) |
Anything else you want me to add to this? (also, the history of this convo on github will still be around if people want context for the changes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thanks a lot again!
@hajimehoshi Can you PTAL, since the code has changed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, simpler fix with the almost same effect. lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a bug.
A slice is a "view" onto a backing array. It has an offset and a length. We're correctly taking the slice length into account, but not the offset. The following program produces incorrect result after this patch:
package main
import (
"fmt"
"log"
"reflect"
"sort"
)
func main() {
a := [...]int{5, 4, 3, 2, 1}
s := a[1:4]
sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
fmt.Println(a)
if !reflect.DeepEqual(a, [...]int{5, 2, 3, 4, 1}) {
log.Fatalln("not equal")
}
}
$ go run main.go
[5 2 3 4 1]
$ gopherjs run main.go
[4 3 5 2 1]
2018/04/05 14:58:41 not equal
The fix should be fairly simple. We just take the slice's diff --git a/compiler/natives/src/reflect/swapper.go b/compiler/natives/src/reflect/swapper.go
index 1066032..a94f796 100644
--- a/compiler/natives/src/reflect/swapper.go
+++ b/compiler/natives/src/reflect/swapper.go
@@ -21,13 +21,16 @@ func Swapper(slice interface{}) func(i, j int) {
}
}
}
- s := js.InternalObject(slice).Get("$array")
+ a := js.InternalObject(slice).Get("$array")
+ off := js.InternalObject(slice).Get("$offset").Int()
return func(i, j int) {
if uint(i) >= vLen || uint(j) >= vLen {
panic("reflect: slice index out of range")
}
- tmp := s.Index(i)
- s.SetIndex(i, s.Index(j))
- s.SetIndex(j, tmp)
+ i += off
+ j += off
+ tmp := a.Index(i)
+ a.SetIndex(i, a.Index(j))
+ a.SetIndex(j, tmp)
}
} Performance is unchanged. When fixing this, @lologarithm, can you please add the program above as a test? You can put it in |
@@ -18,12 +21,13 @@ func Swapper(slice interface{}) func(i, j int) { | |||
} | |||
} | |||
} | |||
tmp := New(v.Type().Elem()).Elem() | |||
s := js.InternalObject(slice).Get("$array") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also rename s
to a
, since it's an array and not a slice.
I added a new test file but im not totally clear on how those tests are run. What I have now passes the test when running Do I need to change the run.go at all? Or are the other tests just run with 'gopherjs test'? And thanks for updating the title to reflect the changed code! |
It already gets tested in CI. See this line. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don’t mind, minor test simplification suggestion.
tests/sort_test.go
Outdated
a := [...]int{5, 4, 3, 2, 1} | ||
s := a[1:4] | ||
sort.Slice(s, func(i, j int) bool { return s[i] < s[j] }) | ||
if !reflect.DeepEqual(a, [...]int{5, 2, 3, 4, 1}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these are arrays, we can actually compare them with ==
rather than reflect.DeepEqual
. It’s simpler.
if a != [...]int{5, 2, 3, 4, 1} {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. fixing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you again.
tests/sort_test.go
Outdated
s := a[1:4] | ||
sort.Slice(s, func(i, j int) bool { return s[i] < s[j] }) | ||
if a != [...]int{5, 2, 3, 4, 1} { | ||
t.Fatalf("not equal") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t.Fatalf
can be replaced with just t.Fatal
, since there's no formatting being used here.
tests/sort_test.go
Outdated
sort.Slice(s, func(i, j int) bool { return s[i] < s[j] }) | ||
if a != [...]int{5, 2, 3, 4, 1} { | ||
t.Fatalf("not equal") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding more test cases when sorting a[:]
, a[0:0]
and specifying cap (e.g. a[1:4:5]
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Package sort
tests (in stdlib) already cover the a[:]
scenario, see here.
a[1:4:5]
is equivalent to a[1:4]
. Did you mean a[1:4:4]
?
Added two cases -- @hajimehoshi I used 'a[1:4:4]' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Formatting isn't being used, so t.Fatal suffices. Add period at end of sentences for consistency.
Oops sorry I forgot that... Thanks for fixing shurcooL |
No problem. Thanks again, this was an incredible speedup! |
Nice change @lologarithm Thanks for reviewing @hajimehoshi and @shurcooL |
Sort package was very slow and reflect when transpiled is bloated.
Before natives were added
After natives were added
Result is around a one order of magnitude faster (and smaller generated code when using sort!)