Skip to content

compiler/natives/src/reflect: Optimize Swapper by swapping JS array elements directly (bypass reflection). #785

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 6, 2018

Conversation

lologarithm
Copy link
Contributor

Sort package was very slow and reflect when transpiled is bloated.

Before natives were added

gopherjs test sort --run=Slice --bench=Slice
goos: linux
goarch: js
BenchmarkSortString1K_Slice 	     100	  14450000 ns/op
BenchmarkStableInt1K_Slice  	      20	  55000000 ns/op
BenchmarkSortInt64K_Slice   	       1	1486000000 ns/op
PASS
ok  	sort	4.913s

After natives were added

gopherjs test sort --run=Slice --bench=Slice -v
goos: linux
goarch: js
BenchmarkSortString1K_Slice 	    2000	    984000 ns/op
BenchmarkStableInt1K_Slice  	    2000	    993500 ns/op
BenchmarkSortInt64K_Slice   	      20	  84300000 ns/op
PASS
ok  	sort	6.442s

Result is around a one order of magnitude faster (and smaller generated code when using sort!)

@bign8
Copy link

bign8 commented Apr 2, 2018

I was thinking of adding the following native override to sort as well, mind if I add it to your PR?

// +build js

package sort

import "github.com/gopherjs/gopherjs/js"

// Strings sorts a slice of strings in increasing order.
func Strings(a []string) {
	js.InternalObject(a).Get("$array").Call("sort")
}

@bign8
Copy link

bign8 commented Apr 3, 2018

Nevermind, that breaks a bunch of non-sort tests. I'll look into it and make another PR later.

@myitcv
Copy link
Member

myitcv commented Apr 3, 2018

Looks 👍to me @lologarithm!

@bign8 - I don't think that will work because in Go world strings are UTF-8 and in Javascript world they are UTF-16. But string sorting will benefit generally from this change by the looks of things.

Copy link
Member

@hajimehoshi hajimehoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me :-)

@shurcooL wdyt?

@lologarithm
Copy link
Contributor Author

One thing about this bothers me though. The 'slice.go' file in the stdlib imports reflect. With these changes the reflect import is no longer needed. However, because the import is still in the AST during the compile, we transpile the reflect package as well. This adds ~150kb of extra code that isn't needed.

I wonder if there is a way to determine if any dependencies are no longer needed after applying the natives. Maybe search the package AST for any remaining references to each import in the stdlib? Might increase compilation time a little but could result in reduced "binary size".

Any thoughts?

@dmitshur
Copy link
Member

dmitshur commented Apr 4, 2018

Hi @lologarithm,

Thanks a lot for sending this PR. This is an incredible performance improvement!

We'll definitely want to merge this. But there are some minor issues and opportunities to improve that I'd like to share.

There is one issue of correctness. The implementation, as is, breaks the sort.Slice contract. Specifically, the documentation of sort.Slice promises that:

The function panics if the provided interface is not a slice.

That currently doesn't happen. This program did not panic in my testing:

package main

import "sort"

func main() {
	sort.Slice("hello", nil)
	println("still here")
}

I think we can make this PR even better (and it resolves the point you've mentioned above about reflect no longer being needed). Here's how.

The absolute majority of the performance improvement comes from the improved implementation of reflect.Swapper, one that bypasses the overhead of GopherJS reflect machinery and swaps the underlying slice elements directly with JavaScript. The rest of the performance improvement (e.g., getting the slice length via Get("$length").Int()) is comparatively negligible.

We can factor out the reflect.Swapper improvement into reflect package itself:

diff --git a/compiler/natives/src/reflect/swapper.go b/compiler/natives/src/reflect/swapper.go
index b163fab..4d4af03 100644
--- a/compiler/natives/src/reflect/swapper.go
+++ b/compiler/natives/src/reflect/swapper.go
@@ -2,6 +2,8 @@
 
 package reflect
 
+import "github.com/gopherjs/gopherjs/js"
+
 func Swapper(slice interface{}) func(i, j int) {
 	v := ValueOf(slice)
 	if v.Kind() != Slice {
@@ -18,12 +20,10 @@ func Swapper(slice interface{}) func(i, j int) {
 			}
 		}
 	}
-	tmp := New(v.Type().Elem()).Elem()
+	s := js.InternalObject(slice).Get("$array")
 	return func(i, j int) {
-		v1 := v.Index(i)
-		v2 := v.Index(j)
-		tmp.Set(v1)
-		v1.Set(v2)
-		v2.Set(tmp)
+		tmp := s.Index(i)
+		s.SetIndex(i, s.Index(j))
+		s.SetIndex(j, tmp)
 	}
 }

That way, sort doesn't have to be modified at all, and it will correctly panic on non-slice input. It also benefits everything that uses reflect.Swapper, not just those 3 functions in sort. I've benchmarked the result, and it's equally fast.

Here's the benchmark comparison of current vs this PR as is:

benchmark                       old ns/op      new ns/op     delta
BenchmarkSortString1K_Slice     17310000       510000        -97.05%
BenchmarkStableInt1K_Slice      84900000       793000        -99.07%
BenchmarkSortInt64K_Slice       2241000000     37200000      -98.34%

Here's the benchmark comparison of current vs my proposed change:

benchmark                       old ns/op      new ns/op     delta
BenchmarkSortString1K_Slice     17310000       531000        -96.93%
BenchmarkStableInt1K_Slice      84900000       788000        -99.07%
BenchmarkSortInt64K_Slice       2241000000     36566666      -98.37%

@lologarithm What do you think?

@lologarithm
Copy link
Contributor Author

I like it! Should I open a new PR or just pull those changes in here and remove the changes to slice.go?

@dmitshur
Copy link
Member

dmitshur commented Apr 4, 2018

It'd be fine to modify this existing PR. Just push as new commits (it'll all be squashed when merging) so the PR history is preserved.

I've realized that we'll likely need to add a bound check to the swapper func, similar to one in Go stdlib:

if uint(i) >= uint(s.Len) || uint(j) >= uint(s.Len) {
	panic("reflect: slice index out of range")
}

Lemme see how much it affects performance...

@dmitshur
Copy link
Member

dmitshur commented Apr 4, 2018

You have to be careful not to call v.Len() inside the swapper func, because that's expensive:

$ benchcmp before.txt after.txt 
benchmark                       old ns/op     new ns/op     delta
BenchmarkSortString1K_Slice     528000        3826666       +624.75%
BenchmarkStableInt1K_Slice      823000        12430000      +1410.33%
BenchmarkSortInt64K_Slice       37300000      335000000     +798.12%

But after factoring out that call (the passed slice length doesn't change), it's not expensive at all:

$ benchcmp before.txt after2.txt 
benchmark                       old ns/op     new ns/op     delta
BenchmarkSortString1K_Slice     528000        535500        +1.42%
BenchmarkStableInt1K_Slice      823000        812500        -1.28%
BenchmarkSortInt64K_Slice       37300000      37400000      +0.27%

This was the final version I tested:

// +build js

package reflect

import "github.com/gopherjs/gopherjs/js"

func Swapper(slice interface{}) func(i, j int) {
	v := ValueOf(slice)
	if v.Kind() != Slice {
		panic(&ValueError{Method: "Swapper", Kind: v.Kind()})
	}
	// Fast path for slices of size 0 and 1. Nothing to swap.
	vLen := uint(v.Len())
	switch vLen {
	case 0:
		return func(i, j int) { panic("reflect: slice index out of range") }
	case 1:
		return func(i, j int) {
			if i != 0 || j != 0 {
				panic("reflect: slice index out of range")
			}
		}
	}
	s := js.InternalObject(slice).Get("$array")
	return func(i, j int) {
		if uint(i) >= vLen || uint(j) >= vLen {
			panic("reflect: slice index out of range")
		}
		tmp := s.Index(i)
		s.SetIndex(i, s.Index(j))
		s.SetIndex(j, tmp)
	}
}

@lologarithm
Copy link
Contributor Author

I benched around 1-5% slowdown which makes this still quite a net positive

@lologarithm
Copy link
Contributor Author

lologarithm commented Apr 4, 2018

here is what i was benchmarking:

// +build js

package reflect

import "github.com/gopherjs/gopherjs/js"

func Swapper(slice interface{}) func(i, j int) {
	v := ValueOf(slice)
	if v.Kind() != Slice {
		panic(&ValueError{Method: "Swapper", Kind: v.Kind()})
	}
	// Fast path for slices of size 0 and 1. Nothing to swap.
	switch v.Len() {
	case 0:
		return func(i, j int) { panic("reflect: slice index out of range") }
	case 1:
		return func(i, j int) {
			if i != 0 || j != 0 {
				panic("reflect: slice index out of range")
			}
		}
	}
	s := js.InternalObject(slice).Get("$array")
	l := uint(js.InternalObject(slice).Get("$length").Int())
	return func(i, j int) {
		if uint(i) >= l || uint(j) >= l {
			panic("reflect: slice index out of range")
		}
		tmp := s.Index(i)
		s.SetIndex(i, s.Index(j))
		s.SetIndex(j, tmp)
	}
}

@lologarithm
Copy link
Contributor Author

lologarithm commented Apr 4, 2018

I think yours is better -- one less call to size. ill update

EDIT: On another thought -- is the reflect "Len" function better than directly accessing the internals here? It seems slower as i am trying to bench (JS is so finicky with performance benchmarking)

EDIT EDIT: Ok, nevermind, its about the same and .Len() is probably better overall here. I will use that.

@lologarithm
Copy link
Contributor Author

rebased to squash all the commits down

@dmitshur
Copy link
Member

dmitshur commented Apr 4, 2018

rebased to squash all the commits down

I wish you hadn't, it'd be nice to be able to access the history of the PR (unfortunately, unlike Gerrit, GitHub PRs don't track anything past force pushes). But it's okay.

@lologarithm
Copy link
Contributor Author

lologarithm commented Apr 4, 2018

ohh, thats what you meant by squash during merge. well, we didn't lose much other than adding and deleting the slice.go file

(I am used to a lot of open source projects that ask you to squash to single commit)

@lologarithm
Copy link
Contributor Author

Anything else you want me to add to this? (also, the history of this convo on github will still be around if people want context for the changes)

Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Thanks a lot again!

@dmitshur
Copy link
Member

dmitshur commented Apr 4, 2018

@hajimehoshi Can you PTAL, since the code has changed?

Copy link
Member

@hajimehoshi hajimehoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, simpler fix with the almost same effect. lgtm

Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a bug.

A slice is a "view" onto a backing array. It has an offset and a length. We're correctly taking the slice length into account, but not the offset. The following program produces incorrect result after this patch:

package main

import (
	"fmt"
	"log"
	"reflect"
	"sort"
)

func main() {
	a := [...]int{5, 4, 3, 2, 1}
	s := a[1:4]
	sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
	fmt.Println(a)
	if !reflect.DeepEqual(a, [...]int{5, 2, 3, 4, 1}) {
		log.Fatalln("not equal")
	}
}
$ go run main.go 
[5 2 3 4 1]
$ gopherjs run main.go 
[4 3 5 2 1]
2018/04/05 14:58:41 not equal

@dmitshur
Copy link
Member

dmitshur commented Apr 5, 2018

The fix should be fairly simple. We just take the slice's $offset value into account:

diff --git a/compiler/natives/src/reflect/swapper.go b/compiler/natives/src/reflect/swapper.go
index 1066032..a94f796 100644
--- a/compiler/natives/src/reflect/swapper.go
+++ b/compiler/natives/src/reflect/swapper.go
@@ -21,13 +21,16 @@ func Swapper(slice interface{}) func(i, j int) {
 			}
 		}
 	}
-	s := js.InternalObject(slice).Get("$array")
+	a := js.InternalObject(slice).Get("$array")
+	off := js.InternalObject(slice).Get("$offset").Int()
 	return func(i, j int) {
 		if uint(i) >= vLen || uint(j) >= vLen {
 			panic("reflect: slice index out of range")
 		}
-		tmp := s.Index(i)
-		s.SetIndex(i, s.Index(j))
-		s.SetIndex(j, tmp)
+		i += off
+		j += off
+		tmp := a.Index(i)
+		a.SetIndex(i, a.Index(j))
+		a.SetIndex(j, tmp)
 	}
 }

Performance is unchanged.

When fixing this, @lologarithm, can you please add the program above as a test? You can put it in github.com/gopherjs/gopherjs/tests package.

@dmitshur dmitshur requested a review from hajimehoshi April 5, 2018 19:22
@@ -18,12 +21,13 @@ func Swapper(slice interface{}) func(i, j int) {
}
}
}
tmp := New(v.Type().Elem()).Elem()
s := js.InternalObject(slice).Get("$array")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also rename s to a, since it's an array and not a slice.

@dmitshur dmitshur changed the title Added sort package natives compiler/natives/src/reflect: Optimize Swapper by swapping JS array elements directly (bypass reflection). Apr 5, 2018
@lologarithm
Copy link
Contributor Author

lologarithm commented Apr 6, 2018

I added a new test file but im not totally clear on how those tests are run. What I have now passes the test when running gopherjs test --run=TestSort

Do I need to change the run.go at all? Or are the other tests just run with 'gopherjs test'?

And thanks for updating the title to reflect the changed code!

@dmitshur
Copy link
Member

dmitshur commented Apr 6, 2018

github.com/gopherjs/gopherjs/tests is a normal Go package. You can run its tests with GopherJS:

gopherjs test github.com/gopherjs/gopherjs/tests

It already gets tested in CI. See this line.

Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don’t mind, minor test simplification suggestion.

a := [...]int{5, 4, 3, 2, 1}
s := a[1:4]
sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
if !reflect.DeepEqual(a, [...]int{5, 2, 3, 4, 1}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are arrays, we can actually compare them with == rather than reflect.DeepEqual. It’s simpler.

if a != [...]int{5, 2, 3, 4, 1} {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. fixing

Copy link
Member

@dmitshur dmitshur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you again.

s := a[1:4]
sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
if a != [...]int{5, 2, 3, 4, 1} {
t.Fatalf("not equal")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Fatalf can be replaced with just t.Fatal, since there's no formatting being used here.

sort.Slice(s, func(i, j int) bool { return s[i] < s[j] })
if a != [...]int{5, 2, 3, 4, 1} {
t.Fatalf("not equal")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding more test cases when sorting a[:], a[0:0] and specifying cap (e.g. a[1:4:5])?

Copy link
Member

@dmitshur dmitshur Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package sort tests (in stdlib) already cover the a[:] scenario, see here.

a[1:4:5] is equivalent to a[1:4]. Did you mean a[1:4:4]?

@lologarithm
Copy link
Contributor Author

Added two cases -- @hajimehoshi I used 'a[1:4:4]'

Copy link
Member

@hajimehoshi hajimehoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Formatting isn't being used, so t.Fatal suffices.

Add period at end of sentences for consistency.
@lologarithm
Copy link
Contributor Author

Oops sorry I forgot that... Thanks for fixing shurcooL

@dmitshur dmitshur merged commit 827bd48 into gopherjs:master Apr 6, 2018
@dmitshur
Copy link
Member

dmitshur commented Apr 6, 2018

No problem. Thanks again, this was an incredible speedup!

@lologarithm lologarithm deleted the sort_natives branch April 6, 2018 18:36
@myitcv
Copy link
Member

myitcv commented Apr 7, 2018

Nice change @lologarithm

Thanks for reviewing @hajimehoshi and @shurcooL

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants