Why my Go 
program is slow? 
GoCon 2014 autumn, INADA Naoki
INADA Naoki 
• @methane 
• KLab Inc. 
• Python 
• mysqlclient, PyMySQL, msgpack
And I’m a gopher 
• I love Go. It’s more Pythonic than Python! 
• Explicit is better than implicit. 
• Simple is better than complex. 
• Readability counts. 
• There should be one-- and preferably only 
one --obvious way to do it. 
• Although that way may not be obvious at 
first unless you're Dutch (Gopher).
Today 
• CPU profiling with pprof 
• Random notes about Go’s performance. 
• What I hope Go 1.5~ will have
CPU profiling with 
pprof
runtime/pprof 
• Sampling profiler. 
• net/http/pprof provides an HTTP API for 
pprof 
• github.com/davecheney/profile provides an 
easy API for CLI programs.
NOTE for Mac 
• Go’s CPU profiling does not work on Mac 
• Russ Cox (rsc) explains why: 
http://research.swtch.com/macpprof 
• And he provides a kernel patch: 
http://godoc.org/code.google.com/p/rsc/cmd/pprof_mac_fix 
• Activity Monitor or Instruments provide 
another sampling profiler
pprof command 
• Included in Google perftools 
• Up to Go 1.3, Go bundles the CLI (a Perl script) 
• Go 1.4 reimplements it in Go!
Embed net/http/pprof 
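package main 

import ( 
	"net/http" 
	_ "net/http/pprof" // registers the /debug/pprof/ handlers on the default mux 
) 

func main() { 
	// Serve the pprof endpoints in the background. 
	go http.ListenAndServe(":5000", nil) 
	// The rest of the program runs here; main must not return right away. 
}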
Let’s use it 
• top 
• web (svg) 
• list, disasm, and weblist
Random notes 
about Go’s performance
Things that make Go slower 
• GC 
• memcpy 
• function call
GC 
• GODEBUG=gctrace=1 
• heap profile with pprof 
• sync.Pool 
• Provide a size hint for slices and maps. 
• make([]int, 0, hint) 
• make(map[int]int, hint) 
• GOGC=400
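The GC-related items above can be sketched together in one small program. This is an illustration under assumptions, not code from the talk: bufPool, render, squares, and index are hypothetical names, and debug.SetGCPercent(400) is used as the in-program counterpart of setting GOGC=400 in the environment.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/debug"
	"sync"
)

// bufPool recycles bytes.Buffer values so a hot path produces
// less garbage for the collector.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render formats a greeting using a pooled buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold old data
	defer bufPool.Put(buf)

	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies, so reuse of buf is safe
}

// squares passes the known length as the capacity hint, so append
// never has to regrow the backing array.
func squares(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// index builds a map with a size hint taken from the input length.
func index(keys []int) map[int]int {
	m := make(map[int]int, len(keys))
	for i, k := range keys {
		m[k] = i
	}
	return m
}

func main() {
	// Same effect as GOGC=400: trade memory for fewer collections.
	old := debug.SetGCPercent(400)
	fmt.Println("previous GC percent:", old)

	fmt.Println(render("gopher"))
	s := squares(4)
	fmt.Println(s, len(s), cap(s))
	fmt.Println(index(s))
}
```

Running the program with GODEBUG=gctrace=1 set in the environment prints one line per collection, which makes it possible to check whether such changes reduce GC activity.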
memcpy 
• Choose carefully between string and []byte 
• Especially for constants
function call 
• Go’s calling convention is much slower than C’s. 
• Arguments are not passed in registers 
• All registers are volatile (caller-saved) 
• Hook for the runtime. 
• But Go has inlining 
• A small leaf function may be inlined 
• A non-inlined function may not have the hook, either.
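One way to observe the cost of a real call is to time the same small leaf function with and without inlining. The sketch below is an assumption-laden illustration and is not the author's playground code: the function names are hypothetical, and it relies on the //go:noinline compiler directive, which is available in current Go toolchains but may not exist in the Go release the talk describes.

```go
package main

import (
	"fmt"
	"time"
)

// addInline is a small leaf function, so the compiler may inline it.
func addInline(a, b int) int { return a + b }

// addCall has the same body, but the directive below forbids inlining,
// so every use is a real function call.
//
//go:noinline
func addCall(a, b int) int { return a + b }

// sumInline adds 0..n-1 using the inlinable function.
func sumInline(n int) int {
	s := 0
	for i := 0; i < n; i++ {
		s = addInline(s, i)
	}
	return s
}

// sumCall adds 0..n-1 using the function that is never inlined.
func sumCall(n int) int {
	s := 0
	for i := 0; i < n; i++ {
		s = addCall(s, i)
	}
	return s
}

func main() {
	const n = 100000000

	start := time.Now()
	a := sumInline(n)
	inlined := time.Since(start)

	start = time.Now()
	b := sumCall(n)
	called := time.Since(start)

	// Timings depend on the machine and the Go version.
	fmt.Println("same result:", a == b)
	fmt.Println("inlined:", inlined, "not inlined:", called)
}
```

Building with go build -gcflags=-m makes the compiler report which functions it decided to inline.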
compare 
• calls1.go 
http://play.golang.org/p/N8oz-eyFII 
• calls2.go 
http://play.golang.org/p/s_Uv0vWirZ 
13022001ns vs 10698225ns
What I hope Go 1.5~ will 
have 
• Concurrent GC is coming in Go 1.5! 
• But more speed is welcome anytime 
• fastcall for leaf functions? 
• pass by register 
• volatile register
