proxy.golang.org: single intermittent failure of vanity URL causes stale cache #49916

@myitcv

What version of Go are you using (go version)?

$ go version
go version devel go1.18-d34051bf16 Thu Dec 2 07:04:05 2021 +0000 linux/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/myitcv/gostuff/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="devel go1.18-d34051bf16 Thu Dec 2 07:04:05 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3900600216=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Related to #34370.

Around 2021/12/01 17:54 UTC we cut a new release of CUE, v0.4.1-beta.4.

At the time, that version could not be resolved via sum.golang.org (https://sum.golang.org/lookup/cuelang.org/go@v0.4.1-beta.4) because of:

not found: cuelang.org/go@v0.4.1-beta.4: unrecognized import path "cuelang.org/go": https fetch: Get "https://cuelang.org/go?go-get=1": dial tcp 68.183.23.220:443: connect: connection refused

i.e. it appears a transient error resolving https://cuelang.org/go?go-get=1 caused sum.golang.org to cache an error.

However, sum.golang.org was not consistently returning this error (by IP):

# 172.217.169.49
not found: cuelang.org/go@v0.4.1-beta.4: unrecognized import path "cuelang.org/go": https fetch: Get "https://cuelang.org/go?go-get=1": dial tcp 68.183.23.220:443: connect: connection refused

# 142.250.187.241
8200543
cuelang.org/go v0.4.1-beta.4 h1:OQwpzifUIivXoxQK68EukVKnATP90GeelTSHnOW1qfI=
cuelang.org/go v0.4.1-beta.4/go.mod h1:P09/R4UfAEzLkV9DXxwlxQnIZbkaT4uIhiEgs6Vsz2Q=

go.sum database tree
8211598
6KTVEmHHgorOjo5YAT18Gj8CznzxCPxir+IF+SDAZUo=

— sum.golang.org Az3grmjNeN768KqG22Zo7F4Gp6T4entYI8PeYDSiWCjBsITq7p9CiDkiA62ZLw6aaG97j5rHOTesLqWZGVS836655QU=

What did you expect to see?

  • My sense from #34370 (comment), "GOSUMDB failing "410 Gone" for publicly-available tag", was that the sum.golang.org servers should now be more consistent in their responses. Is there some way the cache could be synchronised or invalidated for error responses when it is known, as was the case here, that other instances of sum.golang.org were not in error? I would therefore expect either no cache discrepancies, or at least a much shorter "bad" cache period.
  • cmd/go (which I believe is what {proxy,sum}.golang.org use behind the scenes?) failing after a single call to ?go-get=1 also seems a little unfortunate. In this case the effect was exacerbated by the subsequent caching issue, but I wonder whether some retry logic should be applied in the case of connection refused, not least because the SLA we can generally expect from vanity servers is unlikely to match that of, say, {proxy,sum}.golang.org. I would have expected us not to get into this situation had a simple retry on connection refused been attempted.

What did you see instead?

A "long" bad cache on sum.golang.org which then prevented a successful release.

cc @bcmills for cmd/go and @katiehockman for {proxy,sum}.golang.org

Thanks to @mvdan and @seankhliao for helping to debug here.


Labels

NeedsInvestigation, proxy.golang.org
