Skip to content

vfsgendev updates modification times #804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
myitcv opened this issue Apr 22, 2018 · 8 comments
Open

vfsgendev updates modification times #804

myitcv opened this issue Apr 22, 2018 · 8 comments

Comments

@myitcv
Copy link
Member

myitcv commented Apr 22, 2018

@shurcooL I just got caught out using vfsgendev and I wonder whether you can help advise how best to fix this.

I forgot that post #787 a go generate is required after any changes to github.com/gopherjs/gopherjs/js. Indeed the CI build didn't give me any clues either; my test just failed for an innocuous reason.

After running go generate github.com/gopherjs/gopherjs/... I got a load of modification time changes in the two fs_vfsdata.go files in addition to the expected content change in js/js.go. Am I right in thinking this is because the modification times are taken from the files on disk?

What would seem ideal here would be that we run go generate github.com/gopherjs/gopherjs/... as one of the first steps in CI so that the subsequent diff -u <(echo -n) <(git status --porcelain) can help us catch when we've forgotten to regenerate anything. We are already running a generate step for the prelude for this exact reason.

But I think this requires some changes in vfsgendev or at least the way we are using it in GopherJS. The following repro is another way of demonstrating the issue:

cd `mktemp -d`
export GOPATH=$PWD
go get github.com/shurcooL/vfsgen/cmd/vfsgendev
go get github.com/gopherjs/gopherjs
cd src/github.com/gopherjs/gopherjs/
npm install
go generate ./...
git status --porcelain

gives the output:

 M compiler/gopherjspkg/fs_vfsdata.go
 M compiler/natives/fs_vfsdata.go

What do you think is the best way forward here?

Thanks

@dmitshur
Copy link
Member

dmitshur commented Apr 23, 2018

Yes, this is a known limitation of vfsgen, and the tracking issue for it is shurcooL/vfsgen#26.

Am I right in thinking this is because the modification times are taken from the files on disk?

That's right, they are taken from the files in the src directory on disk.

To be more specific, vfsgen uses the file modify times from the input virtual filesystem, but our natives and gopherjspkg packages use a (filtered) http.Dir, which is a direct wrapper around the OS filesystem (i.e., the "disk").

Unfortunately, git doesn't do a great job of keeping file modify times very consistent, and so as a result, regenerating those two packages can result in the modify times changing even when the file contents haven't.

shurcooL/vfsgen#26 is an issue I would very much like to find a good resolution for, but it's not high on my list of priorities. It's something I think about in the background, but can't afford to spend time looking for a solution more aggressively right now. I'd rather wait until I find a good solution than to try fix it at any cost now.

As a result, until that issue is resolved, it won't be possible to rely on go generate github.com/gopherjs/gopherjs/... being idempotent. (If it were possible, I would've added that line to the CI script a while ago. I'm aware it's not possible for the reasons you've discovered.)

What do you think is the best way forward here?

I can think of two ways at this time:

  1. Do nothing. This is not a critical issue, just a relatively minor inconvenience. From my experience, I found it easy to work around it while maintaining GopherJS the last few years. It just hasn't caused a lot of overhead so far to become a high priority for me. As a result, I worry that the time and focus taken to try to resolve this now may not actually pay off by saving enough time in the long run (see relevant xkcd 1205 and 1319).

    One of the reasons I think this is a viable and possibly optimal option is because, relatively speaking, GopherJS does not have a very high volume of changes to natives at this time. If that changes, tacking this issue would become more valuable and can be prioritized then.

  2. While it may not be viable to rely on go generate github.com/gopherjs/gopherjs/... producing zero diff to ensure that the natives and gopherjspkg packages are generated correctly, there is another way of achieving that goal that I can think of. At a high level:

    1. compute hash of natives.FS and gopherjspkg.FS, taking into
       account everything but file modify times
    2. do `go generate github.com/gopherjs/gopherjs/...`
    3. compute hash of natives.FS and gopherjspkg.FS again, and
       compare it to the hash computed in step 1
    

@myitcv
Copy link
Member Author

myitcv commented Apr 23, 2018

Unfortunately, git doesn't do a great job of keeping file modify times very consistent, and so as a result, regenerating those two packages can result in the modify times changing even when the file contents haven't.

Indeed. But the only thing we need to rely on here is the contents of those files; the modification times are irrelevant aren't they? This is something that the go build cache has moved towards and is why git does not care about file modification times.

One of the reasons I think this is a viable and possibly optimal option is because, relatively speaking, GopherJS does not have a very high volume of changes to natives at this time. If that changes, tacking this issue would become more valuable and can be prioritized then.

There are two PRs that I'm working on where this becomes relevant: #618 and #790. Not critical, just pointing out how I came across this given the change introduced in #787.

there is another way of achieving that goal that I can think of...

Yes, agreed. Whilst it would also break the isolation rule, having vfsgendev only write its output file if it has logically changed (i.e. any of the hashes of the file contents have changed) would also work. As part of this you could also choose to only write the modification times of those files whose hashes have changed. But that wouldn't be critical.

@dmitshur
Copy link
Member

dmitshur commented Apr 23, 2018

But the only thing we need to rely on here is the contents of those files; the modification times are irrelevant aren't they?

Is that the case? I'm not sure, but I suspect they might be used by GopherJS to determine staleness.

This can be investigated, and if they're really not needed, then we can solve this issue trivially by overriding all modtimes to be zero in the VFS.

This is something that the go build cache has moved towards and is why git does not care about file modification times.

The new cmd/go build cache in 1.10 is indeed very nice, but it took a lot of work to get there. GopherJS uses a much simpler and less robust modtime-based system for determining package staleness. I expect it would be a very significant undertaking to get it to cmd/go levels.

@myitcv
Copy link
Member Author

myitcv commented Apr 23, 2018

Is that the case? I'm not sure, but I suspect they might be used by GopherJS to determine staleness.

At a guess, even if it is using modification times to determine staleness, then when reading from the VFS it will always get the same answer (because I don't think we're updating the VFS at runtime and we either read from the VFS or disk). Hence I think we can zero them. But that's just a gut feel.

@dmitshur
Copy link
Member

dmitshur commented Apr 23, 2018

even if it is using modification times to determine staleness, then when reading from the VFS it will always get the same answer

Yes, it will be the same—but non-zero—answer. It would change only when you modify src contents and then regenerate natives. But that is exactly the time a package should be rebuilt, since natives contents would've changed.

@hajimehoshi
Copy link
Member

What's the current status on this? I confirmed that reproduceable build results would make us happy.

@dmitshur
Copy link
Member

dmitshur commented Apr 14, 2019

The comments above accurately reflect the current status. There hasn't been new progress made since then.

@myitcv
Copy link
Member Author

myitcv commented Apr 15, 2019

@hajimehoshi - I'm using a fork that only includes the file contents, not any time-related information: https://github.com/myitcv/gopherjs/blob/edcf4d6ad7d30c37850f3211cc76b9413fc4c128/go.mod#L20

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants