Skip to content

fix: Generate shrinkwraps for the bundled vscode #5071

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 22, 2022

Conversation

edvincent
Copy link
Contributor

@edvincent edvincent commented Apr 7, 2022

Fixes #5056 #5174

Why are we removing synp after all?

Because it only takes care of putting in the package-lock/shrinkwrap file the first level of dependencies, not the nested dependencies which we have from the yarn lockfile.

See the contents of https://unpkg.com/browse/code-server@4.2.0-5047-90c9adcded65d10004a9db71d0717cd90565baa3/npm-shrinkwrap.json, and how for example, @mapbox/node-pre-gyp is listed as a dependency of argon2 - but is not present anywhere in the shrinkwrap file.

Why are we adding packages from resolutions in devDependencies too?

What initially drove me to use synp was the fact that the npm shrinkwrap command would generate a lockfile that wasn't usable. Trying to install the package with that lockfile would fail with the following exception:

Installing Code dependencies...
[..]
npm ERR! Cannot read property 'requires' of undefined

This was actually due to npm shrinkwrap (in NPM 6) choking with some packages not being anywhere in the lockfile but being present in the package.json.

So I simply tracked down their dependency tree, saw they were coming from devDependencies, and added them there - which now prevents the npm shrinkwrap-generated lockfile from breaking, and allows us to remove synp. This doesn't change the size of the dependency tree, and should be able to be removed when there are higher versions of NPM used.

Why aren't we using npm ci to replace yarn --frozen-lockfile?

NPM 6's npm ci doesn't deal very well with optional dependencies, which the bundled vscode needs for some Windows dependencies (namely, windows-process-tree and @vscode/windows-registry):

  • The npm shrinkwrap command doesn't add them to the lockfile, because the CI runs from Linux, so it doesn't install them - and the commands generates the lockfile based on what's in node_modules.
  • If we hacked the lockfile and added them there manually (with "optional": true), the install on a Linux machine would fail - because NPM 6's npm ci doesn't respect them from an npm-shrinkwrap.json file.
  • If we don't add them and use npm ci, they are not installed - even on Windows machines.

What I settled for was using npm install --production (for now), that will always respect the lockfile version if an entry is there, and will try to get versions not listed there. This effectively means that those two packages, on windows, won't be following the yarn.lock that is being used in the development package, but that seemed less worse.

Note: Another idea was to hackily add the packages to the lockfile + enhance the postinstall.sh script to conditionally use --no-optional if we're running on a non-Windows machine - which skips those optional dependencies. I wasn't convinced by either the hacking around of the lockfile nor this conditional... But happy to hear thoughts?

Why don't we need the "Remove the devDependencies" hack anymore?

Because we are now generating the shrinkwrap file after stuff has been installed with yarn --production, which means npm shrinkwrap will only list the production dependencies.

Why are we removing json after all?
Because we don't need to remove the devDependencies anymore - which needed some more complex JSON mangling. Now we simply need to remove a line, which can be done with sed.

But also, because doing it with json gets a bit tricky, as json would need to be listed as an actual dependency (rather than devDependency) of code-server and vscode too to be able to be used... Not worth it.

Switching the jq commands to json might still be worth it, but no need to conflate the commits.

Why no more yarn pack artifact?
Realized I was trying to fix/conflate too many things into one PR, so will likely create a different issue to discuss this one.

@edvincent edvincent requested a review from a team April 7, 2022 01:56
@codecov
Copy link

codecov bot commented Apr 7, 2022

Codecov Report

Merging #5071 (b1382c8) into main (33ee184) will not change coverage.
The diff coverage is n/a.

❗ Current head b1382c8 differs from pull request most recent head 5c0cfdc. Consider uploading reports for the commit 5c0cfdc to get more accurate results

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #5071   +/-   ##
=======================================
  Coverage   72.44%   72.44%           
=======================================
  Files          30       30           
  Lines        1673     1673           
  Branches      366      366           
=======================================
  Hits         1212     1212           
  Misses        398      398           
  Partials       63       63           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33ee184...5c0cfdc. Read the comment docs.

@edvincent
Copy link
Contributor Author

So the failure seems to be because of some dependencies missing in the node_modules from the original install... Which my guess is because something is failing to be installed. Will be playing around with it (expect a bunch of updates to this PR to "test" the CI) and try to find the culprit.

@code-asher
Copy link
Member

GitHub seems to have ignored my email response so I will paste it below, apologies if it does eventually come in and the comment gets duplicated:


I love the detailed description!

What is used to install is completely meaningless to the end-user. If
only, by not using yarn even when invoked with yarn, we can guarantee
the deterministic versions for the bundled vscode. Thoughts?

My first thought when I mentioned this was that the user might have
yarn installed but not npm so switching the package manager from
under the user could cause installation failures (npm: command not found
or something like that).

Buuuut...is it likely someone has Node installed but not npm? I know
some distros package them separately but it still seems unlikely.

The second thought I had was that using the same package manager
throughout the entire installation process /feels/ more correct to me.
I doubt it would cause issues in practice but it seems like it would be
less of a surprise ("Wait why is it calling npm when I ran yarn?").

@edvincent
Copy link
Contributor Author

Ok new tentative out, with:

  1. Using npm or yarn, depending on what was initially used.
  2. Generating the shrinkwraps when generating the release, rather than initially.
  3. Removing @parcel/node-addon-api from the shrinkwrap, as it's a folder in node_modules which is actually not a package and was making the whole process choke.

(1) and (2) together (should) also help the other issue I was trying to track down, which happens because the tests were failing for the standalone release as it wasn't installing some of the devDependencies that we were removing too early from the shrinkwrap file.

@edvincent
Copy link
Contributor Author

FWIW, I'm playing around in edvincent#1 (comment) to be able to run the CI/CD and pinpoint some of the culprits. To avoid the flood and needing approvals for the workflows to run. Sorry it's taking so long, but trying to do it the best possible way this time 💪

@edvincent
Copy link
Contributor Author

Ok, I think I got it this time! Moved to generate the shrinkwraps in the release:standalone command, which basically allows not having to know which package.json is being copied etc...

Also updated the overview of this PR to cover some of the reasoning.

Thoughts? I think I still need to add something so that the NPM publish CI step calls this release:standalone too?

@code-asher
Copy link
Member

Awesome work! I love your comments; very thorough and made it super easy to understand the rationale.

I think I still need to add something so that the NPM publish CI step calls this release:standalone too?

I think leaving it in build-release.sh is the right move. We use the result of that script in both the NPM publish step and standalone step.

@code-asher
Copy link
Member

code-asher commented Apr 22, 2022

Ohh I see in your PR description you are doing it there because it runs after yarn --production. Hmm...

@edvincent
Copy link
Contributor Author

Yep.. We need the node_modules to be populated to be able to generate the shrinkwrap file...

The other solution would be to still do it in build-release.sh, which means we'd need to yarn install in build-release.sh. That would address this TODO https://github.com/coder/code-server/blob/main/.github/workflows/ci.yaml#L240-L242 - and the NPM release would move to be a yarn pack (rather than a raw tar command). But I'm not sure how would having the node_modules folder from one arch would affect another arch - for the cross-compile sections...

Unless we do it in build-release.sh, with a yarn install, but then generate two type of artifacts - the current one (for the other platforms) and the current one...

Thoughts?

@code-asher
Copy link
Member

Maybe we could just add yarn --production to build-release.sh and it would not take too much longer since it only needs to remove files (but I have no idea how yarn works internally so maybe this takes longer for some reason).

@code-asher
Copy link
Member

code-asher commented Apr 22, 2022

We need the node_modules to be populated

Oh right we have no Node modules at all at that stage, we would need to remove that whole KEEP_MODULES behavior and just keep the modules (in addition to then running yarn --production at the end if I understand correctly).

@code-asher
Copy link
Member

But I'm not sure how would having the node_modules folder from one arch would affect another arch

This is exactly why I have avoided doing that todo so far lol

@edvincent
Copy link
Contributor Author

Today there is never a yarn in build-release.sh - so a yarn --production won't be removing files, but also installing stuff that was never installed.

The only yarn that gets done is at the root of the repo, not inside the release folder no?

EDIT: Yep, what you just said now. Happy to try and take a stab at that if it seems to be the best.

@code-asher
Copy link
Member

Today there is never a yarn in build-release.sh

Ah yup it is done in a previous step and then this step has an option (KEEP_MODULES) for whether to also copy the Node modules or skip them.

@code-asher
Copy link
Member

Yeah I say we go ahead with this plan.

@edvincent
Copy link
Contributor Author

Damn it's actually a bit more nuanced than just removing the KEEP_MODULES behavior... If we run yarn --production, it would also install some of the things we were excluding in the rsync like the node executable, the node_modules.ara and the lib/coder-cloud-agent. Which means the tar -czf command in the ci.yaml would need to replicate/hard-code some of those excludes...

The other solution would be to come back to the original idea of generating the shrinkwrap files from the code-server repo itself, and copy them into release (after removing the devDependencies from it). i.e. we'd need to cd into lib/vscode/remote/ - rather than relying on the fact we copied the right package.json.

BTW - Right now, that KEEP_MODULES never seems used?

@code-asher
Copy link
Member

KEEP_MODULES never seems used

Yup, from what I recall initially we never kept the modules then later KEEP_MODULES was added to make building locally faster (so you would not have to yarn twice if you wanted to test a full build locally). So current the var is only ever used during local development.

it would also install some of the things we were excluding

I think it will be OK to let these get packaged into the tar although I suppose they will just end up getting overwritten later. We could create a new var KEEP_BINARIES or something which acts like the old var except we are keeping Node modules by default now?

@code-asher
Copy link
Member

code-asher commented Apr 25, 2022

Er wait I might have misunderstood. We do already run yarn --production so those things exist in the standalone already but now the npm package will include them when it did not before maybe? In that case maybe we add to the npmignore.

echo "node_modules.asar" > release/.npmignore

@edvincent
Copy link
Contributor Author

edvincent commented Aug 18, 2022

No, I haven't forgotten about this/you folks ❤️ life got in the way...

I'm back trying to fix this, sent a new approach PR (which gets a bit easier thanks to NodeJS 16!). Still a WIP and needs some tweaks, just sharing for awareness. Testing in https://github.com/edvincent/code-server/actions/runs/2883629793

HOWEVER, something that is a bit of a concern to me now seems to be a bug in NPM itself... Despite the shrinkwrap file, installing with the global flag still doesn't seem to respect them 😭

@edvincent
Copy link
Contributor Author

FYI - opened npm/cli#5325 to track that nasty bug about the shrinkwrap file not being respected.

I'll still continue with the PR here (need to clean-up some of the code/options, I used --no-default-rc everywhere but supposed to only be for one of the folders) and we likely will be able to merge it - as the result will be the same as the current state: every install installs the latest version. But as soon as the bug upstream will be fixed, stuff will start working with no action :p

Any thoughts on the approach proposed in the latest code?

Copy link
Member

@code-asher code-asher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome back from life! 😆

This looks really good.

popd
}

create_shrinkwraps() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build-standalone-release.sh script takes the npm package generated in a previous step, installs deps, then packages the result so we can upload a standalone build complete with deps to GitHub. Assuming I have that right (sometimes I get confused with all the steps) generating a shrinkwrap here would not get uploaded to npm.

So I think we will have to do this in build-release.sh which is where we create and upload the npm package artifact that eventually gets uploaded to npm.

And then here we would install with npm instead of yarn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah damn yes, now I think I remember being bugged/blocked by something like that in the past... Gonna double-check that now!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then here we would install with npm instead of yarn.

Can I suggest to avoid doing too many things in the same PR if that works for you:

  • (This PR) Just adding the npm-shrinkwrap.json file - which basically should not change anything else in the development/CI workflow.
  • Another PR to change the release:standalone and other CI scripts to start using npm ci
  • Another PR (when everything works) to amend the doc to recommend npm (which can be done last, as using yarn will still always work).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the incremental approach!

@edvincent edvincent force-pushed the fix-shrinkwrap branch 2 times, most recently from 583dd8b to b8dfaeb Compare August 19, 2022 22:43
@edvincent
Copy link
Contributor Author

Ok... This is what it looks like when doing it from the release script... Which is indeed where it's needed, as the tgz we use to publish are the artifacts from it - not from release standalone. Confirmed from the artifacts the shrinkwrap files and their content from downloading the build artifacts.

It looks a bit worse because shrinkwrapping tries to alter the yarn.lock - so we have to keep it and replace it again. But seems to be the least-worst way - the other being doing a release:standalone before publishing, which would need to install again all the development
dependencies etc...

Copy link
Member

@code-asher code-asher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@code-asher
Copy link
Member

Everything looks good to me so I am going to merge it in. Thanks a million!

@code-asher code-asher merged commit 90f6035 into coder:main Aug 22, 2022
@jsjoeio
Copy link
Contributor

jsjoeio commented Aug 22, 2022

@edvincent really can't thank you enough - it's rare someone from the community takes on a hard problem, takes a break because life, then comes back to finish it 💪🏼 We appreciate you a ton!

@edvincent
Copy link
Contributor Author

Anytime! I like giving back to the community too, especially for stuff I use daily! Plus I got to learn a LOT about NPM's dependency management, which I enjoyed. And thank to both of you for bearing with me during the (long) process.

You can also thank @bpmct and the nice t-shirt he sent me couple of months ago, it definitely played a role in reminding me I still had this PR pending every time I wore it. 👀

Will still send a follow-up PR (likely next week) for the documentation to remind folks to use npm rather than yarn.

@edvincent
Copy link
Contributor Author

Also, whilst confirming the fix of #5174 (comment), realized that the bug npm/cli#5325 is actually less of an issue here - because it doesn't seem to be buggy when installing it from a remote repo!

Which means... There now is a way to install code-server with fully-deterministic dependencies, which will be the same whether npm or a packaged version is used. 🎉🎉

@edvincent edvincent deleted the fix-shrinkwrap branch August 23, 2022 01:15
@code-asher
Copy link
Member

Awesome news!!

@code-asher
Copy link
Member

code-asher commented Oct 11, 2022 via email

@code-asher
Copy link
Member

Ignore this lol GitHub seems to finally be posting comments I made via email months ago

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Use npm instead of yarn in postinstall script
3 participants