
Install full software environment explicitly into the CI rather than rely … #65

Merged: lytico merged 1 commit into dotdevelop:main on Aug 9, 2021

Conversation

hwthomas
Collaborator

…on the pre-installed github software.

This fixes Issue #64, and will allow PR #63 to pass all checks

@lextm

lextm commented Jul 10, 2021

You might find #45 kind of familiar.

@hwthomas
Collaborator Author

Can't explain what is going on here. Will investigate further. Apologies for the noise.
@lextm - yes, I found your testing in #45 useful.

@hwthomas
Collaborator Author

I've done some more testing, but have failed to get the CI workflow to run without error. The current workflow on main has 4 errors, all apparently caused by changes to the fsharp versions in the GitHub ubuntu 20.04 environment. Installing the software explicitly, which seemed to work initially, now fails with over 100 errors, including some such as
DOWNLOADNUPKG : Ssl error : 1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED
which I have no idea how to tackle!
@lextm: I've copied your amended workflow from #45 to try out using a container, which would hopefully start from a more basic ubuntu repo, but even this now fails to run.
Using VMs, ubuntu 20.04, ubuntu 21.04, and fedora 33 all build and run with no problems when following the Wiki instructions, so the GitHub CI workflow environment appears to be at the root of the problem.

@lextm

lextm commented Jul 15, 2021

The cause is rather simple: Microsoft is retiring the NuGet.org v2 API. See https://docs.microsoft.com/en-us/nuget/nuget-org/overview-nuget-org#api-endpoint-for-nugetorg

If you override NuGet settings to use v3, I think it should work again.

@hwthomas
Collaborator Author

hwthomas commented Jul 18, 2021

I've updated the CI workflow file to use the docker image "ubuntu:20.04" and explicitly installed all the necessary packages into that. After some trial-and-error testing in my CI_tests branch, I have a working version that passes the CI runs.

However, after transferring this version to the fix_CI_env branch, the same code fails, with multiple errors of the form DOWNLOADNUPKG : Ssl error : 1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED as seen in the runs above. I'm not sure what to make of this: usually intermittent failures suggest concurrency issues, but perhaps there's another possibility?

@lextm: following your suggestion, I checked the dotdevelop/.NuGet.config file, but it appears to be using v3 already. Is there somewhere else to look?
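
For anyone wanting to repeat this check on more than one config file, here is a minimal Python sketch (not part of this PR). It assumes the usual NuGet.config layout of `<configuration><packageSources><add key=... value=... />`; the function name and the sample content are hypothetical, made up for illustration, and are not the contents of dotdevelop/.NuGet.config.

```python
import xml.etree.ElementTree as ET


def list_package_sources(config_text):
    """Return (key, url, api_version) for each <add> under <packageSources>."""
    root = ET.fromstring(config_text)
    sources = []
    for add in root.findall("./packageSources/add"):
        url = add.get("value", "")
        if "/api/v2" in url:
            version = "v2"
        elif url.endswith("/v3/index.json"):
            version = "v3"
        else:
            version = "unknown"
        sources.append((add.get("key"), url, version))
    return sources


# Hypothetical config content, for illustration only.
SAMPLE = """<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
    <add key="legacy" value="https://www.nuget.org/api/v2/" />
  </packageSources>
</configuration>"""

for key, url, version in list_package_sources(SAMPLE):
    print(f"{key}: {url} ({version})")
```

Any source reported as v2 would be a candidate for the override suggested above. Note that this only inspects NuGet.config files; other tools in the build may carry their own source lists.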

@hwthomas
Collaborator Author

hwthomas commented Jul 19, 2021

The numerous Ssl error : 1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED errors seem to have reduced after installing ca-certificates-mono, but some CI runs still fail, while others succeed, for no obvious reason.
The logs show errors such as "- Something went wrong downloading StrongNamer 0.0.8" and " - Something went wrong downloading FSharp.Compiler.Service 31.0", which may be due to the change from FSharp 4.5 to FSharp 5.0 (or perhaps not!)
Any suggestions would be appreciated.
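
One way to compare a failing run with a passing one is to tally the error signatures in each downloaded log. The following is a rough Python sketch under stated assumptions: the two patterns are taken from the messages quoted in this thread, while the function name and the sample lines are hypothetical and not copied from an actual CI log.

```python
import re
from collections import Counter

# Signatures taken from the error messages quoted in this thread.
SSL_FAILURE = re.compile(r"CERTIFICATE_VERIFY_FAILED")
DOWNLOAD_FAILURE = re.compile(r"Something went wrong downloading (\S+)")


def summarize_log(lines):
    """Count known failure signatures and collect the affected package names."""
    counts = Counter()
    failed_packages = set()
    for line in lines:
        if SSL_FAILURE.search(line):
            counts["ssl_verify_failed"] += 1
        match = DOWNLOAD_FAILURE.search(line)
        if match:
            counts["download_failed"] += 1
            failed_packages.add(match.group(1))
    return counts, failed_packages


# Hypothetical log excerpt assembled from the messages above.
SAMPLE_LOG = [
    "DOWNLOADNUPKG : Ssl error : 1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED",
    "- Something went wrong downloading StrongNamer 0.0.8",
    " - Something went wrong downloading FSharp.Compiler.Service 31.0",
    "Build succeeded for some other project",
]

counts, failed_packages = summarize_log(SAMPLE_LOG)
print(dict(counts), sorted(failed_packages))
```

Running this over the logs of several runs would show whether the same packages fail every time or whether the set changes from run to run.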

@lytico
Member

lytico commented Aug 4, 2021

@hwthomas ready to merge, what do you say?

@hwthomas
Collaborator Author

hwthomas commented Aug 5, 2021

@lytico - I've had several consistent runs now in my fix_CI_env and CI_tests branches, so I'm happy if you are.

The Ssl error : 1000007d problems seemed to be caused by trying to access nuget.org/v2 sites, but I've not been able to track down where these calls originate in the build, and I still don't understand how or why the failures were intermittent.
I've squashed all the 'trial-and-error' commits into a single one to make life a bit simpler.

@lytico
Member

lytico commented Aug 5, 2021

> how or why the failures were intermittent.

i have this on other CI's as well, seems to be a workload problem on github or nuget.org, maybe ...

@lytico
Member

lytico commented Aug 5, 2021

> I checked the dotdevelop/.NuGet.config file, but it appears to be using v3 already

but maybe on the tons of subprojects they are not?

@hwthomas
Collaborator Author

hwthomas commented Aug 5, 2021

> i have this on other CI's as well, seems to be a workload problem on github or nuget.org, maybe ...

That's interesting. I was assuming it was a concurrency issue, but couldn't find a way to run the CI in a single thread to check. At least I'm not the only one to have had a problem!

  Install all necessary packages for the dotdevelop build

  Use stable mono repo (rather than preview) so
  mono version is 6.12.0.122; msbuild is 16.6.0
  Squash intermediate (testing) commits into one
@hwthomas
Collaborator Author

hwthomas commented Aug 9, 2021

> but maybe on the tons of subprojects they are not?

A full search of the repo shows that nuget.org/api/v2 only occurs in the external/fsharpbinding project, in files paket.dependencies and .nuget/NuGet.targets
Successful run CI#80 (https://github.com/dotdevelop/dotdevelop/runs/3232143181?check_suite_focus=true#step:6:4311) shows all the fsharpbinding dependencies being restored without problem, and without needing to download from https://www.nuget.org/api/v2/.
In contrast, failing run CI#78 (https://github.com/dotdevelop/dotdevelop/runs/3161054112?check_suite_focus=true#step:4:4324) tries to access these sources and fails.
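
A repo-wide search like the one described above could be sketched in Python as follows. This is an illustration only, not the command that was actually run: the function name is hypothetical, and skipping the `.git` directory is an assumption about what should be excluded.

```python
import os

NEEDLE = "nuget.org/api/v2"


def find_v2_references(root_dir, needle=NEEDLE):
    """Return (relative_path, line_number, line) for each line containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        # Assumption: git metadata is not of interest, so do not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets binary files be scanned without raising.
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            rel = os.path.relpath(path, root_dir)
                            hits.append((rel, lineno, line.strip()))
            except OSError:
                # Unreadable files (broken symlinks, permissions) are skipped.
                continue
    return hits
```

Calling `find_v2_references(".")` from the top of a checkout lists every file and line that mentions the v2 endpoint. Files that are generated or fetched during the build would not be present in a fresh checkout, so they would not show up in such a search.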

@lytico As this is a specific fsharpbinding problem, would it be better to deal with it in a separate Issue/PR?

For this PR though, basing the CI workflow on a minimal docker image still seems worthwhile, as it gives finer control over the software environment, particularly if/when mono and msbuild versions change as .NET5 stabilises and progresses. It also links the CI to the 'Setting up a Build Environment' Wiki pages more clearly.

@lytico lytico merged commit f0019a0 into dotdevelop:main Aug 9, 2021
@lytico
Member

lytico commented Aug 9, 2021

> A full search of the repo shows that nuget.org/api/v2 only occurs in the external/fsharpbinding project, in files paket.dependencies and .nuget/NuGet.targets

What about changing to nuget.org/api/v3 in these files?
That should be another PR.

@hwthomas hwthomas deleted the fix_CI_env branch August 11, 2021 09:16
@hwthomas
Collaborator Author

> What about changing to nuget.org/api/v3 in these files?

Using "https://api.nuget.org/v3/index.json" as the URL in the paket.dependencies file gave an error:

Starting full restore process.
Performance:
- Runtime: 597 milliseconds
Paket failed with
-> The NuGet source https://www.nuget.org/api/v2 for package FSharp.Compiler.Service was not found in the paket.dependencies file with sources [NuGetV3 {Url = "https://api.nuget.org/v3/index.json"; ...}

Where could the reference to https://www.nuget.org/api/v2 for package FSharp.Compiler.Service be coming from? As far as I can see, it does not occur anywhere in the repo, so how can I get rid of it?

Any suggestions or pointers for further investigation/understanding would be much appreciated. Until then I'll leave the PR.
