x/build: add LUCI linux-s390x builder #67307
Comments
Change https://go.dev/cl/617359 mentions this issue:
Hostname for the builder: linux-s390x-ibm
Thanks. CC @mknyszek.
Here's the certificate:
It's a little annoying, but we can fit the IBM instructions on top of the regular state, avoiding more intrusive interventions. Going forward we should not accept assembly that replaces the whole implementation, because it doubles the work to do any refactoring like the one in this chain. Also, it took me a while to find the specification of these instructions, which should have been linked from the source for the next person who'd have to touch this. Finally, it's really painful to test this without a LUCI TryBot, per #67307. For #69536 Change-Id: I90632a90f06b2aa2e863967de972b12dbaa5b2ae Reviewed-on: https://go-review.googlesource.com/c/go/+/617359 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Carlos Amedee <carlos@golang.org> Reviewed-by: Daniel McCarney <daniel@binaryparadox.net> Reviewed-by: Roland Shoemaker <roland@golang.org>
Change https://go.dev/cl/636055 mentions this issue:
TestXAESAllocations fails like #70448, and crypto/rand's fails in FIPS mode. We can't keep chasing these without even a LUCI builder. Updates #67307 Change-Id: I5d0edddf470180a321dec55cabfb018db62eb940 Reviewed-on: https://go-review.googlesource.com/c/go/+/636055 Auto-Submit: Filippo Valsorda <filippo@golang.org> Reviewed-by: Roland Shoemaker <roland@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Carlos Amedee <carlos@golang.org>
@mknyszek I am running into an issue while following the LUCI builder setup steps mentioned above. After step 3, I did the following. Note: by default, the builder machines run under the user linux1.
After this, when I check the bot's start-up log, I see the output below.
But when I manually ran "/home/a2/go/bin/bootstrapswarm -hostname linux-s390x-ibm", I got a status code 401 (authentication) error. I tried to investigate the issue but could not make much progress.
@srinivas-pokala Thanks for working on this. I looked at the logs on our end for error details. I'm seeing 403s that are failing because the bot ID being reported is "go-s390x01" instead of the expected "linux-s390x-ibm", which results in a "Bot ID doesn't match the token used" error. Can you check if the bootstrapswarm binary you're using is the latest version available in x/build? Looking at its code, if the A 401 that I saw failed with the error that the token had expired 4 hours earlier; perhaps /var/lib/luci_machine_tokend/token.json had stopped being refreshed by the time you tried to run bootstrapswarm?
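One quick way to check the "stopped being refreshed" theory is to look at how long ago the token file was last written. The sketch below is only an illustration: the helper names and the one-hour threshold are assumptions, not part of luci_machine_tokend or bootstrapswarm, and it looks only at the file's modification time rather than parsing the token.

```python
import os
import time


def token_age_seconds(path, now=None):
    """Return how many seconds ago the file at `path` was last modified."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime


def is_stale(path, max_age_seconds=3600, now=None):
    """Report whether the token file is missing or older than the threshold.

    The 3600-second default is an assumed value chosen for illustration;
    pick a threshold that matches how often the token is expected to refresh.
    """
    try:
        return token_age_seconds(path, now) > max_age_seconds
    except FileNotFoundError:
        # A missing token file is treated the same as a stale one.
        return True


if __name__ == "__main__":
    token_path = "/var/lib/luci_machine_tokend/token.json"
    print(token_path, "stale:", is_stale(token_path))
```

If this reports the file as stale shortly before a 401, the refresh job is the likelier culprit than bootstrapswarm itself.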
@dmitshur Thanks for the reply.
Yes, I have checked it. It is the latest version.
So I suspect this could be causing bootstrapswarm to fail and not take the metadata.OnGCE() code path.
@dmitshur I inspected the bootstrapswarm binary further by running it manually and got the trimmed output below.
From the backtrace we can see it's failing in get_cpuinfo. I suspect there is a problem with how this function handles s390x. Do we need changes for s390x similar to those made for other ports?
Thanks for getting to that trace. Yes, it looks like swarming_bot's logic for auto-detecting information about the CPU in order to populate its dimensions doesn't work as expected on the machine you're running it on. I see a TODO mentioning s390x here: Though the exception you're seeing is happening as part of get_cpu_dimensions, which is used to populate the "cpu" dimension here, which leads to the problem happening on this line: Note that the section has a comment "# Intel". If it shouldn't apply, perhaps the if condition should be updated to also check for 'flags' being present in values, not only 'vendor_id'. For the purposes of this builder, the "cipd_platform" dimension needs to be computed as "linux-s390x". Other dimensions aren't currently used in our configuration, so their exact values are not critical, but of course their computation must not throw an exception. If you're okay with navigating the LUCI contribution process and sending a CL directly, that would be an ideal way to make progress on this builder. Since you have access to the machine where swarming bot will run, you can test your changes there, and we'll help with code review. For reference, you can look at a past issue where swarming bot changes were required, for example #64660 and its CLs like crrev.com/c/5792941 (CC @prattmic, @pmur).
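To make the suggested guard concrete, here is a minimal standalone sketch, not the actual swarming_bot code. The function names are hypothetical, and the sample input assumes that s390x's /proc/cpuinfo reports a "vendor_id" line together with a "features" line rather than "flags". The point is that the x86-style branch is taken only when both 'vendor_id' and 'flags' are present, so a missing 'flags' key cannot raise.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a dict, keeping the first value per key."""
    values = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        values.setdefault(key.strip(), value.strip())
    return values


def cpu_flags(values):
    """Return a sorted list of CPU feature names without assuming an x86 layout."""
    # x86-style layout: require both keys, as suggested in the comment above.
    if "vendor_id" in values and "flags" in values:
        return sorted(values["flags"].split())
    # Assumed s390x layout: feature names are listed under "features".
    if "features" in values:
        return sorted(values["features"].split())
    # Unknown layout: report nothing rather than raising an exception.
    return []


if __name__ == "__main__":
    sample = "vendor_id       : IBM/S390\nfeatures\t: esan3 zarch stfle\n"
    print(cpu_flags(parse_cpuinfo(sample)))
```

Returning an empty list for unrecognized layouts matches the requirement that computing unused dimensions must not throw.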
@dmitshur I followed the contribution steps mentioned here: https://chromium.googlesource.com/infra/luci/luci-py/+/HEAD,
It looks like python3.11 is missing from depot_tools; I could not find the path mentioned. Could you help me with this issue?
There are two pages linked from https://chromium.googlesource.com/infra/infra/+/main/doc/source.md that might be most relevant, if you haven't already looked there:
Are you mailing the CL from a linux/s390x machine, or is this happening on something like linux/amd64? From the error message, if this is on s390x, it might indeed be that cpython3 isn't available for linux-s390x, so you could set VPYTHON_BYPASS as suggested and try using the system Python instead. That env var needs to be set to a specific value as documented here:
Another approach you can try, especially if the above doesn't work well, is to create the Gerrit CL using Gerrit's lower-level primitives (documented here):
Assuming HEAD is pointing to the commit you wish to mail. This has potential downsides in that if there are some extra steps taken by |
@dmitshur Thanks for the details.
I am running on a linux/s390x machine. I was able to resolve the issue and have raised the CL. Could you please review the CL?
@srinivas-pokala I believe the change in crrev.com/c/6439429 is rolled out and you should be able to try starting up the builder again.
@dmitshur Thanks for the help.
It looks like the failures we are seeing above are in the get_dimensions() function. Is there something I'm doing wrong here?
@dmitshur Can you help me with the above issue?
From looking at the source for those functions (here), it looks like "Can't open display :0.0" with exit code 1 is the expected outcome when there's no display attached. The code prints that as a log message but otherwise considers there not to be a display attached for the purposes of computing the bot's dimensions. So I think that part is working as intended and not causing problems. Looking at the build history for the bot at https://chromium-swarm.appspot.com/bot?id=linux-s390x-ibm, I see 3 builds from April 22-23. The first one was entirely successful, completing in 34 min: https://ci.chromium.org/b/8716953408078429585. That's great to see! The following 2 builds failed, and I see a Since the bot worked the first time, I suggest trying it again while making sure it's executed as the swarming user, and keeping an eye on its status on the https://chromium-swarm.appspot.com/bot?id=linux-s390x-ibm page as well as on the local logs you have (ignoring the "xrandr returned exit code 1" messages, since they're not a problem).
There currently isn't a LUCI builder that tests the linux/s390x port (other than the misc-compile builder, which tests only that the port compiles). This is the tracking issue for it.
The next steps that a builder owner will need to follow to make progress here are documented at https://go.dev/wiki/DashboardBuilders#luci-builders.