-
Notifications
You must be signed in to change notification settings - Fork 24.9k
Build libgomp (gcc-13) from src on AArch64 #152361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: gh/fadara01/1/base
Are you sure you want to change the base?
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152361
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (2 Unrelated Failures)As of commit c4c280e with merge base d9426a8 ( BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@pytorchbot label "module: arm" |
@pytorchbot label "ciflow/linux-aarch64" |
It generally looks good to me, I just have a few questions. What's the motivation for this? Also, it would be great to motivate the build/link flags with some comments. Did you get them from somewhere else? If so it would be worth a link so that we can keep them updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of minor comments, but generally: this looks great, thank you for the detailed example and graphs!
# rpm --eval '%{optflags}' | ||
# rpm --eval '%{build_ldflags}' | ||
# | ||
# I had to remove the following flags because they didn't compile for this version of libgomp: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know what impact this might have?
@pytorchbot rebase -i |
❌ 🤖 pytorchbot command failed:
Try |
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
Successfully rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat conflicted about this change. I.e. perf benefits are clear, so we should update, but on the other hand:
- We should not be in business of building basic OS components, like compiler, OpenMP runtime, etc, but rather rely on system vendors to provide them
- I.e. it would be good to make similar change upstream against https://github.com/pypa/manylinux (unless it's already there for later versions)
- Something tells me there probably already a binary copy available inside /opt/rh/gcc-toolset-${GCCTOOLSET_VERSION}/root/usr/lib64
- Also see my comment about systems with 64kb pagesize (RHEL I guess) (@atalman do we run any AlmaLinux tests on aarch64?)
- If you are building from source updating to a later version of the library, why not update to the latest currently available?
- To have better confidence in the update, it would be good to make sure that same library is also used for all CI tests
- And not to mention that it would be good to avoid version discrepancies across arches (i.e. both arm, powerpc and x86 should link again the same version of OpenMP library)
.ci/docker/common/install_libgomp.sh
Outdated
|
||
cd /usr/local/src | ||
# fetch source for gcc 11 | ||
curl -LO https://ftp.gnu.org/gnu/gcc/gcc-11.4.0/gcc-11.4.0.tar.xz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ftp.gnu.org are known to be unstable and somewhat antagonistic towards CI systems (i.e. it often rejects requests to download with HTTP/503, but it never happens when user downloads the file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, it's very slow but hasn't been failing in our downstream CI.
I now clone from https://gcc.gnu.org/git/gcc.git
which is much faster and more secure
@pytorchbot rebase |
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here |
Successfully rebased |
I agree, with you on this, I only did this because I couldn't find any other option.
Yeah, later versions - i.e. manylinux 2.34 (AlmaLinux 9) has a newer version of libgomp which yields the same gains compared to the older version, but I assumed PyTorch are not ready to move to that version yet because it won't be compatible with Ubuntu versions < 21.10.
Yeah that was my impression too, but there's only a
If we list all available libgomp packages in our AlmaLinux 8 image we only get version 8.5
I replied as a review comment. 64kb pagesize will be handled by default.
Good point, when I raised this (I think) we were using gcc-11 to build pytorch and I thought I'd keep the libgomp version to match the compiler versions. I updated to libgomp 13 as now build with gcc-13.
Unfortunately, as it currently stands, the linux-aarch64 workflow builds the pytorch wheel in jammy-linux which will not test this change, as this targets wheels built with manylinux (i.e. release wheels). On a related note, we're planning to address the issue with building wheels in manylinux for releases vs building wheels in jammy for pre-commit testing. Please let me know if there's a good reason for why CI doesn't currently build its wheel in manylinux. |
I'm not sure what is happening with this PR but just to let you know that this stack of PR's i've created might affect this #160079 . Not sure if this is installing libgomp to a different location, or if there are other libgomp versions that the linker might find. Currently AArch64 manylinux uses |
Stack from ghstack (oldest at bottom):
Context: see #155795
To understand the effect of this libgomp version update, I benchmarked all the following combinations on Arm Neoverse-V1 CPUs:
Below are sample plots showing the percentage of speedup (i.e. 100% means 2x speedup) with the new libgomp (built from source) vs. the current state
We can see that updating libgomp:
benchmark script:
cc @malfet @snadampal @milpuz01 @aditew01 @nikhil-arm @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov