-
-
Notifications
You must be signed in to change notification settings - Fork 702
Ensure max_band vector is aligned on a 16-byte boundary #2144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I found this SO answer: Which seems to suggest that gcc and clang both allow: int a[16] __attribute__((aligned(16))) = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
__vector signed int *va = (__vector signed int *) a; It might be a little neater. |
010efef
to
a55513a
Compare
LGTM, thanks Kleis. Can we assume |
A malloc library is expected to provide a better implementation.
It's guaranteed that this is now aligned on a 16-byte boundary.
OK, I added the aligned attribute to |
Looks like it's https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=msvc-160 Though of course I don't know if the other vector stuff is supported. |
libvips/conversion/composite.cpp
Outdated
/* max_band as a vector, for the RGBA case. | ||
*/ | ||
v4f max_band_vec; | ||
#endif /*HAVE_VECTOR_ARITH*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about putting this struct member first?
At the moment, we could perhaps change the size of one the members before this and break alignment. I think it only works at the moment because of the exact number of pointers we have.
A comment about the need to have the vector as the first element would probably be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to put this member first within the VipsCompositeSequence
or move it back to VipsCompositeBase
and put it there as first? I tried the latter version (see commit kleisauke@dcee3d1, which is the smallest change), but that didn't seem to work.
Do you think this could work without those aligned malloc implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant first in VipsCompositeSequence
.
My thinking was that the aligned malloc guarantees that the start of the struct will be on a 16-byte boundary, but max_band_vec
will be aligned on something like (pointer-to-base + 3 * sizeof(pointer) + sizeof(int) + 2 * sizeof(pointer)). If we put the struct members with alignment requirements at the start of the struct, we can guarantee they will stay aligned even if the struct changes.
I'm not sure I understand how these structs are being laid out :( You'd think this would be:
base: composite
base + 4: input_regions
base + 8: composite_regions
base + 12: n
base + 16: enabled
base + 20: p
base + 24: max_band_vec
Which would obviously break alignment. I guess some padding somewhere must be making max_band_vec start at byte 32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I'm also not entirely sure how these structs are laid out, so moving the max_band_vec
member to the first position in the struct makes sense. Done with commit 305714c (tested on Windows 32-bit without any problems).
fwiw, I think the above mentioned commit also reproduced the crash on Linux x64, see:
https://travis-ci.org/github/kleisauke/libvips/jobs/764105312#L2420
This is great work, thank you Kleis! |
No problem! Should I try to compile libvips with MSVC to check whether the vector support and/or the aligned attribute is supported there? fwiw, I saw that this is excluded in the repository provided by @angelmixu (sorry for the ping): |
If you're interested, sure, have a look. I'd be pleased (and surprised, heh) if MSVC supported this kind of vector code. |
Fixes mstorsjo/llvm-mingw#190.