-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[OpenMP][Clang] Use ATTACH
map-type for list-items with base-pointers.
#153683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[OpenMP][Clang] Use ATTACH
map-type for list-items with base-pointers.
#153683
Conversation
For the following: ```c int *p; \#pragma omp target map(p[0]) // (A) (void)p; \#pragma omp target map(p) // (B) (void)p; \#pragma omp target map(p, p[0]) // (C) (void)p; \#pragma omp target map(p[0], p) // (D) (void)p; ``` For (A), the pointer `p` is predetermined `firstprivate`, so it should be (and is) captured by-copy. However, for (B), (C), and (D), since `p` is already listed in a `map` clause, it's not predetermined `firstprivate`, and hence, should be captured by-reference, like any other mapped variable. To ensure the correct handling of (C) and (D), the following changes were made: 1. In SemaOpenMP, we now ensure that `p` is marked to be captured by-reference in these cases. 2. We no longer ignore `map(p)` during codegen of `target` constructs, even if there's another map like `map(p[0])` that would have been mapped using a PTR_AND_OBJ map. 3. For cases like (D), we now handle `map(p)` before `map(p[0])`, so the former gets the TARGET_PARAM flag and sets the kernel argument.
… before their derefs.
The output of the compile-and-run tests is incorrect. These will be used for reference in future commits that resolve the issues. Also updated the existing clang LIT test, target_map_both_pointer_pointee_codegen.cpp, with more regions and more narrowed-down update_cc_test_checks filters.
…mapped-ptrs-by-ref
…ion-using-attach-maptype
…ion-using-attach-maptype
This patch introduces libomptarget support for the ATTACH map-type, which can be used to implement OpenMP conditional compliant pointer attachment, based on whether the pointer/pointee is newly mapped on a given construct. For example, for the following: ```c int *p; #pragma omp target enter data map(p[1:10]) ``` The following maps can be emitted by clang: ``` (A) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM &p, &p[1], sizeof(p), ATTACH ``` Without this map-type, the two possible maps emitted by clang: ``` (B) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM (C) &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ ```` (B) does not perform any pointer attachment, while (C) also maps the pointer p, which are both incorrect. In terms of implementation, maps with the ATTACH map-type are handled after all other maps have been processed, as it requires knowledge of which new allocations happened as part of the construct. As per OpenMP 5.0, an attachment should happen only when either the pointer or the pointee was newly mapped while handling the construct. Maps with ATTACH map-type-bit do not increase/decrease the ref-count. With OpenMP 6.1, `attach(always/never)` can be used to force/prevent attachment. For `attach(always)`, the compiler will insert the ALWAYS map-type, which would let libomptarget bypass the check about one of the pointer/pointee being new. With `attach(never)`, the ATTACH map will not be emitted at all. The size argument of the ATTACH map-type can specify values greater than `sizeof(void*)` which can be used to support pointer attachment on Fortran descriptors. Note that this also requires shadow-pointer tracking to also support them. That has not been implemented in this patch. This was worked upon in coordination with Ravi Narayanaswamy, who has since retired. Happy retirement, Ravi!
This patch introduces libomptarget support for the ATTACH map-type, which can be used to implement OpenMP conditional compliant pointer attachment, based on whether the pointer/pointee is newly mapped on a given construct. For example, for the following: ```c int *p; #pragma omp target enter data map(p[1:10]) ``` The following maps can be emitted by clang: ``` (A) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM &p, &p[1], sizeof(p), ATTACH ``` Without this map-type, the two possible maps emitted by clang: ``` (B) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM (C) &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ ```` (B) does not perform any pointer attachment, while (C) also maps the pointer p, which are both incorrect. In terms of implementation, maps with the ATTACH map-type are handled after all other maps have been processed, as it requires knowledge of which new allocations happened as part of the construct. As per OpenMP 5.0, an attachment should happen only when either the pointer or the pointee was newly mapped while handling the construct. Maps with ATTACH map-type-bit do not increase/decrease the ref-count. With OpenMP 6.1, `attach(always/never)` can be used to force/prevent attachment. For `attach(always)`, the compiler will insert the ALWAYS map-type, which would let libomptarget bypass the check about one of the pointer/pointee being new. With `attach(never)`, the ATTACH map will not be emitted at all. The size argument of the ATTACH map-type can specify values greater than `sizeof(void*)` which can be used to support pointer attachment on Fortran descriptors. Note that this also requires shadow-pointer tracking to also support them. That has not been implemented in this patch. This was worked upon in coordination with Ravi Narayanaswamy, who has since retired. Happy retirement, Ravi!
…on-using-attach-maptype
Most tests are either compfailing or runfailing. They should start passing once we start using ATTACH map-type based codegen. (#153683) Even after they start passing, there are a few places where the EXPECTED and actual CHECKs are different, due to two main issues: * use_device_ptr translation on `&p[0]` is not succeeding in looking-up a previously mapped `&p[1]` * privatization of byref use_device_addr operands is not happening correctly. The above should be fixed as separate standalone changes.
ATTACH
map-type for list-items with base-pointers.
…base-ptrs. These have been pulled out of the codegen PR llvm#153683, to reduce the size of that PR.
I've spent some time looking through these patches. First off, I see that the test case from #141042 is partially fixed -- it works as written in the issue, but not if the commented-out
This gives mappings like this:
I think entry 2 is bogus, particularly the "size 1" bit. At a high level, this patch series tackles the problem of parsing complex expressions by keying off the attach-pointer component, whereas my (admittedly less developed!) patch series adjusts the partial-struct (overlap) handling code to group parts of the expression by "containing struct" base. That's similar but not quite the same, and I'm not sure how to reconcile the two approaches, or even if that's necessary -- but if it's not, then I'm not sure how to handle cases such as the above. (This one, I mean: #153672) When I worked on this problem for GCC, I ended up creating a kind of tiny DSL just to parse expressions for mapping, described briefly in this patch submission: https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627897.html For LLVM, I don't think we should use exactly the same approach, but I was thinking of something conceptually similar, probably along the lines of an iterator that walks over each "chunk" of an expression, with members to query things like attachment points, containing struct bases, innermost and outermost component accesses, or the "leaf"-type of the chunk (array section expr, array shaping expr, and so on). Incrementing the iterator would then (I think) move to the next indirection level, or the next containing-struct base, with details of course TBD. That would allow the rather complex and hard-to-follow (existing) logic in (That's not the same as the |
Also improves one debug dump regarding pointer-attachment.
Thanks for trying the patch. The commented out code from #141042 is expected to work, but as discussed here, the original is not expected to work as per OpenMP. Since there is no map with v as either the list-item, or the base-pointer of a list-item, the implicit data-mapping rule for
Did you try an older version of the patch? With the current version, I get the following:
Aside from grouping containing structures, another reason for grouping component-lists starting with the same variable, by the complexity of their attach-pointers, is that the kernel argument is determined based on whether the pointer is mapped, or its pointee (see #145454, or line clang/lib/CodeGen/CGOpenMPRuntime.cpp:9749 in this PR). Do you think there are any cases that we won't be able to handle with this approach? With this PR, we are able to reuse the existing logic for determining the partial struct, since the grouping of component-lists happens at a higher level. It might still be desirable to update the partial-struct handling if it makes things easier to read/follow. At the same time, it can continue to assume that it's only working with a set of component-lists that will either have no overlap, or belong to the same containing struct. |
It's not impossible... I think that was with hash 087945a (map-ptr-array-section-using-attach-maptype branch). (Edit: yes, user-error on my part here, apologies!)
I'll go back at look at that in more detail, thank you.
I'm not sure yet.
I'm still working to better understand the original code, and your patch set, so please bear with me! General questions:
|
Here's a test case that might need more complex handling for overlapped mappings:
It works (with this patchset) with NO_OVERLAPS defined, but not with it commented out as shown. |
Thanks for the test! It helped identify the need to use the appropriate pointer-type rather than
From OpenMP's perspective, there can only be one ATTACH operation per list-item. If a list-item is a reference variable that's a member of a struct, like: int x;
int p;
struct S {
int &xr = x;
int &pr = p;
};
A user can map both together, like
variable is actually used in the target region, for correctness?) For After the captured vars have been handled, the remaining map clauses are handled in
The attach handling should not affect handling of overlapping members etc, which is handled by the PartialStruct processing. Each attach-ptr-group (list of component-lists with the same attach-ptr) has its own |
This adds support for using
ATTACH
map-type for proper pointer-attachment when mapping list-items that have base-pointers.For example, for the following:
The following maps are now emitted by clang:
Previously, the two possible maps emitted by clang were:
(B) does not perform any pointer attachment, while (C) also maps the
pointer p, both of which are incorrect.
With this change, we are using ATTACH-style maps, like
(A)
, for cases where the expression has a base-pointer. For example:Why a large PR
While it's unfortunate that this PR has gotten large and difficult to review, the issue is that all the functional changes have to be made together, to prevent regressions from partially implemented changes.
For example, the changes to capturing were previously done separately (#145454), but they would still cause stability issues in absence of full attach-mapping. And attach-mapping needs those changes to be able to launch kernels.
We can extract the utilities and functions, like those for finding attach-ptrs, or comparing exprs, out as a separate NFC PR that doesn't call those functions, just adds them (#155625). And maybe the change that adds a new error message for use_device_addr on array-sections with non-var base-pointers can be extracted out too (but that will have to be a follow-up change in that case, and we may get comp-fails with this PR when the erroneous case is not caught/diagnosed).
Grouping of maps based on attach base-pointers
We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like:
We first map
spp
, thenspp[0]
thenspp[0][0]
andspp[0][0].a
.This allows us to also group "struct" allocation based on their attach pointers. This resolves the issues of us always mapping everything from the beginning of the symbol
spp
. Each group is mapped independently, and at the same level, likespp[0][0]
and its memberspp[0][0].a
, we still get map them together as part of the same contiguous structspp[0][0]
. This resolves issue #141042.use_device_ptr/addr fixes
The handling of
use_device_ptr/addr
was updated to use the attach-ptr information, and works for many cases that were failing before. It has to be done as part of this series because otherwise, the switch from ptr_to_obj to attach-style mapping would have caused regressions in existing use_device_ptr/addr tests.Handling of attach-pointers that are members of implicitly mapped structs:
p
below, is a base-pointer in amap
clause on a target construct (likemap(p[0:1])
, and the base of that struct is either thethis
pointer (implicitly or explicitly), or a struct that is implicitly mapped on that construct, we add an implicitmap(p)
so that we don't implicitly map the full struct.Scope for improvement:
findAttachPtrExpr
is fairly simple, and fast.Needs future work:
ref_ptr/ref_ptee
, andattach
map-type-modifier on them.