-
Notifications
You must be signed in to change notification settings - Fork 15k
[libclc] update __clc_mem_fence: add MemorySemantic arg and use __builtin_amdgcn_fence for AMDGPU #152275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ltin_amdgcn_fence for AMDGPU It is necessary to add MemorySemantic argument which means the memory or address space to which the memory ordering is applied. The MemorySemantic is also necessary for implementing the SPIR-V MemoryBarrier instruction. Additionally, the implementation of __clc_mem_fence on Intel GPUs requires the MemorySemantic argument. Using __builtin_amdgcn_fence for AMDGPU is follow-up of llvm#151446 (comment) llvm-diff shows no change to nvptx64--nvidiacl.bc.
MEMORY_PRIVATE = 0x1, | ||
MEMORY_LOCAL = 0x2, | ||
MEMORY_GLOBAL = 0x4, | ||
MEMORY_CONSTANT = 0x8, | ||
MEMORY_GENERIC = 0x10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit mask, so write out as 1 << N for the values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit mask, so write out as 1 << N for the values?
done in f25de34
MEMORY_CONSTANT = 1 << 2, | ||
MEMORY_LOCAL = 1 << 3, | ||
MEMORY_GENERIC = 1 << 4, | ||
} MemorySemantic; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this MemorySemantic name copied from somewhere else? I don't think the name is quite right. It's a mask of fenced address spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this MemorySemantic name copied from somewhere else? I don't think the name is quite right.
Yes, it is copied from our closed source code where the enum macros are defined as memory semantic
.
It is also copied from https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Memory_Semantics_-id-
But the enum values here only models SubgroupMemory
, WorkgroupMemory
and CrossWorkgroupMemory
part of SPIR-V Memory_Semantics. So the name is likely not quite right.
It's a mask of fenced address spaces?
Yes, IIUC it should be a mask of fenced address space for AMDGPU.
In term of the usage in Intel GPU and modeling of SPIR-V spec, it is the memory that the fence operation affects and the memory ordering applies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer MemorySemantics
- plural - as you can combine multiple in the same operation. This is also what the SPIR-V spec calls them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer
MemorySemantics
- plural - as you can combine multiple in the same operation. This is also what the SPIR-V spec calls them.
done in 001cec4
@@ -21,4 +22,14 @@ _CLC_INLINE int getCLCMemoryScope(cl_mem_fence_flags flag) { | |||
return memory_scope; | |||
} | |||
|
|||
_CLC_INLINE MemorySemantics getCLCMemorySemantics(cl_mem_fence_flags flag) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't follow clc naming conventions (enum name too), but that already appears to be an issue in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't follow clc naming conventions (enum name too), but that already appears to be an issue in this file
Done in 02e3047
- Added
__CLC_
prefix toMemorySemantics
and its enum values - Added
__opencl_
prefix to functions in libclc/opencl/include/clc/opencl/synchronization/utils.h
…l_ prefix to functions in opencl/synchronization/utils.h
kindly ping @arsenm @frasercrmck to review the last commit. It is a renaming commit. |
It is necessary to add MemorySemantic argument for AMDGPU which means the memory or address space to which the memory ordering is applied.
The MemorySemantic is also necessary for implementing the SPIR-V MemoryBarrier instruction. Additionally, the implementation of __clc_mem_fence on Intel GPUs requires the MemorySemantic argument.
Using __builtin_amdgcn_fence for AMDGPU is follow-up of #151446 (comment)
llvm-diff shows no change to nvptx64--nvidiacl.bc.