-
Notifications
You must be signed in to change notification settings - Fork 15k
[libclc] Move mem_fence and barrier to clc library #151446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
__clc_mem_fence and __clc_barrier function have two parameters Scope and MemorySemantics, which are defined in SPIR-V spec. The design allows the clc functions to implement SPIR-V ControlBarrier and MemoryBarrier functions in the future. The default memory ordering in clc is set to SequentiallyConsistent, which is also the default and strongest ordering in OpenCL and C++. The default memory scope in clc is set to memory_scope_device for amdgcn and ptx-nvidiacl since __opencl_c_atomic_scope_all_devices feature macro is not defined for these targets. llvm-diff shows no change to amdgcn--amdhsa.bc and nvptx64--nvidiacl.bc.
#define __CLC_MEM_FENCE_CLC_MEM_SCOPE_SEMANTICS_H__ | ||
|
||
// Scope values are defined in SPIR-V spec. | ||
typedef enum Scope { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should stick to OpenCL spec and terminology; I thought clang already had predefined enums for both of these
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should stick to OpenCL spec and terminology; I thought clang already had predefined enums for both of these
done, thanks @arsenm
Changed to use clang macro __MEMORY_SCOPE_DEVICE/__MEMORY_SCOPE_WRKGRP and __ATOMIC_SEQ_CST
…s with clang macros
|
||
#include <clc/mem_fence/clc_mem_fence.h> | ||
|
||
void __clc_amdgcn_s_waitcnt(unsigned flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should get rid of this; it can be properly implemented with __builtin_amdgcn_fence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should get rid of this; it can be properly implemented with
__builtin_amdgcn_fence
I'll follow up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should get rid of this; it can be properly implemented with
__builtin_amdgcn_fence
I'll follow up
re-implement with __builtin_amdgcn_fence: #152275
…ltin_amdgcn_fence for AMDGPU It is necessary to add MemorySemantic argument which means the memory or address space to which the memory ordering is applied. The MemorySemantic is also necessary for implementing the SPIR-V MemoryBarrier instruction. Additionally, the implementation of __clc_mem_fence on Intel GPUs requires the MemorySemantic argument. Using __builtin_amdgcn_fence for AMDGPU is follow-up of llvm#151446 (comment) llvm-diff shows no change to nvptx64--nvidiacl.bc.
…ltin_amdgcn_fence for AMDGPU (#152275) It is necessary to add MemorySemantic argument for AMDGPU which means the memory or address space to which the memory ordering is applied. The MemorySemantic is also necessary for implementing the SPIR-V MemoryBarrier instruction. Additionally, the implementation of __clc_mem_fence on Intel GPUs requires the MemorySemantic argument. Using __builtin_amdgcn_fence for AMDGPU is follow-up of #151446 (comment) llvm-diff shows no change to nvptx64--nvidiacl.bc.
…d use __builtin_amdgcn_fence for AMDGPU (#152275) It is necessary to add MemorySemantic argument for AMDGPU which means the memory or address space to which the memory ordering is applied. The MemorySemantic is also necessary for implementing the SPIR-V MemoryBarrier instruction. Additionally, the implementation of __clc_mem_fence on Intel GPUs requires the MemorySemantic argument. Using __builtin_amdgcn_fence for AMDGPU is follow-up of llvm/llvm-project#151446 (comment) llvm-diff shows no change to nvptx64--nvidiacl.bc.
__clc_mem_fence and __clc_work_group_barrier function have two
parameters memory_scope and memory_order. The design allows the clc
functions to implement SPIR-V ControlBarrier and MemoryBarrier
functions in the future.
The default memory ordering in clc is set to __ATOMIC_SEQ_CST, which is
also the default and strongest ordering in OpenCL and C++.
OpenCL cl_mem_fence_flags parameter is converted to combination of
__MEMORY_SCOPE_DEVICE and __MEMORY_SCOPE_WRKGRP, which is passed to clc.
llvm-diff shows no change to nvptx64--nvidiacl.bc.
llvm-diff show a small change to amdgcn--amdhsa.bc and the number of
LLVM IR instruction is reduced by 1: https://alive2.llvm.org/ce/z/_Uhqvt