-
Notifications
You must be signed in to change notification settings - Fork 548
Add allocV2 and freeV2 which return cl_mem on OpenCL backends #2911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
} | ||
|
||
void memLock(const void *ptr) { | ||
memoryManager().userLock(const_cast<void *>(ptr)); | ||
void memLock(const cl::Buffer *ptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the road block for having something like the following:
- use cl_mem everywhere starting from memory manager to internal backend API
- only wrap them in
cl::Buffer(mem_handle, true)
locally where they are needed as cl::Buffers
I haven't analyzed all the pros and cons of above approach. was throwing it out there based on the point that hjaving cl_mem everyone in internal API will make it consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think cl::Buffers are fine internally as long as the native functions are using cl_mem. The mem* functions are used internally and they should be returning cl::Buffers because that is most convenient for our code base. The reason I moved to the nativeAlloc interface is because the memory manager should be compatible with the base OpenCL specification. The C++ interface is not stable enough for that.
add_test(NAME ${target} COMMAND ${target}) | ||
endif() | ||
endforeach() | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What am I missing here ? It feels almost identical to make_test
macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cuda_add_executable(${target} cuda.cu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too much redundancy for single line difference :(
|
||
#pragma GCC diagnostic push | ||
#pragma GCC diagnostic ignored "-Wdeprecated-declarations" | ||
TEST(Memory, AfAllocDeviceCUDA) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like AllocDevicePointerUsability
instead of AfAllocDeviceCUDA
is more readable name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its testing af_alloc_device. I think the name is appropriate here.
@@ -26,7 +26,7 @@ TEST(Memory, recover) { | |||
vec[i] = randu(1024, 1024, 256); // Allocating 1GB | |||
} | |||
|
|||
ASSERT_EQ(true, false); // Is there a simple assert statement? | |||
FAIL(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
// This behavior will change in the future | ||
ASSERT_EQ(payload->lastNdims, 1); | ||
ASSERT_EQ(payload->lastDims, af::dim4(aSize * aSize * aSize * aSize)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests: E2ETest*D
and E2ETest4DComplexDouble
can use a better name like MostRecentArrayCreated_*D
or ``MostRecentArray_*D`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most recent array created is not an appropriate name because it is only one of the many things it tests. in this case we are testing the behavior of the custom memory manager in the case we create a 4D array.
ASSERT_EQ(payload->lastDims, af::dim4(aSize * aSize * aSize * aSize)); | ||
} | ||
|
||
TEST_F(MemoryManagerApi, E2ETestMultipleAllocations) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Don't feel like the prefix E2ETest
is adding anything to the readability factor..
|
||
#pragma GCC diagnostic push | ||
#pragma GCC diagnostic ignored "-Wdeprecated-declarations" | ||
TEST(Memory, AfAllocDeviceCPUC) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AfAllocDeviceCPUC
what does C
in the end mean ? for this and the following tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uses the C api instead of the C++ api
@@ -136,3 +139,123 @@ TEST(OCLCheck, DevicePlatform) { | |||
#else | |||
TEST(OCLExtContext, NoopCPU) {} | |||
#endif | |||
|
|||
#pragma GCC diagnostic push | |||
#pragma GCC diagnostic ignored "-Wdeprecated-declarations" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to figure out a generic way (a custom drop in marco perhaps) to do this for gcc, clang and msvc if possible. Instead of adding compiler specific pragmas in multiple locations.
Not as a part of this PR, of course.
3d54583
to
f0b2664
Compare
* Improve documentation of the alloc and free function * Add tests for memory operations
* Older alloc functions were returning cl::Buffer objects. This behavior is deprecated in favor of cl_mem objects on the OpenCL backend
The af_alloc_device functions were returning C++ objects instead of C objects. This behavior is now decremented in favor of cl_mem objects in the OpenCL backends. The behavior of the CUDA and CPU backends have not changed.
Updated documentation to clarify proper usage and behavior
Deprecated af_alloc_device af_free_device
Added several memory tests.