Fixes sub array (opencl, cuda, cpu, oneapi) for orb #3670
Using sub-arrays in the orb function results in undefined behaviour in the opencl, cpu and cuda backends.
Description
Using sub-arrays as input or as filters produces random behaviour.
The orb function uses the following sub-functions, each of which needed individual sub-array testing (and corrections where needed):
resize(): Works as expected. A final PR will contain all extended test scripts that did not need corrections.
28bd6b0: Adds test helpers for temporary array formats (JIT, SUB, ...)
Adds the function toTempFormat to the test helpers. This function generates the different temporary array formats (JIT array, SUB-array, Linear array, ...) used by arrayfire.
07b089f: Increased difficulty of sub-array testing
Some random behaviour persisted, so the test conditions for sub-arrays are now harder:
- When using without the offset, the full array will be random data
- When reallocating a data buffer based on dim[0]*dim[1]... size, while reusing the info structure, the buffer will be accessed outside boundaries (hoping this generates a segmentation fault)
1a4358c: Fixes sub-array (cuda, opencl) support for harris
- Linear temporary buffers copied the info struct from in, while they are linear and in is strided.
- the second_order_deriv kernel was called with a blocksize based on the number of elements of the parent of in, instead of in itself (dim[3]*strides[3])
- many memory allocations were based on the number of elements of the parent of in, instead of in itself (dim[3] * strides[3])
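The difference between the two element counts, and why a linear temporary cannot reuse the info struct of a strided input, can be sketched in plain C++. This is only an illustration: `Info`, `elements`, `strided_extent` and `make_linear_info` are hypothetical names, not the types or helpers used in the ArrayFire backends.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a backend array descriptor (not ArrayFire's type).
struct Info {
    std::array<std::size_t, 4> dims;
    std::array<std::size_t, 4> strides;  // in elements, inherited from the parent
    std::size_t offset;                  // in elements, start of the sub-array
};

// Number of elements the array itself holds.
std::size_t elements(const Info& in) {
    return in.dims[0] * in.dims[1] * in.dims[2] * in.dims[3];
}

// Extent derived from the strides; for a sub-array this follows the
// parent's layout and can be larger than elements(in).
std::size_t strided_extent(const Info& in) {
    return in.dims[3] * in.strides[3];
}

// Descriptor for a freshly allocated linear temporary with the same dims:
// strides are recomputed from dims and the offset is reset to zero.
Info make_linear_info(const Info& in) {
    Info out;
    out.dims    = in.dims;
    out.strides = {1, in.dims[0], in.dims[0] * in.dims[1],
                   in.dims[0] * in.dims[1] * in.dims[2]};
    out.offset  = 0;
    return out;
}
```

For a 4x3 view into a 10x10 parent (strides `{1, 10, 100, 100}`), `elements` gives 12 while `strided_extent` gives 100, so sizing a launch or an allocation from the strides would not match the view.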
1fd4bfd: Fixes sub-array (cpu, cuda, opencl) support for fast
cuda & cpu:
- in idx(), test_pixel(), the elements of in were directly (linear) accessed without using strides
opencl:
- in load_shared_image(), the elements of in were directly (linear) accessed without using strides and offset
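A minimal host-side sketch of the two access patterns, assuming a 2D view described by an element offset and two strides. `load_linear` and `load_strided` are hypothetical helpers written for this illustration; they are not the kernels' actual code.

```cpp
#include <cstddef>

// Linear access: only correct when the array is contiguous and starts at base.
template <typename T>
T load_linear(const T* base, std::size_t x, std::size_t y, std::size_t dim0) {
    return base[y * dim0 + x];
}

// Strided access: honours the sub-array's offset and the parent's strides.
template <typename T>
T load_strided(const T* base, std::size_t x, std::size_t y,
               std::size_t offset, std::size_t stride0, std::size_t stride1) {
    return base[offset + y * stride1 + x * stride0];
}
```

With a 5x4 parent whose elements hold their own linear index, a 3x2 view starting at column 1, row 1 has offset 6 and strides (1, 5). Element (2, 1) of the view is parent element 13; the linear formula reads parent element 5 instead.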
2e72a4d: Fixes sub-array (cpu, cuda, opencl, oneapi) support for convolve
- Kernel object is extended with 2D and 3D memory copy functions
cuda:
- in conv2Helper(), the elements of in were directly (linear) accessed without using strides
- In conv2Helper(), the strided 2D memcopy function is added when needed
- in convolve_3d, convolve2(), the linear memcopy is replaced with strided 3D memcopy function
opencl:
- in conv1(), conv2(), conv3() the linear memcopy is grouped into convNHelper<> function (same structure as in cuda, ...)
- in convNHelper<> the linear memcopy is replaced with strided one
- In conv2Helper(), convSep() the strided 2D memcopy function is added when needed
cpu:
- in convolve2_separable(), the elements of in were directly (linear) accessed without using strides
oneapi:
- the locally defined linear memcopy function (which also included conversion) is replaced by the strided memory function defined in kernel/memcopy.hpp. This strided function is also optimized for linear copies and performs only a copy, without conversions (improved speed).
- in conv1(), conv2(), conv3(), convSep() the linear copy is replaced with the strided one
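The idea behind a strided 2D copy with a fast path for linear data can be sketched on the host as follows. `copy2d_strided` is a hypothetical function for illustration only; it assumes a unit stride along the first dimension and is not the memcopy implementation of any backend.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Copies a dim0 x dim1 block from a (possibly strided) source view into dst.
// Assumption: stride along dimension 0 is 1 for both source and destination.
template <typename T>
void copy2d_strided(T* dst, std::size_t dst_stride1,
                    const T* src, std::size_t src_offset,
                    std::size_t src_stride1,
                    std::size_t dim0, std::size_t dim1) {
    assert(dst_stride1 >= dim0 && src_stride1 >= dim0);
    const T* first = src + src_offset;

    // Fast path: both sides are contiguous, so one linear copy is enough.
    if (dst_stride1 == dim0 && src_stride1 == dim0) {
        std::copy_n(first, dim0 * dim1, dst);
        return;
    }

    // General path: copy one contiguous row at a time, stepping by the strides.
    for (std::size_t j = 0; j < dim1; ++j) {
        std::copy_n(first + j * src_stride1, dim0, dst + j * dst_stride1);
    }
}
```

Copying a 3x2 view (offset 6, row stride 5) out of a 5x4 parent into a 6-element linear buffer picks up parent elements 6, 7, 8, 11, 12 and 13, whereas a linear copy of 6 elements from the same start would pick up 6 through 11.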
ad1413f: Fixes sub-array (cpu, cuda, opencl, oneapi) support for orb
- in harris_response(), centroid_angle(), get_pixel() the elements of in were directly (linear) accessed without using strides
- Adds the offset to the cl code files
- in orb (opencl) the info struct was reused, while the buffer allocation was linear
Is this a new feature or a bug fix?
BUG
Why these changes are necessary.
PREVIOUSLY THE RESULT WAS UNDEFINED FOR THE ABOVE SITUATION
Potential impact on specific hardware, software or backends.
ALL platforms (ONEAPI does not support orb)
New functions and their functionality.
NO
Can this PR be backported to older versions?
NO
Future changes not implemented in this PR.
NONE, although the function toTempFormat will be used in following PRs.
Changes to Users
Quality improvement
Checklist