
SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit #17340

Merged
merged 1 commit into from
Oct 25, 2020


seiko2plus
Member

@seiko2plus seiko2plus commented Sep 17, 2020

This patch implements NPYV intrinsics for partial and non-contiguous memory access,
which paves the way to replace the raw SIMD kernels in simd.inc.src with the universal intrinsics.

required by #16247
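
For readers unfamiliar with the terms, here is a rough scalar sketch in C of what "partial" and "non-contiguous" memory access mean. It is not the NPYV implementation: the type `vec_s32`, the functions `load_till`, `store_till`, `loadn`, and `storen`, and the fixed width of four 32-bit lanes are all hypothetical names and assumptions chosen for illustration. The real intrinsics map to platform-specific instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 128-bit register holding four 32-bit lanes. */
#define LANES 4

/* Hypothetical stand-in for a SIMD register; not an NPYV type. */
typedef struct { int32_t lane[LANES]; } vec_s32;

/* Partial load: read only the first `nlane` elements from memory and
 * fill the remaining lanes with `fill`, so no out-of-bounds read occurs. */
static vec_s32 load_till(const int32_t *ptr, size_t nlane, int32_t fill)
{
    assert(nlane > 0);
    vec_s32 v;
    for (size_t i = 0; i < LANES; ++i) {
        v.lane[i] = (i < nlane) ? ptr[i] : fill;
    }
    return v;
}

/* Partial store: write only the first `nlane` lanes back to memory. */
static void store_till(int32_t *ptr, size_t nlane, vec_s32 v)
{
    assert(nlane > 0);
    for (size_t i = 0; i < LANES && i < nlane; ++i) {
        ptr[i] = v.lane[i];
    }
}

/* Non-contiguous load: gather lanes from memory `stride` elements apart. */
static vec_s32 loadn(const int32_t *ptr, ptrdiff_t stride)
{
    vec_s32 v;
    for (ptrdiff_t i = 0; i < LANES; ++i) {
        v.lane[i] = ptr[i * stride];
    }
    return v;
}

/* Non-contiguous store: scatter lanes to memory `stride` elements apart. */
static void storen(int32_t *ptr, ptrdiff_t stride, vec_s32 v)
{
    for (ptrdiff_t i = 0; i < LANES; ++i) {
        ptr[i * stride] = v.lane[i];
    }
}
```

With `src = {1, 2, 3, 4, 5, 6, 7, 8}`, `load_till(src + 6, 2, 0)` yields the lanes `{7, 8, 0, 0}` without reading past the end of the array, and `loadn(src, 2)` yields `{1, 3, 5, 7}`. Partial access handles the tail of a loop whose length is not a multiple of the vector width, while strided access handles array views whose elements are not adjacent in memory.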

@seiko2plus seiko2plus marked this pull request as draft September 17, 2020 10:29
@seiko2plus seiko2plus force-pushed the npyv_partial_noncont_mem branch 11 times, most recently from ed975c8 to b699b95 Compare September 25, 2020 00:47
@mattip mattip added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Sep 25, 2020
@seiko2plus seiko2plus force-pushed the npyv_partial_noncont_mem branch from b699b95 to b7761ba Compare October 7, 2020 08:32
@mattip
Member

mattip commented Oct 7, 2020

This seems to build on PR gh-16782, correct?

@seiko2plus
Member Author

seiko2plus commented Oct 7, 2020

@mattip, yes, this pull request temporarily merges #16782 so that I can test the new intrinsics.

@seiko2plus seiko2plus force-pushed the npyv_partial_noncont_mem branch 4 times, most recently from 19fd9fd to e7e4699 Compare October 8, 2020 13:44
@seiko2plus
Member Author

All tests pass. I will move the unit tests for the new intrinsics to #16782 so we can merge this PR.
https://travis-ci.org/github/numpy/numpy/builds/733983312
https://github.com/numpy/numpy/pull/17340/checks?check_run_id=1226297258

@seiko2plus seiko2plus force-pushed the npyv_partial_noncont_mem branch from 0744337 to 24b5841 Compare October 9, 2020 00:03
@seiko2plus seiko2plus marked this pull request as ready for review October 9, 2020 00:03
@seiko2plus seiko2plus changed the title WIP:SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit SIMD: Add partial/non-contig load and store intrinsics for 32/64-bit Oct 9, 2020
…-bit

   This patch improves the implementation of memory load/store for VSX
@seiko2plus
Member Author

@mattip, these intrinsics have already been used by #17587 and #16247 and have proved efficient. On AVX2 and AVX512F they perform almost the same as the raw SIMD code they replace, and on SSE and VSX they provide massive improvements for non-contiguous memory access.
NEON/ASIMD, on the other hand, shows acceptable improvements, though nothing as dramatic.

I hope we can merge this pull request as soon as possible.

@charris
Member

charris commented Oct 25, 2020

@seiko2plus I notice that you are still making commits here. Do you feel that there is more to do?

@mattip
Member

mattip commented Oct 25, 2020

I was hoping to merge #16782 first, thinking that then we might be able to add some (maybe marked @slow) tests using that infrastructure here. Does that make sense?

@seiko2plus
Member Author

seiko2plus commented Oct 25, 2020

@charris, no, the last change I made to this PR was 17 days ago:

 seiko2plus force-pushed the seiko2plus:npyv_partial_noncont_mem branch from bec733b to 1b8637d 17 days ago

The other messages appear because #16247 and #17587 are built on top of this PR (commit references).

@mattip,

> I was hoping to merge #16782 first,

I totally agree with you; without test cases it would be chaos.

> thinking that then we might be able to add some (maybe marked @slow) tests using that infrastructure here

There's no need for @slow. The tests in #16782 run very quickly, currently taking 1 to 5 seconds depending on
the enabled SIMD extensions. The only issues are the binary size and maybe the build time.

@charris charris merged commit fcba5a6 into numpy:master Oct 25, 2020
@charris
Member

charris commented Oct 25, 2020

Thanks Sayed.

Labels
01 - Enhancement component: SIMD Issues in SIMD (fast instruction sets) code or machinery
4 participants