ENH: Implement essential intrinsics required by the upcoming SIMD optimizations(0) #22306

Merged
merged 3 commits into from
Sep 25, 2022

Conversation

seiko2plus
Member

This pull request adds:

  • intrinsics to test for truth across all vector lanes
    npyv_any_##SFX: returns true if any of the elements is not equal to zero
    npyv_all_##SFX: returns true if all elements are not equal to zero

  • max/min that reverse IEC 60559's NaN behavior (propagate NaNs) for float data types
    npyv_maxn_##SFX
    npyv_minn_##SFX

  • max/min reduction for all float and integer vector data types
    npyv_reduce_max_##SFX
    npyv_reduce_min_##SFX

  • max/min reduction that supports IEC 60559 for float data types
    npyv_reduce_maxp_##SFX
    npyv_reduce_minp_##SFX

  • max/min reduction that reverses IEC 60559's NaN behavior (propagates NaNs) for float data types
    npyv_reduce_maxn_##SFX
    npyv_reduce_minn_##SFX

  • intrinsics to extract the first vector lane:
    npyv_extract0_##SFX
    npyv_extract0_##SFX
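
The semantics listed above can be sketched with a scalar reference model. This is only an illustration in plain Python, not the C implementation from this PR: the function names (`any_lanes`, `max_p`, `max_n`, and so on) are hypothetical, a Python list stands in for a vector register, and the "p" variants assume the usual IEC 60559 reading in which NaN is returned only when both operands are NaN.

```python
import math
from functools import reduce


def any_lanes(lanes):
    """Model of npyv_any_##SFX: true if any lane is not equal to zero."""
    return any(x != 0 for x in lanes)


def all_lanes(lanes):
    """Model of npyv_all_##SFX: true if all lanes are not equal to zero."""
    return all(x != 0 for x in lanes)


def max_p(a, b):
    """Assumed IEC 60559 behavior: a NaN operand is ignored."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def max_n(a, b):
    """Reversed behavior: a NaN in either operand propagates."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return max(a, b)


def reduce_lanes(op, lanes):
    """Model of a horizontal reduction: fold one binary op over all lanes."""
    return reduce(op, lanes)


def extract0(lanes):
    """Model of npyv_extract0_##SFX: return the first lane."""
    return lanes[0]
```

With lanes `[1.0, nan, 3.0]`, folding `max_p` yields `3.0`, while folding `max_n` yields NaN. The min variants follow the same pattern with `min` in place of `max`.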

And removes:

  • local implementation of max/min reduce intrinsics

@seiko2plus seiko2plus added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Sep 17, 2022
@seiko2plus seiko2plus force-pushed the npyv_new_intrinsics_sep2022_vol0 branch 9 times, most recently from 57e0148 to e3ad145 Compare September 19, 2022 05:47
@seiko2plus seiko2plus marked this pull request as ready for review September 19, 2022 05:48
@seiko2plus seiko2plus force-pushed the npyv_new_intrinsics_sep2022_vol0 branch from e3ad145 to 6ef4c8b Compare September 19, 2022 06:27
@seiko2plus
Member Author

cc @mattip

@charris
Member

charris commented Sep 19, 2022

The errors are unrelated; they are due to log length limitations on Travis. I made a conservative fix for that before, but it looks like it needs to be more drastic.

@charris
Member

charris commented Sep 20, 2022

close/reopen

@charris charris closed this Sep 20, 2022
@charris charris reopened this Sep 20, 2022
continue
_min = self.min(vdata_a, vdata_b)
assert _min == data_min
chk_nan = {"xp": 1, "np": 1, "nn": 2, "xn": 2}.get(intrin[-2:], 0)
Member

Would probably be clearer as part of the parametrize, but it doesn't matter. I am mostly curious what min/max implement for float values? Does the result depend on the order?

In any case, I looked at the code and it does look good to me (not that I am very fluent in SIMD). Not sure if the tests cover all the permutations they could for reductions, but I also trust our integration tests for that.

@mattip will you have another quick look?

Member Author

I am mostly curious what min/max implement for float values? Does the result depend on the order?

No, it doesn't. Just check the tail of the intrinsic name to determine the NaN behavior.

Not sure if the tests cover all the permutations they could for reductions

I'm positive about it.
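
The name-tail convention from the reply above can be sketched as a small lookup that mirrors the `chk_nan` line under review. The helper name `nan_mode` and the text labels are hypothetical, and reading code `1` as the IEC 60559 ("p") variants and code `2` as the NaN-propagating ("n") variants is an interpretation based on the PR description, not something the test code states.

```python
# Same mapping as the chk_nan expression in the reviewed test line.
_NAN_CODES = {"xp": 1, "np": 1, "nn": 2, "xn": 2}

# Hypothetical labels, inferred from the PR description.
_NAN_LABELS = {
    0: "no NaN check",
    1: "IEC 60559 (p suffix)",
    2: "propagates NaNs (n suffix)",
}


def nan_mode(intrin):
    """Return the NaN-check code for an intrinsic name from its last two characters."""
    return _NAN_CODES.get(intrin[-2:], 0)


def describe(intrin):
    """Return a human-readable label for the NaN handling implied by the name."""
    return _NAN_LABELS[nan_mode(intrin)]
```

For example, `nan_mode("reduce_maxp")` looks up `"xp"` and `nan_mode("minn")` looks up `"nn"`, while names such as `"max"` or `"reduce_min"` fall through to the default of `0`.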

Member Author

@seiko2plus seiko2plus Sep 25, 2022

I'm working on refactoring the whole testing unit, starting from the _simd module, to rely more on parametrization than on inheritance.
See the new numpy/core/tests/test_simd.py, part of #21057.

@mattip mattip merged commit d66ca35 into numpy:main Sep 25, 2022
@mattip
Member

mattip commented Sep 25, 2022

Thanks @seiko2plus

Labels
01 - Enhancement component: SIMD Issues in SIMD (fast instruction sets) code or machinery
4 participants