ENH: Implement string comparison ufuncs (or almost) #21041
Merged
17 commits
cd47a3f ENH: Implement string comparison ufuncs (or almost) (seberg)
012284b MAINT: Do not use C99 tagged struct init in C++ (seberg)
1f3c0fd BENCH: Add basic string comparison benchmarks (seberg)
0dbed94 DOC,STY: Fixup string-comparisons comments based on review (seberg)
813e094 ENH: Use `memcmp` because it may be faster for the byte case (seberg)
ae8db17 TST: Improve string and unicode comparison tests. (seberg)
5ac5524 MAINT: Use switch statement based on review (seberg)
78e4a60 TST: Make unicode byte-swap test slightly more concrete (seberg)
c5ffbc5 BUG: Add `np.compare_chararrays` to test and fix typo (seberg)
e8c4737 TST: Add test for empty string comparisons (seberg)
525955c TST: Fixup string test based on martens review (seberg)
d64fd76 MAINT: Move definitions back into string_ufuncs.h (seberg)
2cc3474 MAINT: Use enum class for comparison operator templating (seberg)
77c4910 Template version of add_loop to avoid redundant code (serge-sans-paille)
28f8a18 STY: Fixup style, two spaces, error is -1 (seberg)
8458c60 STY: Small `string_ufuncs.cpp` fixups based on Serge's review (seberg)
da5503e MAINT: Fix merge conflict (ensure_dtype_nbo was removed) (seberg)
@@ -0,0 +1,45 @@
from __future__ import absolute_import, division, print_function

from .common import Benchmark

import numpy as np
import operator


_OPERATORS = {
    '==': operator.eq,
    '!=': operator.ne,
    '<': operator.lt,
    '<=': operator.le,
    '>': operator.gt,
    '>=': operator.ge,
}


class StringComparisons(Benchmark):
    # Basic string comparison speed tests
    params = [
        [100, 10000, (1000, 20)],
        ['U', 'S'],
        [True, False],
        ['==', '!=', '<', '<=', '>', '>=']]
    param_names = ['shape', 'dtype', 'contig', 'operator']
    int64 = np.dtype(np.int64)

    def setup(self, shape, dtype, contig, operator):
        self.arr = np.arange(np.prod(shape)).astype(dtype).reshape(shape)
        self.arr_identical = self.arr.copy()
        self.arr_different = self.arr[::-1].copy()

        if not contig:
            self.arr = self.arr[..., ::2]
            self.arr_identical = self.arr_identical[..., ::2]
            self.arr_different = self.arr_different[..., ::2]

        self.operator = _OPERATORS[operator]

    def time_compare_identical(self, shape, dtype, contig, operator):
        self.operator(self.arr, self.arr_identical)

    def time_compare_different(self, shape, dtype, contig, operator):
        self.operator(self.arr, self.arr_different)
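To make it easier to see what the benchmark above actually times, here is a minimal standalone sketch that repeats the array construction from `setup()` on a tiny shape and applies a few of the operators. It is an illustration only, not part of the PR: the shape `(2, 4)` and the variable names `arr_nc`, `arr_different_nc`, `eq_identical`, `eq_different`, `lt_different` and `lexicographic` are chosen here for the example, and it runs outside the `Benchmark` harness.

```python
import operator

import numpy as np

# Same construction as the benchmark's setup(), but on a tiny
# illustrative shape (the benchmark itself uses 100, 10000, (1000, 20)).
shape = (2, 4)
arr = np.arange(np.prod(shape)).astype("U").reshape(shape)
arr_identical = arr.copy()
arr_different = arr[::-1].copy()  # reverses the first axis only

# Strided views along the last axis, as in the `contig=False` case.
arr_nc = arr[..., ::2]
arr_different_nc = arr_different[..., ::2]

# Element-wise comparisons return boolean arrays of the same shape.
eq_identical = operator.eq(arr, arr_identical)
eq_different = operator.eq(arr, arr_different)
lt_different = operator.lt(arr_nc, arr_different_nc)

# Ordering is lexicographic on the characters, not numeric.
lexicographic = np.array(["10"]) < np.array(["9"])

print(arr)
print(eq_identical.all(), eq_different.any())
print(lt_different)
print(arr_nc.flags.c_contiguous, lexicographic)
```

For 1-D shapes such as `100`, `arr[::-1]` reverses the individual elements; for `(1000, 20)` it reverses the order of the rows, so in both cases the "different" operand is a permutation of the same strings.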
This change seems unrelated to the refactoring part and probably deserves a commit on its own
True, although I don't care about when/how it gets in, hehe