-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Replaced np.ndarray
with memview where applicable in linear_model/_cd_fast.pyx
#23147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
np.ndarray
with memview where applicable in linear_model/_cd_fast.pyx
np.ndarray
with memview where applicable in linear_model/_cd_fast.pyx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to add the wrapper when in the calls of enet_coordinate_descent
and enet_coordinate_descent_multitask
since we currently have the same error when passing a dense read-only array. The test then can be extended to cover the dense case
def make_read_only(obj, attr): | ||
curr_attr = getattr(obj, attr) | ||
RO_attr = create_memmap_backed_data(curr_attr) | ||
setattr(obj, attr, RO_attr) | ||
|
||
make_read_only(X, "data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you only make "data" read-only, I think the helper is no longer necessary and you can write it as flat code
For the dense case, don't we modify |
right, my bad. We do support read-only, but only if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am overall +0.5 on this PR. I see ReadonlyArrayWrapper
as a quick hack, so that we can use memoryviews in sklearn/_loss/_loss.pyx.tp
. I prefer not to extend to use ReadonlyArrayWrapper
too much in the library.
As for this PR, I think the tests are worth adding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @Micky774.
I am also not a fan of ReadonlyArrayWrappers
, but if this adds support for read-only datasets, this LTGTM.
To answer your question:
Would it be worth disabling boundscheck for any of these functions? Afaik there's not much of a risk of an out-of-bounds access in these functions.
#21512 centralised all the cython compiler directives. By default, all the checks are deactivated:
scikit-learn/sklearn/_build_utils/__init__.py
Lines 69 to 88 in 5f75acd
# Additional checks for Cython | |
cython_enable_debug_directives = ( | |
os.environ.get("SKLEARN_ENABLE_DEBUG_CYTHON_DIRECTIVES", "0") != "0" | |
) | |
config.ext_modules = cythonize( | |
config.ext_modules, | |
nthreads=n_jobs, | |
compile_time_env={ | |
"SKLEARN_OPENMP_PARALLELISM_ENABLED": sklearn._OPENMP_SUPPORTED | |
}, | |
compiler_directives={ | |
"language_level": 3, | |
"boundscheck": cython_enable_debug_directives, | |
"wraparound": False, | |
"initializedcheck": False, | |
"nonecheck": False, | |
"cdivision": True, | |
}, | |
) |
…del/_cd_fast.pyx` (scikit-learn#23147) Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…inear_model/_cd_fast.pyx` (scikit-learn#23147)" This reverts commit 3bb4bad.
…inear_model/_cd_fast.pyx` (scikit-learn#23147)" This reverts commit 3bb4bad.
Reference Issues/PRs
Addresses #10624
What does this implement/fix? Explain your changes.
Replaces
np.ndarray
typing with corresponding memviews insparse_enet_coordinate_descent
andenet_coordinate_descent_multi_task
.Any other comments?
Would it be worth disabling
boundscheck
for any of these functions? Afaik there's not much of a risk of an out-of-bounds access in these functions.