[MRG] FIX segmentation fault on memory mapped contiguous memoryview #21654

lorentzenchr · 2021-11-13T17:05:24Z

What does this implement/fix? Explain your changes.

Hunting for the segmentation fault in #20567 (comment).

Edit: The fix is to not use joblib for creating readonly (memory mapped) arrays, but to use np.memmap instead, at least where aligned arrays are required.
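For illustration, a minimal sketch of that approach, assuming the data is a single NumPy array. The helper name `make_aligned_memmap` is hypothetical and is not the function added by this PR. The temporary folder is not cleaned up here; that is left to the caller.

```python
import os
import tempfile

import numpy as np


def make_aligned_memmap(data, mmap_mode="r"):
    """Hypothetical helper: back `data` by a raw file and reopen it as a memmap."""
    data = np.ascontiguousarray(data)
    folder = tempfile.mkdtemp(prefix="aligned_memmap_")
    filename = os.path.join(folder, "data.dat")
    # A raw np.memmap file has no header, so the array data starts at file
    # offset 0 and the mapping itself starts on a page boundary.
    fp = np.memmap(filename, dtype=data.dtype, mode="w+", shape=data.shape)
    fp[:] = data[:]  # write the data into the memory mapped file
    fp.flush()
    del fp  # drop the writable mapping before reopening the file
    return np.memmap(filename, dtype=data.dtype, mode=mmap_mode, shape=data.shape)


x = np.arange(10, dtype=np.float64)
x_mmap = make_aligned_memmap(x, mmap_mode="r")
print(x_mmap.flags.aligned, x_mmap.flags.writeable)
```

Because the file holds only the raw array bytes, the data pointer of the resulting array coincides with the start of the mapping, which is why the alignment of the input carries over.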

Any other comments?

A segmentation fault happens on Ubuntu_Bionic py37_conda_forge_openblas_ubuntu_1804 when a memory mapped readonly array (via joblib) is passed to a contiguous memoryview in Cython.

lorentzenchr · 2021-11-13T17:30:03Z

This confirms it: The combination of Ubuntu 1804 (or some particular package combination in the CI) and joblib memory mapped read-only arrays passed to contiguous Cython memoryviews (e.g. double x[::1]) and the ReadonlyArrayWrapper from #20903 causes a segmentation fault, see CI buildid=34976.

I'm very sorry to be the one who somehow introduced this bug in scikit-learn.

@jjerphan @TomDLT @thomasjpfan @ogrisel @jeremiedbb @lesteve @da-woods friendly ping

lorentzenchr · 2021-11-13T18:17:17Z

With boundscheck=True, test_readonly_array_wrapper passes successfully, but a lot of kmeans tests fail for various reasons.

da-woods · 2021-11-14T09:18:06Z

Can I propose a slightly modified version of ReadonlyArrayWrapper? This is just for you to test - I'm taking a wild guess at the problem

from cpython.ref cimport Py_INCREF
...  # other cimports as before

cdef class ReadonlyArrayWrapper:
    cdef object wraps

    def __init__(self, wraps):
        self.wraps = wraps

    def __getbuffer__(self, Py_buffer *buffer, int flags):
        request_for_writeable = False
        if flags & PyBUF_WRITABLE:
            flags ^= PyBUF_WRITABLE
            request_for_writeable = True
        PyObject_GetBuffer(self.wraps, buffer, flags)
        if request_for_writeable:
            # The following is a lie when self.wraps is readonly!
            buffer.readonly = False
            buffer.obj = self

    def __releasebuffer__(self, Py_buffer *buffer):
        # restore the state when the buffer was created
        Py_INCREF(self)  # because reassigning buffer.obj decrefs self, and
                        # the specification of __releasebuffer__ says we shouldn't do that
        buffer.obj = self.wraps
        buffer.readonly = True
        PyBuffer_Release(buffer)

Essentially what I'm doing is

Reassigning the buffer obj to be self (so that when the buffer is released it calls this class rather than wraps). This is a switch from the "re-export" to the "redirect" scheme in https://docs.python.org/3/c-api/typeobj.html#c.PyBufferProcs.bf_getbuffer
In the __releasebuffer__ function I'm restoring the state to how it was before we started messing with it.

My thought is that we should give back exactly the same buffer as we got - maybe the mmap is keeping a cache of buffers that it's previously provided, and returning a modified buffer is messing with the cache. I don't quite know how the contiguousness fits with that.

To re-iterate - this is an untested theory but it seems worth a try! (I've given the code a quick test with Numpy and it seems to work, but I don't have joblib installed and have never used it so can't easily test it there)

This reverts commit d9f351e.

lorentzenchr · 2021-11-14T12:09:36Z

@da-woods Unfortunately, this does not fix it. It still gives Fatal Python error: Segmentation fault. It's hard to judge which component is broken: ReadonlyArrayWrapper, joblib, Cython or some other software component of the Ubuntu 1804 CI setup.

da-woods · 2021-11-14T15:23:01Z

Unfortunately, this does not fix it.

That's a shame. Not a huge surprise since it was a wild guess though.

I've tried to reproduce the test locally (in a cutdown version independent from scikit-learn) and I've failed to reproduce it.

My next suggestion for things to test: what happens if you create writeable mmap backed data, and skip creating the array wrapper. Something like:

@pytest.mark.parametrize("dtype", [np.float32, np.float64, np.int32, np.int64])
def test_contig_mmapped(dtype):
    x = np.arange(10).astype(dtype)
    sum_origin = _test_sum(x)
    # writeable memory mapped data, without the readonly wrapper
    x_mmap = create_memmap_backed_data(x, mmap_mode="r+")
    sum_mmap = _test_sum(x_mmap)
    assert sum_mmap == pytest.approx(sum_origin, rel=1e-11)

That might tell you whether the issue is creating a contiguous memoryview of (possibly misaligned) mmap-backed data, or the ReadonlyArrayWrapper. (It's possible that this is tested elsewhere of course and I just don't know about it)

This reverts commit e70770f.

da-woods · 2021-11-14T16:22:59Z

Looks like the crash happens in test_contig_mmapped and so isn't related to the readonly wrapper. That suggests to me that it's just related to the alignment issue discussed in the other thread (although I have no idea what exact combination of compiler and CFLAGS cause it).
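As a small illustration of what "misaligned mmap-backed data" means here, the sketch below maps the same values at byte offset 0 and at byte offset 1 of a file and reports NumPy's `ALIGNED` flag. The helper `memmap_flags` and the one-byte fake header are assumptions made for this example; the sketch does not reproduce the segfault, it only shows that an array can be C-contiguous and still not aligned.

```python
import os
import tempfile

import numpy as np


def memmap_flags(offset_bytes, n=10, dtype=np.float64):
    """Return (aligned, c_contiguous, sum) for a memmap that starts
    `offset_bytes` into a file (hypothetical helper for illustration)."""
    dtype = np.dtype(dtype)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.dat")
        with open(path, "wb") as f:
            f.write(b"\0" * offset_bytes)  # stand-in for a file header
            f.write(np.arange(n, dtype=dtype).tobytes())
        mm = np.memmap(path, dtype=dtype, mode="r", offset=offset_bytes, shape=(n,))
        result = (
            bool(mm.flags.aligned),
            bool(mm.flags.c_contiguous),
            float(mm.sum()),
        )
        del mm  # release the mapping before the folder is removed
    return result


print(memmap_flags(0))  # data starts on a page boundary
print(memmap_flags(1))  # data starts one byte later
```

NumPy itself copes with the unaligned array (the sum is the same in both cases); the open question in this thread is what compiled code does when it is handed such a pointer through a contiguous memoryview.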

I wonder if this is something that Cython should error/warn about in its memoryview initialization? I'll create an issue there to suggest it (although that doesn't help you fix it here of course...)

lorentzenchr · 2021-11-14T16:28:47Z

@da-woods To be sure, I commented out the complete test_readonly_array_wrapper in 6ea9106. Let's see if there is still a segfault.

lorentzenchr · 2021-11-14T19:23:57Z

@da-woods 6ea9106 confirms that test_contig_mmapped throws a segfault. So it's not caused by ReadonlyArrayWrapper 😅

It's really nice working with you. I hope that next time it's about some uplifting topic and not segfault hunting :smile:

Some details on CI system config and compiler options

Starting: Test Library
==============================================================================
Task         : Command line
Description  : Run a command line script using Bash on Linux and macOS and cmd.exe on Windows
Version      : 2.182.0
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/command-line
==============================================================================
Generating script.
Script contents:
exec build_tools/azure/test_script.sh
========================== Starting Command Output ===========================
/bin/bash --noprofile --norc /home/vsts/work/_temp/dd2f19b3-8708-47ff-bb67-5d093e3c559d.sh

System:
    python: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)  [GCC 9.4.0]
executable: /usr/share/miniconda/envs/testvenv/bin/python
   machine: Linux-5.4.0-1062-azure-x86_64-with-debian-buster-sid

Python dependencies:
          pip: 21.3.1
   setuptools: 58.5.3
      sklearn: 1.1.dev0
        numpy: 1.21.4
        scipy: 1.7.2
       Cython: 0.29.24
       pandas: 1.3.4
   matplotlib: 3.4.3
       joblib: 1.1.0
threadpoolctl: 3.0.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
         prefix: libomp
       filepath: /usr/share/miniconda/envs/testvenv/lib/libomp.so
        version: None
    num_threads: 2

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /usr/share/miniconda/envs/testvenv/lib/libopenblasp-r0.3.18.so
        version: 0.3.18
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 2

# packages in environment at /usr/share/miniconda/envs/testvenv:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
numpy                     1.21.4           py37h31617e3_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openblas                  0.3.18          pthreads_h4748800_0    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   1.1.1l               h7f98852_0    conda-forge
packaging                 21.2                     pypi_0    pypi
pandas                    1.3.4            py37he8f5f7f_1    conda-forge
pcre                      8.45                 h9c3ff4c_0    conda-forge
pillow                    8.4.0            py37h0f21c89_0    conda-forge
pip                       21.3.1             pyhd8ed1ab_0    conda-forge
pluggy                    1.0.0                    pypi_0    pypi
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
py                        1.11.0                   pypi_0    pypi
pyamg                     4.1.0            py37h796e4cb_1    conda-forge
pyparsing                 2.4.7                    pypi_0    pypi
pyqt                      5.12.3           py37h89c1867_8    conda-forge
pyqt-impl                 5.12.3           py37hac37412_8    conda-forge
pyqt5-sip                 4.19.18          py37hcd2ae1e_8    conda-forge
pyqtchart                 5.12             py37he336c9b_8    conda-forge
pyqtwebengine             5.12.1           py37he336c9b_8    conda-forge
pytest                    6.2.5                    pypi_0    pypi
pytest-forked             1.3.0                    pypi_0    pypi
pytest-xdist              2.4.0                    pypi_0    pypi
python                    3.7.12          hb7a2778_100_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.7                     2_cp37m    conda-forge
pytz                      2021.3             pyhd8ed1ab_0    conda-forge
qt                        5.12.9               hda022c4_4    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
scikit-learn              1.1.dev0                  dev_0    <develop>
scipy                     1.7.2            py37hf2a6cf1_0    conda-forge
setuptools                58.5.3           py37h89c1867_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.36.0               h9cd32fc_2    conda-forge
threadpoolctl             3.0.0                    pypi_0    pypi
tk                        8.6.11               h27826a3_1    conda-forge
toml                      0.10.2                   pypi_0    pypi
tornado                   6.1              py37h5e8e339_2    conda-forge
typing-extensions         3.10.0.2                 pypi_0    pypi
wheel                     0.37.0             pyhd8ed1ab_1    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zipp                      3.6.0                    pypi_0    pypi
zlib                      1.2.11            h36c2ea0_1013    conda-forge
zstd                      1.5.0                ha95c52a_0    conda-forge

building 'sklearn.utils._readonly_array_wrapper' extension
compiling C sources
C compiler: gcc -pthread -B /usr/share/miniconda/envs/testvenv/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC

compile options: '-I/usr/share/miniconda/envs/testvenv/lib/python3.7/site-packages/numpy/core/include -Ibuild/src.linux-x86_64-3.7/numpy/distutils/include -I/usr/share/miniconda/envs/testvenv/include/python3.7m -I/usr/share/miniconda/envs/testvenv/include/python3.7m -c'
extra options: '-fopenmp -msse -msse2 -msse3'
In file included from /usr/share/miniconda/envs/testvenv/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1969:0,
                 from /usr/share/miniconda/envs/testvenv/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /usr/share/miniconda/envs/testvenv/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from sklearn/utils/_weight_vector.c:645:
/usr/share/miniconda/envs/testvenv/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
 #warning "Using deprecated NumPy API, disable it with " \
  ^~~~~~~
gcc -pthread -shared -B /usr/share/miniconda/envs/testvenv/compiler_compat -L/usr/share/miniconda/envs/testvenv/lib -Wl,-rpath=/usr/share/miniconda/envs/testvenv/lib -Wl,--no-as-needed -Wl,--sysroot=/ -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/sklearn/utils/_weight_vector.o -Lbuild/temp.linux-x86_64-3.7 -lm -o sklearn/utils/_weight_vector.cpython-37m-x86_64-linux-gnu.so -fopenmp
gcc: sklearn/utils/_readonly_array_wrapper.c

jjerphan · 2021-11-15T07:33:38Z

Your help is very valuable, @da-woods: thank you.

I'm very sorry to be the one who somehow introduced this bug in scikit-learn.

I think it's fine, @lorentzenchr: this is really a problem in a single configuration, and I also approved this change.

I do not have time to help with this PR because I have more urgent tasks to do, but I will follow it in the meantime and review it then.

ogrisel · 2021-11-15T16:30:25Z

With boundscheck=True, test_readonly_array_wrapper passes successfully, but a lot of kmeans tests fail for various reasons.

Would be nice to open a dedicated issue :)

jeremiedbb · 2021-11-15T16:31:33Z

Would be nice to open a dedicated issue :)

there's a dedicated PR :) #21662

ogrisel

+1 for merging this PR. The new test_memmap_on_contiguous_data test makes it easier to try to reproduce a minimal version of the problem found in #20567 just by setting aligned=False.

sklearn/utils/_readonly_array_wrapper.pyx

ogrisel · 2021-11-15T16:39:36Z

@da-woods any idea if memory aligned data is a general requirement for doing for loops on C-contiguous arrays? It might be related to SIMD vectorization, which does indeed require aligned memory, but I thought a C compiler would be clever and generate loopy code that would start the loop with non-aligned arithmetic operations with scalar instructions and then proceed with vector instructions on memory aligned array chunks.

da-woods · 2021-11-15T17:26:45Z

@da-woods any idea if memory aligned data is a general requirement for doing for loops on C-contiguous arrays? It might be related to SIMD vectorization, which does indeed require aligned memory, but I thought a C compiler would be clever and generate loopy code that would start the loop with non-aligned arithmetic operations with scalar instructions and then proceed with vector instructions once safe.

Not that I know of. I had a look at the C code it generates and I don't see why it'd end up with different alignment requirements than the non-contiguous case (it's obviously a bit easier to optimize since it knows the strides, but it's simple pointer arithmetic in both cases)

I failed to reproduce it on my PC (including when using all the sse flags that looked to be set on the pipeline).

It might be interesting to get the compiled module from the tests that crash to try it locally with a debugger and/or to look at the disassembly. But I don't know if that's particularly easy.

ogrisel · 2021-11-15T18:15:05Z

I launched a wheel build in #21677:

to see if the compiler which we use to generate the wheel is impacted by this problem;
maybe to get the resulting wheel on a platform that fails (I think they are archived as build artifacts, but maybe not in case of failure...)

ogrisel · 2021-11-16T10:11:53Z

I am not sure whether working with unaligned memory mapped arrays from Cython code should always work, but given that this also causes hard-to-debug problems in pytorch (pytorch/pytorch#35151), it may be best to re-prioritize a fix for joblib/joblib#563 upstream in joblib, by padding the generated pickle files to make sure that the loaded arrays are always aligned (based on their dtypes).
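The padding idea comes down to simple arithmetic on the byte offset at which the array data would start. The sketch below is only an illustration of that arithmetic under the assumption that the required alignment is the dtype's own alignment; it is not joblib's implementation, and the names `padding_for_alignment` and `aligned_data_offset` are hypothetical.

```python
import numpy as np


def padding_for_alignment(offset, alignment):
    """Number of pad bytes so that offset + pad is a multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return -offset % alignment


def aligned_data_offset(header_len, dtype):
    """Offset at which array data could start after a header of header_len bytes."""
    alignment = np.dtype(dtype).alignment
    return header_len + padding_for_alignment(header_len, alignment)


# A 13-byte header followed by data that needs 8-byte alignment
# requires 3 bytes of padding, so the data starts at byte 16.
print(padding_for_alignment(13, 8))
print(aligned_data_offset(13, np.float64))
```

Since a memory mapping starts on a page boundary, placing the array data at a file offset that is a multiple of the dtype's alignment is enough for the mapped array to be aligned as well.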

jjerphan

Thank you for digging into this issue, @lorentzenchr

Should a TODO note be added as a reminder to remove those fixtures once joblib/joblib#563 is fixed?

This review proposes to add such notes and comes with a remark regarding buffer flags.

jjerphan · 2021-11-17T10:44:33Z

sklearn/utils/tests/test_readonly_wrapper.py

+def _create_memmap_backed_data(data):
+    return create_memmap_backed_data(
+        data, mmap_mode="r", return_folder=False, aligned=True
+    )
+
+
+@pytest.mark.parametrize("readonly", [_readonly_array_copy, _create_memmap_backed_data])


Nitpick. Note that this change might not be black-compliant.

Suggested change

-def _create_memmap_backed_data(data):
-    return create_memmap_backed_data(
-        data, mmap_mode="r", return_folder=False, aligned=True
-    )
-
-
-@pytest.mark.parametrize("readonly", [_readonly_array_copy, _create_memmap_backed_data])
+# TODO: remove this fixture and its associated parametrization
+# once https://github.com/joblib/joblib/issues/563 is fixed.
+def _create_aligned_memmap_backed_data(data):
+    return create_memmap_backed_data(
+        data, mmap_mode="r", return_folder=False, aligned=True
+    )
+
+
+@pytest.mark.parametrize("readonly", [_readonly_array_copy, _create_aligned_memmap_backed_data])

jjerphan · 2021-11-17T10:56:00Z

sklearn/utils/tests/test_testing.py

+
+@pytest.mark.parametrize("dtype", [np.float32, np.float64, np.int32, np.int64])
+def test_memmap_on_contiguous_data(dtype):
+    """Test memory mapped array on contigous memoryview."""
+    x = np.arange(10).astype(dtype)
+    assert x.flags["C_CONTIGUOUS"]
+    assert x.flags["ALIGNED"]
+
+    # _test_sum consumes contiguous arrays
+    # def _test_sum(NUM_TYPES[::1] x):
+    sum_origin = _test_sum(x)
+
+    # now on memory mapped data
+    # aligned=True so avoid https://github.com/joblib/joblib/issues/563
+    # without alignment, this can produce segmentation faults, see
+    # https://github.com/scikit-learn/scikit-learn/pull/21654
+    x_mmap = create_memmap_backed_data(x, mmap_mode="r+", aligned=True)
+    sum_mmap = _test_sum(x_mmap)
+    assert sum_mmap == pytest.approx(sum_origin, rel=1e-11)


Suggested change

+# TODO: remove once https://github.com/joblib/joblib/issues/563 is fixed.
 @pytest.mark.parametrize("dtype", [np.float32, np.float64, np.int32, np.int64])
 def test_memmap_on_contiguous_data(dtype):
     """Test memory mapped array on contigous memoryview."""
     x = np.arange(10).astype(dtype)
     assert x.flags["C_CONTIGUOUS"]
     assert x.flags["ALIGNED"]

     # _test_sum consumes contiguous arrays
     # def _test_sum(NUM_TYPES[::1] x):
     sum_origin = _test_sum(x)

     # now on memory mapped data
     # aligned=True so avoid https://github.com/joblib/joblib/issues/563
     # without alignment, this can produce segmentation faults, see
     # https://github.com/scikit-learn/scikit-learn/pull/21654
     x_mmap = create_memmap_backed_data(x, mmap_mode="r+", aligned=True)
     sum_mmap = _test_sum(x_mmap)
     assert sum_mmap == pytest.approx(sum_origin, rel=1e-11)

jjerphan · 2021-11-17T10:58:02Z

sklearn/utils/_testing.py

+    if aligned:
+        if isinstance(data, np.ndarray) and data.flags.aligned:
+            # https://numpy.org/doc/stable/reference/generated/numpy.memmap.html
+            filename = op.join(temp_folder, "data.dat")
+            fp = np.memmap(filename, dtype=data.dtype, mode="w+", shape=data.shape)
+            fp[:] = data[:]  # write data to memmap array
+            fp.flush()
+            memmap_backed_data = np.memmap(
+                filename, dtype=data.dtype, mode=mmap_mode, shape=data.shape
+            )
+        else:
+            raise ValueError("If aligned=True, input must be a single numpy array.")
+    else:
+        filename = op.join(temp_folder, "data.pkl")


Suggested change

+    # TODO: remove the aligned case once https://github.com/joblib/joblib/issues/563
+    # is fixed.
     if aligned:
         if isinstance(data, np.ndarray) and data.flags.aligned:
             # https://numpy.org/doc/stable/reference/generated/numpy.memmap.html
             filename = op.join(temp_folder, "data.dat")
             fp = np.memmap(filename, dtype=data.dtype, mode="w+", shape=data.shape)
             fp[:] = data[:]  # write data to memmap array
             fp.flush()
             memmap_backed_data = np.memmap(
                 filename, dtype=data.dtype, mode=mmap_mode, shape=data.shape
             )
         else:
             raise ValueError("If aligned=True, input must be a single numpy array.")
     else:
         filename = op.join(temp_folder, "data.pkl")

jjerphan · 2021-11-17T10:58:39Z

sklearn/utils/_testing.py

+    aligned : bool, default=False
+        If True, if input is a single numpy array and if the input array is aligned,
+        the memory mapped array will also be aligned. This is a workaround for
+        https://github.com/joblib/joblib/issues/563.


Suggested change

     aligned : bool, default=False
         If True, if input is a single numpy array and if the input array is aligned,
         the memory mapped array will also be aligned. This is a workaround for
-        https://github.com/joblib/joblib/issues/563.
+        https://github.com/joblib/joblib/issues/563 .

jjerphan · 2021-11-17T11:00:37Z

sklearn/utils/tests/test_testing.py

@@ -680,30 +681,59 @@ def test_tempmemmap(monkeypatch):
    assert registration_counter.nb_calls == 2


-def test_create_memmap_backed_data(monkeypatch):
+@pytest.mark.parametrize("aligned", [False, True])


Suggested change

+# TODO: remove this test parametrization on aligned once
+# https://github.com/joblib/joblib/issues/563 is fixed.
 @pytest.mark.parametrize("aligned", [False, True])

sklearn/utils/_testing.py

ogrisel · 2021-11-18T13:48:18Z

@lorentzenchr what do you think of @jjerphan's suggestions above? I would like to merge this PR asap because it would be useful to have a low-maintenance fix for some randomly segfaulting tests on NuSVC on our CI (#21702 (review) about fixing #21361).

jjerphan

To move forward, I propose setting aside my unimportant personal nitpicks. LGTM!

ogrisel · 2021-11-18T13:58:02Z

I propose leaving my personal unimportant nitpick. LGTM!

I think it would be worth adding back the suggested references to joblib/joblib#563 as part of #21702.

lorentzenchr · 2021-11-19T08:50:06Z

Shall I open a dedicated PR for those comments? Sorry that I did not have the time to respond sooner.

bitsnaps · 2023-05-02T19:17:32Z

I'm getting this issue in a fresh conda environment. The default Python version is 3.11, and I also tried a new env with Python 3.7; both cause the issue. The guilty line is KMeans.fit(X) in a simple KMeans example that works in the previous version.
OS: MacOS 10.14.6
conda: 23.3.1
scikit-learn: 1.2.1
The kernel keeps dying in Jupyter, and execution in a terminal ends with the error Segmentation fault: 11. @lorentzenchr any idea?

glemaitre · 2023-05-02T19:19:03Z

@bitsnaps Could you update to the latest release?

bitsnaps · 2023-05-02T19:27:49Z

@bitsnaps Could you update to the latest release?

I'm not sure if I can do that without breaking anything! When running conda update --all it says there is nothing to update. If I force an upgrade to the latest version with pip, I would probably break something else (like I did before). Should I try a new environment?

glemaitre · 2023-05-02T19:31:36Z

The latest release on PyPI, conda-forge and the defaults channel of Anaconda is 1.2.2, and we might have solved the issue there. You can try a new minimal environment to check whether this version actually solves the problem that you pointed out.

Otherwise, you can open a new issue with the example and information regarding your environment.

bitsnaps · 2023-05-03T19:24:10Z

There is something weird here: I created a new env with exactly the same packages/versions, and everything works fine, even with the old scikit-learn version (v1.0.2). The issue seems to be related to my environment. Thank you @glemaitre!

TST use contiguous memoryview

6a12101

github-actions bot added module:utils cython labels Nov 13, 2021

lorentzenchr changed the title ~~TST use contiguous memoryview~~ TST ReadonlyArrayWrapper with mmap contiguous memoryview Nov 13, 2021

DEBUG set compiler_directives boundscheck to False

d9f351e

lorentzenchr added the Bug label Nov 13, 2021

FIX __releasebuffer__ in ReadonlyArrayWrapper

e70770f

lorentzenchr mentioned this pull request Nov 14, 2021

FIX out of bounds error on X_indptr in kmeans #21662

Merged

Revert "DEBUG set compiler_directives boundscheck to False"

bb31851

This reverts commit d9f351e.

lorentzenchr changed the title ~~TST ReadonlyArrayWrapper with mmap contiguous memoryview~~ [WIP] TST ReadonlyArrayWrapper with mmap contiguous memoryview Nov 14, 2021

lorentzenchr added 2 commits November 14, 2021 16:38

Revert "FIX __releasebuffer__ in ReadonlyArrayWrapper"

4be7202

This reverts commit e70770f.

TST add test_contig_mmapped

69b9bf2

lorentzenchr force-pushed the debug_mmap_segfault branch from 60960ca to c2bbd1d Compare November 14, 2021 16:26

da-woods mentioned this pull request Nov 14, 2021

[ENH] Have memoryviews check alignment cython/cython#4467

Open

DEBUG skip test_readonly_array_wrapper

6ea9106

lorentzenchr force-pushed the debug_mmap_segfault branch from c2bbd1d to 6ea9106 Compare November 14, 2021 17:16

lorentzenchr changed the title ~~[WIP] TST ReadonlyArrayWrapper with mmap contiguous memoryview~~ [WIP] DEBUG segmentation fault on memory mapped contiguous memoryview Nov 14, 2021

This was referenced Nov 14, 2021

[MRG] Common Private Loss Module with tempita #20567

Merged

CI with boundscheck=False #21668

Closed

lorentzenchr added the segfault label Nov 14, 2021

jjerphan added the No Changelog Needed label Nov 15, 2021

ogrisel approved these changes Nov 15, 2021

sklearn/utils/_readonly_array_wrapper.pyx

DOC fix docstring of _test_sum

6e756b4

ogrisel mentioned this pull request Nov 15, 2021

DEBUG try to reproduce Cython segfault on non-aligned data #21677

Closed

jjerphan reviewed Nov 17, 2021

jjerphan approved these changes Nov 18, 2021

jjerphan merged commit 3f717cc into scikit-learn:main Nov 18, 2021

lorentzenchr deleted the debug_mmap_segfault branch November 19, 2021 08:50

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 22, 2021

[MRG] FIX segmentation fault on memory mapped contiguous memoryview (s…

46d5acd

…cikit-learn#21654)

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 29, 2021

[MRG] FIX segmentation fault on memory mapped contiguous memoryview (s…

dc22a39

…cikit-learn#21654)

samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021

[MRG] FIX segmentation fault on memory mapped contiguous memoryview (s…

ac08308

…cikit-learn#21654)

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021

[MRG] FIX segmentation fault on memory mapped contiguous memoryview (s…

3582872

…cikit-learn#21654)

glemaitre pushed a commit that referenced this pull request Dec 25, 2021

[MRG] FIX segmentation fault on memory mapped contiguous memoryview (#…

47b2e33

…21654)

lesteve added a commit to lesteve/scikit-learn that referenced this pull request Feb 25, 2022

Revert "[MRG] FIX segmentation fault on memory mapped contiguous memo…

7d41541

…ryview (scikit-learn#21654)" This reverts commit 3f717cc.

lesteve mentioned this pull request Feb 25, 2022

Check joblib memmap alignment fix #22607

Closed

lesteve added a commit to lesteve/scikit-learn that referenced this pull request Feb 25, 2022

[cd build] Revert "[MRG] FIX segmentation fault on memory mapped cont…

f40cec3

…iguous memoryview (scikit-learn#21654)" This reverts commit 3f717cc.

thomasjpfan mentioned this pull request Apr 18, 2022

Pytest gives segmentation fault on fresh clone #17582

Closed
