BaggingClassifier throws ValueError: WRITEBACKIFCOPY base is read-only #25935

Closed
Jakobhenningjensen opened this issue Mar 22, 2023 · 7 comments

Jakobhenningjensen commented Mar 22, 2023

Describe the bug

When I use BaggingClassifier with LinearSVC as the base estimator, fit raises ValueError: WRITEBACKIFCOPY base is read-only whenever n_jobs != 1.

Changing n_jobs to 1 removes the error.

Steps/Code to Reproduce

I cannot reproduce this error with a public dataset such as iris, and I can't share the actual dataset because it is company-confidential.

My code is as follows:

  from sklearn.ensemble import BaggingClassifier
  from sklearn.svm import LinearSVC

  # X_train (a sparse matrix) and y_train are loaded elsewhere.
  max_samples_pr_model = 50_000
  n_models = X_train.shape[0] // max_samples_pr_model  # 36
  dual = max_samples_pr_model <= X_train.shape[1]  # True

  model_instance = LinearSVC(
      max_iter=5_000,
      dual=dual,
      C=1.0,
      class_weight="balanced",
  )

  model = BaggingClassifier(
      random_state=42,
      n_jobs=-1,  # Setting this to 1 seems to remove the error
      n_estimators=n_models,
      max_samples=max_samples_pr_model,
      max_features=1.0,
      estimator=model_instance,
  )

  model.fit(X_train, y_train)

Expected Results

No error is thrown and the fit completes.

Actual Results

ValueError: WRITEBACKIFCOPY base is read-only is raised with the following stack trace:

joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\externals\loky\process_executor.py", line 428, in _process_worker
    r = call_item()
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\externals\loky\process_executor.py", line 275, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\_parallel_backends.py", line 620, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\utils\parallel.py", line 123, in __call__
    return self.function(*args, **kwargs)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\ensemble\_bagging.py", line 141, in _parallel_build_estimators
    estimator_fit(X_, y, sample_weight=curr_sample_weight)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\svm\_classes.py", line 263, in fit
    X, y = self._validate_data(
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\base.py", line 565, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\utils\validation.py", line 1106, in check_X_y
    X = check_array(
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\utils\validation.py", line 845, in check_array
    array = _ensure_sparse_format(
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\utils\validation.py", line 549, in _ensure_sparse_format
    spmatrix = spmatrix.astype(dtype)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\scipy\sparse\_data.py", line 72, in astype
    self._deduped_data().astype(dtype, casting=casting, copy=copy),
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\scipy\sparse\_data.py", line 32, in _deduped_data
    self.sum_duplicates()
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\scipy\sparse\_compressed.py", line 1118, in sum_duplicates
    self.sort_indices()
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\scipy\sparse\_compressed.py", line 1164, in sort_indices
    _sparsetools.csr_sort_indices(len(self.indptr) - 1, self.indptr,
ValueError: WRITEBACKIFCOPY base is read-only
"""


The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\IPython\core\interactiveshell.py", line 3460, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-14-e5fe18752d98>", line 17, in <module>
    model.fit(X_train, y_train)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\ensemble\_bagging.py", line 337, in fit
    return self._fit(X, y, self.max_samples, sample_weight=sample_weight)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\ensemble\_bagging.py", line 472, in _fit
    all_results = Parallel(
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\sklearn\utils\parallel.py", line 63, in __call__
    return super().__call__(iterable_with_config)
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\parallel.py", line 1098, in __call__
    self.retrieve()
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\parallel.py", line 975, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\lib\site-packages\joblib\_parallel_backends.py", line 567, in wrap_future_result
    return future.result(timeout=timeout)
  File "C:\Users\my_user\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 445, in result
    return self.__get_result()
  File "C:\Users\my_user\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 390, in __get_result
    raise self._exception
ValueError: WRITEBACKIFCOPY base is read-only

Versions

System:
    python: 3.10.0 (tags/v3.10.0:b494f59, Oct  4 2021, 19:00:18) [MSC v.1929 64 bit (AMD64)]
executable: C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\Scripts\python.exe
   machine: Windows-10-10.0.22621-SP0
Python dependencies:
      sklearn: 1.2.1
          pip: 22.3.1
   setuptools: 67.5.1
        numpy: 1.24.2
        scipy: 1.10.1
       Cython: 0.29.33
       pandas: 1.5.3
   matplotlib: 3.7.1
       joblib: 1.2.0
threadpoolctl: 3.1.0
Built with OpenMP: True
threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\Lib\site-packages\numpy\.libs\libopenblas64__v0.3.21-gcc_10_3_0.dll
        version: 0.3.21
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 16
       user_api: openmp
   internal_api: openmp
         prefix: vcomp
       filepath: C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\Lib\site-packages\sklearn\.libs\vcomp140.dll
        version: None
    num_threads: 16
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: C:\Users\my_user\AppData\Local\pypoetry\Cache\virtualenvs\transtotag-6uueIOAW-py3.10\Lib\site-packages\scipy.libs\libopenblas-802f9ed1179cb9c9b03d67ff79f48187.dll
        version: 0.3.18
threading_layer: pthreads
   architecture: Prescott
    num_threads: 16
Jakobhenningjensen added the Bug and Needs Triage labels Mar 22, 2023
glemaitre (Member) commented Mar 23, 2023

I see that you are using scikit-learn 1.2.1. Could you update to scikit-learn 1.2.2? We solved a couple of issues linked to read-only memory views.

The fact that disabling parallelism via n_jobs=1 removes the error tells me that it could be linked to joblib creating a read-only memory view of the data before dispatching the jobs.

Could you also provide more information regarding X: is it a sparse matrix, and what are its typical dimensions (n_samples, n_features)?
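
To illustrate the suspected mechanism, the sketch below uses plain NumPy rather than joblib. It marks an index array read-only, as a memory map opened in read-only mode would be, and shows that an in-place sort is rejected while a sorted copy is fine. The array values are made up for illustration; this is an analogy for what the traceback suggests happens inside sort_indices, not a reproduction of the reported error.

```python
import numpy as np

# Hypothetical column indices of one CSR row, deliberately unsorted.
indices = np.array([2, 0, 1])

# Simulate a buffer that a worker process received as read-only.
indices.setflags(write=False)

try:
    indices.sort()  # in-place sort needs write access to the buffer
    in_place_sort_failed = False
except ValueError:
    in_place_sort_failed = True

# A sorted copy works, because nothing is written back to the original buffer.
sorted_copy = np.sort(indices)
```

If this is indeed what happens, any code path that sorts or deduplicates a sparse matrix in place would need a writable buffer in the worker, which would explain why the error disappears when no worker processes are involved.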

glemaitre (Member) commented Mar 23, 2023

Also, this seems to be a duplicate of #6614, scipy/scipy#8678, and #15924.

glemaitre (Member) commented

I opened scipy/scipy#18192 to try to solve the issue.

Jakobhenningjensen (Author) commented Mar 23, 2023

Yes, even with 1.2.2 the error still occurs.

X is a SciPy sparse CSR matrix with dtype np.int64 and shape (825127, 104231).
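
One possible workaround, assuming the failure comes from the dtype cast and in-place index sort shown in the traceback, is to cast and canonicalize the matrix once in the parent process so that the workers have less reason to modify it. The helper name canonicalize_csr and the toy matrix below are hypothetical, and this has not been verified against the dataset from this report.

```python
import numpy as np
import scipy.sparse as sp


def canonicalize_csr(X, dtype=np.float64):
    """Return a CSR copy with sorted indices, no duplicates, and the given dtype."""
    X = sp.csr_matrix(X, copy=True)  # work on a copy, leave the input untouched
    X.sum_duplicates()               # sorts indices and merges duplicate entries
    return X.astype(dtype)           # cast up front so no cast is needed later


# Toy int64 CSR matrix with deliberately unsorted column indices in row 0.
data = np.array([1, 2, 3], dtype=np.int64)
indices = np.array([2, 0, 1])
indptr = np.array([0, 2, 3])
X_raw = sp.csr_matrix((data, indices, indptr), shape=(2, 3))

X_clean = canonicalize_csr(X_raw)
```

The cleaned matrix would then be passed to fit in place of the original, for example model.fit(canonicalize_csr(X_train), y_train). Casting to float64 roughly keeps the data array the same size for int64 input, but for a matrix of this shape the extra copy still costs memory.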

@NeuronXCaliber

This comment was marked as off-topic.

glemaitre (Member) commented Mar 29, 2023

@NeuronXCaliber Your post does not help solve the bug. I am hiding it.

glemaitre (Member) commented

I am closing this issue: scipy/scipy#18192 has been merged, so installing the next SciPy release should solve it.
