ValueError: buffer source array is read-only in check_estimator #28026

Closed
@jilljenn

Describe the bug

I am trying to write a scikit-learn estimator, FMClassifier, based on pywFM, the Python wrapper for the C++ library libFM (yes 😅). Running check_estimator on it fails with the following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jj/code/fare/scikit-learn/sklearn/utils/estimator_checks.py", line 627, in check_estimator
    check(estimator)
  File "/home/jj/code/fare/scikit-learn/sklearn/utils/_testing.py", line 318, in wrapper
    return fn(*args, **kwargs)
  File "/home/jj/code/fare/scikit-learn/sklearn/utils/estimator_checks.py", line 2603, in check_estimators_fit_returns_self
    assert estimator.fit(X, y) is estimator
  File "/home/jj/code/ktm/fm.py", line 40, in fit
    model = fm.run(X, y, X, y)
  File "/home/jj/.local/lib/python3.10/site-packages/pywFM/__init__.py", line 149, in run
    dump_svmlight_file(x_train, y_train, train_path)
  File "/home/jj/code/fare/scikit-learn/sklearn/datasets/_svmlight_format_io.py", line 513, in dump_svmlight_file
    _dump_svmlight(X, y, f, multilabel, one_based, comment, query_id)
  File "/home/jj/code/fare/scikit-learn/sklearn/datasets/_svmlight_format_io.py", line 386, in _dump_svmlight
    _dump_svmlight_file(
  File "sklearn/datasets/_svmlight_format_fast.pyx", line 222, in sklearn.datasets._svmlight_format_fast._dump_svmlight_file
  File "sklearn/datasets/_svmlight_format_fast.pyx", line 133, in sklearn.datasets._svmlight_format_fast.get_dense_row_string
  File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only

Possibly related issues:

Steps/Code to Reproduce

import numpy as np
from sklearn.base import BaseEstimator
from sklearn.datasets import dump_svmlight_file
from sklearn.utils.estimator_checks import check_estimator


# Minimal stand-in for the pywFM-based estimator: fit() only dumps X and y.
class FMClassifier(BaseEstimator):
    def __init__(self):
        super().__init__()
    def fit(self, X, y):
        with open('tmp.txt', 'wb') as f:
            dump_svmlight_file(X, y, f)
        return self
    def predict_proba(self, X):
        return np.zeros(len(X))


check_estimator(FMClassifier())

Expected Results

Well, I should get to the next error, shouldn't I? If writing into the input memory is not allowed (which makes sense), could that be stated somewhere in the documentation?

Actual Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jj/code/fare/scikit-learn/sklearn/utils/estimator_checks.py", line 627, in check_estimator
    check(estimator)
  File "/home/jj/code/fare/scikit-learn/sklearn/utils/_testing.py", line 318, in wrapper
    return fn(*args, **kwargs)
  File "/home/jj/code/fare/scikit-learn/sklearn/utils/estimator_checks.py", line 2603, in check_estimators_fit_returns_self
    assert estimator.fit(X, y) is estimator
  File "<stdin>", line 6, in fit
  File "/home/jj/code/fare/scikit-learn/sklearn/datasets/_svmlight_format_io.py", line 510, in dump_svmlight_file
    _dump_svmlight(X, y, f, multilabel, one_based, comment, query_id)
  File "/home/jj/code/fare/scikit-learn/sklearn/datasets/_svmlight_format_io.py", line 386, in _dump_svmlight
    _dump_svmlight_file(
  File "sklearn/datasets/_svmlight_format_fast.pyx", line 222, in sklearn.datasets._svmlight_format_fast._dump_svmlight_file
  File "sklearn/datasets/_svmlight_format_fast.pyx", line 133, in sklearn.datasets._svmlight_format_fast.get_dense_row_string
  File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only
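
The traceback suggests that the Cython code behind dump_svmlight_file asks for a writable buffer, while check_estimators_fit_returns_self appears to pass read-only input. Assuming that is the cause, the sketch below simulates read-only input with plain NumPy and shows one possible workaround on the estimator side. The helper as_writable is hypothetical (it is not part of scikit-learn or pywFM), and it only covers dense arrays.

```python
import numpy as np


def as_writable(arr):
    """Hypothetical helper: return `arr` if it is writable, else a writable copy."""
    arr = np.asarray(arr)
    if arr.flags.writeable:
        return arr
    return arr.copy()  # a fresh copy owns its memory and is writable


# Simulate read-only input (an assumption about what the failing check passes).
X = np.arange(6, dtype=np.float64).reshape(3, 2)
X.setflags(write=False)

# Writing into the read-only array is rejected by NumPy.
try:
    X[0, 0] = 1.0
    write_failed = False
except ValueError:
    write_failed = True

# The copy can be modified, and the original stays untouched.
X_safe = as_writable(X)
X_safe[0, 0] = 1.0
```

In the reproducer above, fit could then call dump_svmlight_file(as_writable(X), as_writable(y), f). This costs an extra copy when the input is read-only, so it is a stopgap rather than a fix for the underlying behaviour.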

Versions

System:
    python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
executable: /usr/bin/python
   machine: Linux-6.2.0-39-generic-x86_64-with-glibc2.35

Python dependencies:
      sklearn: 1.2.dev0
          pip: 22.0.2
   setuptools: 59.6.0
        numpy: 1.23.5
        scipy: 1.9.3
       Cython: 0.29.28
       pandas: 1.3.3
   matplotlib: 3.6.0
       joblib: 1.3.2
threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/jj/.local/lib/python3.10/site-packages/numpy.libs/libopenblas64_p-r0-742d56dc.3.20.so
        version: 0.3.20
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 8

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/jj/.local/lib/python3.10/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so
        version: 0.3.18
threading_layer: pthreads
   architecture: SkylakeX
    num_threads: 8

       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
        version: None
    num_threads: 8
