HistGradientBoosting pickle portability between 64bit and 32bit arch #27952

Closed
@stuartlynn

Description

Describe the bug

HistGradientBoosting models use np.intp to represent feature_idx in the TreePredictor nodes:

PREDICTOR_RECORD_DTYPE = np.dtype([
    ('value', Y_DTYPE),
    ('count', np.uint32),
    ('feature_idx', np.intp),
    ('num_threshold', X_DTYPE),
    ('missing_go_to_left', np.uint8),
    ('left', np.uint32),
    ('right', np.uint32),
    ('gain', Y_DTYPE),
    ('depth', np.uint32),
    ('is_leaf', np.uint8),
    ('bin_threshold', X_BINNED_DTYPE),
    ('is_categorical', np.uint8),
    # The index of the corresponding bitsets in the Predictor's bitset arrays.
    # Only used if is_categorical is True
    ('bitset_idx', np.uint32)
])

This seems to cause problems when a pickled HistGradientBoosting model that was trained in a 64-bit environment is loaded in a 32-bit environment (such as Pyodide, which is where I encountered this issue).

I know that for a while the other tree models in sklearn had a similar problem, but I am not 100% sure what the solution was.

Would changing the type to np.uint32 be an acceptable solution here?
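
To illustrate why the record layout differs between platforms, here is a minimal sketch that uses only NumPy. The two-field dtypes `record_64` and `record_32` are hypothetical, reduced stand-ins for PREDICTOR_RECORD_DTYPE, not scikit-learn's actual definition.

```python
import numpy as np

# np.intp is the pointer-sized signed integer: 8 bytes on a 64-bit build,
# 4 bytes on a 32-bit build such as Pyodide (wasm32).
print("np.intp itemsize on this platform:", np.dtype(np.intp).itemsize)

# Hypothetical, reduced stand-ins for the node record as it would be
# materialised on each kind of platform.
record_64 = np.dtype([("feature_idx", np.int64), ("left", np.uint32)])
record_32 = np.dtype([("feature_idx", np.int32), ("left", np.uint32)])

# A pickle stores the concrete dtype, so an array saved as record_64
# keeps 8-byte feature_idx values even when it is loaded on a 32-bit build.
nodes = np.zeros(3, dtype=record_64)
nodes["feature_idx"] = [0, 5, 2]

dtypes_match = nodes.dtype == record_32  # False: the field widths differ
converted = nodes.astype(record_32)      # field-by-field cast keeps the values
print(dtypes_match, converted["feature_idx"])
```

The values survive the cast as long as every feature index fits in the narrower integer type, which is the case for any realistic number of features.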

Steps/Code to Reproduce

Steps to reproduce

  1. Train a model in Python on a 64-bit system
  2. Pickle the fitted model
  3. Load that pickle in a 32-bit Python environment like Pyodide
  4. Attempt to run a prediction with the loaded model

See this repo for a full example: https://github.com/stuartlynn/hist_gradient_boost_bug

Expected Results

The Pyodide code should run and give the expected output.

Actual Results

Error message

Running the above steps gives the following error message when executing the Pyodide code:

PythonError: Traceback (most recent call last):
  File "/lib/python311.zip/_pyodide/_base.py", line 571, in eval_code_async
    await CodeRunner(
  File "/lib/python311.zip/_pyodide/_base.py", line 394, in run_async
    coroutine = eval(self.code, globals, locals)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<exec>", line 61, in <module>
  File "/lib/python3.11/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", l
    return self._loss.link.inverse(self._raw_predict(X).ravel())
                                   ^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", l
    self._predict_iterations(
  File "/lib/python3.11/site-packages/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py", l
    raw_predictions[:, k] += predict(X)
                             ^^^^^^^^^^
  File "/lib/python3.11/site-packages/sklearn/ensemble/_hist_gradient_boosting/predictor.py", line 71,
    _predict_from_raw_data(
  File "sklearn/ensemble/_hist_gradient_boosting/_predictor.pyx", line 18, in sklearn.ensemble._hist_gr
ValueError: Buffer dtype mismatch, expected 'intp_t' but got 'long long' in 'const node_struct.feature_

    at new_error (/Users/slynn/tmp/demoland_onnx_test/runner/node_modules/.pnpm/pyodide@0.24.1/node_mod
    at wasm://wasm/02250ad6:wasm-function[295]:0x158827
    at wasm://wasm/02250ad6:wasm-function[452]:0x15fcd5
    at _PyCFunctionWithKeywords_TrampolineCall (/Users/slynn/tmp/demoland_onnx_test/runner/node_modules
    at wasm://wasm/02250ad6:wasm-function[1057]:0x1a3091
    at wasm://wasm/02250ad6:wasm-function[3387]:0x289e4d
    at wasm://wasm/02250ad6:wasm-function[2037]:0x1e3f77
    at wasm://wasm/02250ad6:wasm-function[1064]:0x1a3579
    at wasm://wasm/02250ad6:wasm-function[1067]:0x1a383a
    at wasm://wasm/02250ad6:wasm-function[1068]:0x1a38dc
    at wasm://wasm/02250ad6:wasm-function[3200]:0x2685c5
    at wasm://wasm/02250ad6:wasm-function[3201]:0x26e3d0
    at wasm://wasm/02250ad6:wasm-function[1070]:0x1a3a04
    at wasm://wasm/02250ad6:wasm-function[1065]:0x1a3694
    at wasm://wasm/02250ad6:wasm-function[440]:0x15f45e
    at Module.callPyObjectKwargs (/Users/slynn/tmp/demoland_onnx_test/runner/node_modules/.pnpm/pyodide@0.24.1/node_modules/pyodide/pyodide.asm.js:9:81732)
    at Module.callPyObject (/Users/slynn/tmp/demoland_onnx_test/runner/node_modules/.pnpm/pyodide@0.24.1/node_modules/pyodide/pyodide.asm.js:9:82066)
    at Timeout.wrapper [as _onTimeout] (/Users/slynn/tmp/demoland_onnx_test/runner/node_modules/.pnpm/pyodide@0.24.1/node_modules/pyodide/pyodide.asm.js:9:58562)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7) {
  type: 'ValueError',
  __error_address: 116329376
}

Things I have already checked

  • All versions of the libraries used are the same in both environments
  • Tried with both pickle and joblib

Hacky fix

What I found to work is the following: in Pyodide, after loading the model, manually changing the dtype of each predictor's nodes makes the model run fine. There is an example of this in the example repo.

import joblib
import numpy as np

Y_DTYPE = np.float64
X_DTYPE = np.float64
X_BINNED_DTYPE = np.uint8  # hence max_bins == 256
# dtype for gradients and hessians arrays
G_H_DTYPE = np.float32
X_BITSET_INNER_DTYPE = np.uint32


PREDICTOR_RECORD_DTYPE_2 = np.dtype([
    ('value', Y_DTYPE),
    ('count', np.uint32),
    ('feature_idx', np.int32),
    ('num_threshold', X_DTYPE),
    ('missing_go_to_left', np.uint8),
    ('left', np.uint32),
    ('right', np.uint32),
    ('gain', Y_DTYPE),
    ('depth', np.uint32),
    ('is_leaf', np.uint8),
    ('bin_threshold', X_BINNED_DTYPE),
    ('is_categorical', np.uint8),
    # The index of the corresponding bitsets in the Predictor's bitset arrays.
    # Only used if is_categorical is True
    ('bitset_idx', np.uint32)
])

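model = joblib.load("/model.joblib")

# Cast every predictor's nodes to the 32-bit-compatible record dtype.
# Each boosting iteration holds one predictor per tree (several for multiclass).
for predictors_at_iteration in model._predictors:
    for predictor in predictors_at_iteration:
        predictor.nodes = predictor.nodes.astype(PREDICTOR_RECORD_DTYPE_2)

# `data` is the feature matrix to predict on, defined elsewhere.
model.predict(data)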
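
A variant of this workaround that avoids hard-coding the whole record could derive the target dtype from the loaded array and only swap the width of one field. The helper name `with_native_intp` is hypothetical, and the three-field array below is a reduced stand-in for real predictor nodes; this is a sketch under those assumptions, not a tested fix for scikit-learn.

```python
import numpy as np


def with_native_intp(nodes, field="feature_idx"):
    """Return a copy of `nodes` with `field` cast to this platform's np.intp."""
    descr = [
        (name, np.intp if name == field else nodes.dtype[name])
        for name in nodes.dtype.names
    ]
    return nodes.astype(np.dtype(descr))


# Simulate nodes that were pickled on a platform with the other pointer width.
foreign_int = np.int32 if np.dtype(np.intp).itemsize == 8 else np.int64
nodes = np.zeros(
    2,
    dtype=[("value", np.float64), ("feature_idx", foreign_int), ("left", np.uint32)],
)
nodes["feature_idx"] = [3, 1]

fixed = with_native_intp(nodes)
print(nodes.dtype["feature_idx"], "->", fixed.dtype["feature_idx"])
```

With a loaded model, the same call would be applied as `predictor.nodes = with_native_intp(predictor.nodes)` for every predictor in `model._predictors`.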

Versions

python version 3.11.3 (main, May 15 2023, 10:43:03) [Clang 14.0.6 ]
sklearn version 1.3.1

System:
    python: 3.11.3 (main, May 15 2023, 10:43:03) [Clang 14.0.6 ]
executable: /Users/slynn/miniconda3/envs/demoland/bin/python
   machine: macOS-10.16-x86_64-i386-64bit

Python dependencies:
      sklearn: 1.3.1
          pip: 23.3
   setuptools: 68.0.0
        numpy: 1.25.2
        scipy: 1.11.3
       Cython: None
       pandas: 1.5.3
   matplotlib: None
       joblib: 1.3.2
threadpoolctl: 3.2.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
    num_threads: 10
         prefix: libomp
       filepath: /Users/slynn/miniconda3/envs/demoland/lib/python3.11/site-packages/sklearn/.dylibs/libomp.dylib
        version: None

       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /Users/slynn/miniconda3/envs/demoland/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: Nehalem

       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /Users/slynn/miniconda3/envs/demoland/lib/python3.11/site-packages/scipy/.dylibs/libopenblas.0.dylib
        version: 0.3.21.dev
threading_layer: pthreads
   architecture: Nehalem
