Different prediction of tree classifier between 32 bits and 64 bits #8853

@glemaitre

Description

While investigating failing tests in imbalanced-learn on a 32-bit machine (we are kinda poor ;)), we found that DecisionTreeClassifier can return different results on 32-bit and 64-bit architectures, even with the same data and the same random_state.

Steps/Code to Reproduce

import numpy as np
from sklearn.tree import DecisionTreeClassifier

RND_SEED = 0
X = np.array([[-0.3879569, 0.6894251], [-0.09322739, 1.28177189],
              [-0.77740357, 0.74097941], [0.91542919, -0.65453327],
              [-0.03852113, 0.40910479], [-0.43877303, 1.07366684],
              [-0.85795321, 0.82980738], [-0.18430329, 0.52328473],
              [-0.30126957, -0.66268378], [-0.65571327, 0.42412021],
              [-0.28305528, 0.30284991], [0.20246714, -0.34727125],
              [1.06446472, -1.09279772], [0.30543283, -0.02589502],
              [-0.00717161, 0.00318087]])
y = np.array([0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0])

train_index = [0, 1, 2, 3, 6, 7, 8, 10, 11, 12, 13, 14]
test_index = [4, 5, 9]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X[train_index], y[train_index])
proba = clf.predict_proba(X[test_index])
print(proba)

Expected Results

This is the result obtained on the 64-bit machine:

[[ 1.  0.]
 [ 0.  1.]
 [ 1.  0.]]

Actual Results

This is the result obtained on the 32-bit machine; the prediction for the third test sample (index 9) is flipped:

[[ 1.  0.]
 [ 0.  1.]
 [ 0.  1.]]
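
To narrow down where the two machines diverge, it may help to dump the fitted tree itself rather than only the probabilities. The following is a minimal diagnostic sketch, not a fix: it refits the same classifier on the data above and prints the interpreter's pointer width together with the feature and threshold chosen at each internal node. The helper names `pointer_bits` and `describe_tree` are made up for this illustration; `tree_.node_count`, `tree_.feature`, `tree_.threshold`, `tree_.children_left` and `tree_.children_right` are attributes of the fitted tree. The root cause is not established here, so the output is only meant to be compared side by side between the 32-bit and 64-bit runs.

```python
import struct

import numpy as np
from sklearn.tree import DecisionTreeClassifier


def pointer_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def describe_tree(clf):
    """List (node_id, feature, threshold) for every internal node of a fitted tree."""
    tree = clf.tree_
    splits = []
    for node in range(tree.node_count):
        # A leaf has the same value (-1) for both child pointers.
        if tree.children_left[node] != tree.children_right[node]:
            splits.append((node, int(tree.feature[node]),
                           float(tree.threshold[node])))
    return splits


# Same data and split as in the report above.
X = np.array([[-0.3879569, 0.6894251], [-0.09322739, 1.28177189],
              [-0.77740357, 0.74097941], [0.91542919, -0.65453327],
              [-0.03852113, 0.40910479], [-0.43877303, 1.07366684],
              [-0.85795321, 0.82980738], [-0.18430329, 0.52328473],
              [-0.30126957, -0.66268378], [-0.65571327, 0.42412021],
              [-0.28305528, 0.30284991], [0.20246714, -0.34727125],
              [1.06446472, -1.09279772], [0.30543283, -0.02589502],
              [-0.00717161, 0.00318087]])
y = np.array([0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0])
train_index = [0, 1, 2, 3, 6, 7, 8, 10, 11, 12, 13, 14]
test_index = [4, 5, 9]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X[train_index], y[train_index])
splits = describe_tree(clf)
proba = clf.predict_proba(X[test_index])

print("interpreter pointer width: %d bits" % pointer_bits())
for node, feature, threshold in splits:
    # repr() keeps the full precision of the threshold for comparison.
    print("node %d: X[:, %d] <= %r" % (node, feature, threshold))
print(proba)
```

If the listed splits differ between the two machines, the discrepancy arises while the tree is built; if they are identical, it arises at prediction time.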

Versions

Ubuntu 16.04, 32-bit and 64-bit
Python 2.7.13
numpy 1.12.1; scipy 0.19; scikit-learn 0.18.1 (all installed from conda)
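
For anyone reproducing this on another machine, the same information can be collected programmatically. This is a small sketch using only the standard library and the `__version__` attributes of the three packages; the function name `environment_report` is made up for this example.

```python
import platform
import sys

import numpy
import scipy
import sklearn


def environment_report():
    """Return a short multi-line description of the running environment."""
    lines = [
        "platform: " + platform.platform(),
        "architecture: " + platform.architecture()[0],
        "python: " + sys.version.split()[0],
        "numpy: " + numpy.__version__,
        "scipy: " + scipy.__version__,
        "scikit-learn: " + sklearn.__version__,
    ]
    return "\n".join(lines)


report = environment_report()
print(report)
```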
