Numpy dtypes have a length defined. For normal dtypes, this is 0:
In [21]: len(np.dtype('int64'))
Out[21]: 0
but for structured dtypes this is the number of fields.
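To make the structured case concrete, here is a small sketch. The field names `x` and `y` are made up for illustration.

```python
import numpy as np

# A plain dtype has no fields, so its length is 0.
plain = np.dtype("int64")

# A structured dtype with two illustrative fields.
structured = np.dtype([("x", "float64"), ("y", "int32")])

print(len(plain))        # 0
print(len(structured))   # 2
print(structured.names)  # ('x', 'y')
```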
For our custom dtypes, you get a TypeError:
In [25]: s = pd.Series(['a', 'b', 'c']).astype('category')
In [26]: s.dtypes
Out[26]: CategoricalDtype(categories=['a', 'b', 'c'], ordered=False)
In [27]: len(s.dtypes)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-27-3d39f9ba3f74> in <module>
----> 1 len(s.dtypes)
TypeError: object of type 'CategoricalDtype' has no len()
In hindsight, using the len of the dtypes in sklearn was maybe not the most robust idea, but that said: should we follow numpy here and also define a __len__ on our custom dtypes?
(I am personally not fully sure it is needed)
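For downstream code that only wants the number of fields, one possible way to avoid depending on `__len__` is to look at `names` instead. This is only a sketch and not necessarily what the sklearn fix does; `n_fields` and `NoLenDtype` are hypothetical names used for illustration.

```python
import numpy as np


class NoLenDtype:
    """Hypothetical stand-in for a custom dtype that defines no __len__."""

    names = None


def n_fields(dtype):
    """Return the number of fields of a dtype-like object, or 0 if it has none."""
    # numpy sets .names to None for plain dtypes and to a tuple of
    # field names for structured dtypes. Objects without a .names
    # attribute are treated as having no fields.
    names = getattr(dtype, "names", None)
    return 0 if names is None else len(names)


print(n_fields(np.dtype("int64")))                     # 0
print(n_fields(np.dtype([("x", "f8"), ("y", "i4")])))  # 2
print(n_fields(NoLenDtype()))                          # 0
```

The same call should also work on `s.dtypes` from the example above, since the helper never calls `len` on the dtype itself, but that is not verified here.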
Yes, in any case we will fix this in sklearn, since we want it to work with released versions of pandas (PR scikit-learn/scikit-learn#12706).
Given that, I also don't see strong reasons to define it for pandas. The question is mostly how far we want our dtypes to be compatible with numpy dtypes.
This came up in scikit-learn/scikit-learn#12699.