
CLN: use idiomatic pandas_dtypes in pandas/dtypes/common.py #24541


Merged
merged 43 commits into from
Jan 4, 2019


jreback
Contributor

@jreback jreback commented Jan 2, 2019

closes #24593

Some benchmarks of pandas_dtype construction from dtype objects and from strings. The only slightly surprising result is Period[D].

[ 87.50%] ··· ============================================ =============
                                 dtype                                  
              -------------------------------------------- -------------
                             dtype('int64')                   463±20ns  
                             dtype('int32')                   450±20ns  
                            dtype('uint32')                   444±4ns   
                            dtype('uint64')                   484±20ns  
                            dtype('float32')                  494±30ns  
                            dtype('float64')                  507±30ns  
                             dtype('int16')                   471±30ns  
                             dtype('int8')                    506±30ns  
                            dtype('uint16')                   505±40ns  
                             dtype('uint8')                   485±20ns  
                              dtype('<M8')                    451±20ns  
                              dtype('<m8')                    480±20ns  
                               dtype('O')                    627±200ns  
                  pandas.core.arrays.integer.Int8Dtype        995±60ns  
                 pandas.core.arrays.integer.Int16Dtype        944±60ns  
                 pandas.core.arrays.integer.Int32Dtype        939±70ns  
                 pandas.core.arrays.integer.Int64Dtype        988±30ns  
                 pandas.core.arrays.integer.UInt8Dtype        924±50ns  
                 pandas.core.arrays.integer.UInt16Dtype       954±60ns  
                 pandas.core.arrays.integer.UInt32Dtype       966±70ns  
                 pandas.core.arrays.integer.UInt64Dtype     1.00±0.06μs 
               pandas.core.dtypes.dtypes.CategoricalDtype     978±30ns  
                pandas.core.dtypes.dtypes.IntervalDtype      929±300ns  
                          datetime64[ns, UTC]               1.52±0.03μs 
                               period[D]                      958±9ns   
                                 int64                       16.1±0.1μs 
                                 int32                       16.1±0.6μs 
                                 uint32                      16.0±0.1μs 
                                 uint64                      16.2±0.3μs 
                                float32                      15.8±0.4μs 
                                float64                      15.7±0.6μs 
                                 int16                       16.0±0.2μs 
                                  int8                      15.7±0.06μs 
                                 uint16                     15.9±0.08μs 
                                 uint8                       17.3±0.6μs 
                               datetime64                    16.2±0.3μs 
                              timedelta64                    16.0±0.1μs 
                                 object                      16.0±0.2μs 
                                  Int8                       6.24±0.1μs 
                                 Int16                       7.41±0.2μs 
                                 Int32                      8.15±0.04μs 
                                 Int64                       9.39±0.1μs 
                                 UInt8                       10.1±0.1μs 
                                 UInt16                     11.0±0.04μs 
                                 UInt32                      11.8±0.5μs 
                                 UInt64                     12.7±0.08μs 
                                category                    2.62±0.03μs 
                                interval                     6.72±0.2μs 
                          datetime64[ns, UTC]                2.93±0.1μs 
                               period[D]                     52.6±0.6μs 
              ============================================ =============
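For readers who want a rough local comparison without the asv harness, here is a minimal sketch using `timeit`. It is not the benchmark that produced the table above; the helper name `time_pandas_dtype` and the call count are assumptions for illustration, and absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
from pandas.api.types import pandas_dtype

# Inputs mirroring two rows of the table: an already-constructed
# numpy dtype object and the equivalent string alias.
candidates = {
    "dtype object": np.dtype("int64"),
    "string alias": "int64",
}


def time_pandas_dtype(value, number=10_000):
    """Return the average seconds per pandas_dtype(value) call (hypothetical helper)."""
    total = timeit.timeit(lambda: pandas_dtype(value), number=number)
    return total / number


timings = {label: time_pandas_dtype(value) for label, value in candidates.items()}
for label, seconds in timings.items():
    print(f"{label}: {seconds * 1e6:.2f} µs per call")
```

Both inputs should resolve to the same dtype; only the lookup cost differs.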

@jreback jreback added the Dtype Conversions and Clean labels Jan 2, 2019
@jreback jreback added this to the 0.24.0 milestone Jan 2, 2019
@jreback
Contributor Author

jreback commented Jan 2, 2019

cc @TomAugspurger @jbrockmendel

should be orthogonal to anything you are doing.

@codecov

codecov bot commented Jan 2, 2019

Codecov Report

Merging #24541 into master will decrease coverage by <.01%.
The diff coverage is 98.38%.


@@            Coverage Diff             @@
##           master   #24541      +/-   ##
==========================================
- Coverage   92.32%   92.32%   -0.01%     
==========================================
  Files         166      166              
  Lines       52440    52368      -72     
==========================================
- Hits        48417    48349      -68     
+ Misses       4023     4019       -4
Flag        Coverage Δ
#multiple   90.74% <98.38%> (-0.01%) ⬇️
#single     42.98% <80.64%> (-0.03%) ⬇️

Impacted Files                          Coverage Δ
pandas/core/indexes/numeric.py          97.32% <100%> (ø) ⬆️
pandas/core/internals/construction.py   96.67% <100%> (ø) ⬆️
pandas/core/dtypes/common.py            96.2% <98.24%> (+0.57%) ⬆️
pandas/util/testing.py                  87.59% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf92230...053c14a. Read the comment docs.

@codecov

codecov bot commented Jan 2, 2019

Codecov Report

Merging #24541 into master will decrease coverage by <.01%.
The diff coverage is 98.68%.


@@            Coverage Diff             @@
##           master   #24541      +/-   ##
==========================================
- Coverage   92.38%   92.38%   -0.01%     
==========================================
  Files         166      166              
  Lines       52490    52396      -94     
==========================================
- Hits        48493    48404      -89     
+ Misses       3997     3992       -5
Flag        Coverage Δ
#multiple   90.8% <98.68%> (-0.01%) ⬇️
#single     42.98% <71.05%> (-0.07%) ⬇️

Impacted Files                          Coverage Δ
pandas/core/frame.py                    96.92% <ø> (ø) ⬆️
pandas/core/dtypes/concat.py            96.6% <100%> (-0.04%) ⬇️
pandas/core/arrays/integer.py           96.32% <100%> (+0.02%) ⬆️
pandas/core/dtypes/cast.py              88.72% <100%> (ø) ⬆️
pandas/core/indexes/numeric.py          97.32% <100%> (ø) ⬆️
pandas/core/internals/concat.py         96.45% <100%> (-0.37%) ⬇️
pandas/core/internals/construction.py   96.68% <100%> (ø) ⬆️
pandas/core/dtypes/common.py            96.78% <98.43%> (+1.66%) ⬆️
pandas/core/internals/blocks.py         94.21% <0%> (-0.33%) ⬇️
... and 8 more

Powered by Codecov. Last update 6249355...7d4bd5e. Read the comment docs.

@jreback jreback force-pushed the dtypes branch 2 times, most recently from c6c3964 to a19cdfc Compare January 2, 2019 03:20
@jreback
Contributor Author

jreback commented Jan 2, 2019

@jbrockmendel if any comments

@jbrockmendel
Member

A few comments, one possible request for a test. Otherwise LGTM.

@jreback
Contributor Author

jreback commented Jan 2, 2019

updated

@jreback
Contributor Author

jreback commented Jan 2, 2019

I have a couple more changes here.

@jbrockmendel
Member

I'll take a look

@jreback
Contributor Author

jreback commented Jan 3, 2019

ok this is ready. @jorisvandenbossche @jbrockmendel

@jbrockmendel
Member

LGTM.

The part I'm least familiar with is the IntNA, so grain of salt.

Side-note: will this fix these warnings I get in the test logs?

pandas/core/dtypes/common.py:1864
  [...]pandas/core/dtypes/common.py:1864: DeprecationWarning: Numeric-style type codes are deprecated and will result in an error in the future.
    return _get_dtype_type(np.dtype(arr_or_dtype))

@jreback
Contributor Author

jreback commented Jan 3, 2019

yes those warnings are gone
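One way to check this locally is to record warnings while resolving an IntegerNA alias. The sketch below assumes a pandas version that includes this change; `numeric_style_warnings` is a hypothetical helper, not part of pandas. It covers only the `pandas_dtype` path, not every code path that could emit the warning.

```python
import warnings

from pandas.api.types import pandas_dtype


def numeric_style_warnings(alias):
    """Resolve `alias` with pandas_dtype and return the dtype together with any
    'Numeric-style type codes' DeprecationWarnings recorded along the way."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        dtype = pandas_dtype(alias)
    hits = [
        w
        for w in caught
        if issubclass(w.category, DeprecationWarning)
        and "Numeric-style type codes" in str(w.message)
    ]
    return dtype, hits


dtype, hits = numeric_style_warnings("Int64")
print(dtype, len(hits))
```

An empty `hits` list means the string was resolved without falling through to the numpy lookup that raised the warning.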

@TomAugspurger
Contributor

So I think this closes #24593 then?

@jreback
Contributor Author

jreback commented Jan 3, 2019

yes this closes #24593 as well. will address comments later today.

@jorisvandenbossche
Member

"yes this closes #24593 as well"

There are still a bunch of those deprecation warnings in the last Travis build in this PR (although they may be coming from another place; I didn't check), so strictly speaking this does not yet close #24593 completely.

@jreback
Contributor Author

jreback commented Jan 3, 2019

@jorisvandenbossche where? #24541 (comment)

@jorisvandenbossche
Member

E.g. https://travis-ci.org/pandas-dev/pandas/jobs/474964674 (first entry of the last green Travis build at the time I commented before) still has some "DeprecationWarning: Numeric-style type codes are deprecated and will result in an error in the future." warnings in the log

@jreback
Contributor Author

jreback commented Jan 4, 2019

I updated the whatsnew about #21681 as well.

@jreback
Contributor Author

jreback commented Jan 4, 2019

ok warnings fixed up. Basically we were missing paths in core/internals/concat.py and core/dtypes/concat.py for EA operations.

@@ -139,7 +139,7 @@ def lfilter(*args, **kwargs):
 Hashable = collections.abc.Hashable
 Iterable = collections.abc.Iterable
 Mapping = collections.abc.Mapping
-MutableMapping = collections.abc.MutableMapping
+MutableMapping = collections.MutableMapping
Member

I think the one with abc included was actually correct? (CI is failing now due to the warning this raises)
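For context, a minimal standard-library sketch of the aliasing pattern in question. The ABCs are documented under `collections.abc`; accessing them through the top-level `collections` module raised a DeprecationWarning on Python 3.7 to 3.9 and no longer works from Python 3.10. The surrounding compat module is not reproduced here.

```python
import collections.abc

# Alias the ABCs from their documented location, collections.abc.
Hashable = collections.abc.Hashable
Iterable = collections.abc.Iterable
Mapping = collections.abc.Mapping
MutableMapping = collections.abc.MutableMapping

# The aliases behave as the usual abstract base classes.
print(issubclass(dict, MutableMapping))
print(isinstance((1, 2), Hashable))
```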

@jreback jreback merged commit 19f715c into pandas-dev:master Jan 4, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Labels
Clean Dtype Conversions Unexpected or buggy dtype conversions
Development

Successfully merging this pull request may close these issues.

COMPAT: IntegerXXDtypes string repr cause numpy deprecation warnings
4 participants