Fix/26837 format array #44527

TomAugspurger · 2021-11-19T18:12:55Z

closes Refactor format_array in pandas\io\formats\format.py #26837
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

The basic strategy is to move format_array from pandas.io.formats.format to ExtensionArray._format_array. By default, we convert to ndarray and then use the existing formatting mechanism on the ndarray.

Moves responsibility for converting EAs to a List[str] from pandas.io.formats to a method on the EA.
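A minimal pure-Python sketch of the delegation pattern described above. The names `ToyExtensionArray`, `ToyMoneyArray`, `to_plain`, and `format_values` are hypothetical stand-ins, not pandas APIs, and the `_format_array` in this PR also takes keyword-only options that are omitted here.

```python
from __future__ import annotations

from typing import Callable, Sequence


def format_values(
    values: Sequence[object], formatter: Callable | None = None
) -> list[str]:
    """Stand-in for the existing generic formatting path (hypothetical)."""
    fmt = formatter if formatter is not None else repr
    return [fmt(v) for v in values]


class ToyExtensionArray:
    """Hypothetical base class; only the delegation pattern matters here."""

    def __init__(self, data: Sequence[object]) -> None:
        self._data = list(data)

    def to_plain(self) -> list[object]:
        # Analogue of converting the array to an ndarray.
        return list(self._data)

    def _format_array(self, formatter: Callable | None = None) -> list[str]:
        # Default: convert to a plain container, then reuse the generic path.
        return format_values(self.to_plain(), formatter)


class ToyMoneyArray(ToyExtensionArray):
    """Hypothetical subclass that takes over its own string conversion."""

    def _format_array(self, formatter: Callable | None = None) -> list[str]:
        if formatter is not None:
            return [formatter(v) for v in self._data]
        return [f"${v:,.2f}" for v in self._data]


default_strings = ToyExtensionArray([1, 2.5])._format_array()
money_strings = ToyMoneyArray([1234.5, 2])._format_array()
```

The point of the design is that the I/O layer only calls `_format_array` and no longer needs to know which array type it is holding.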

pandas/core/arrays/base.py

pandas/core/indexes/datetimelike.py

TomAugspurger · 2021-11-19T18:16:13Z

pandas/io/formats/format.py

+            leading_space,
+            quoting,
+        )
+    elif is_datetime64_dtype(values.dtype):


I should check to see if these datetimelike branches are actually hit anymore.

yah. i expect that if you use isinstance(values, ExtensionArray) instead of is_extension_array_dtype above you can get the dt64 and td64 cases at the same time

pandas/tests/extension/test_format.py

jbrockmendel · 2021-11-20T00:10:52Z

pandas/core/arrays/datetimes.py

@@ -681,6 +685,43 @@ def _format_native_types(
            self.asi8, tz=self.tz, format=fmt, na_rep=na_rep
        )

+    def _format_array(
+        self,
+        formatter: Callable | None,


im not too familiar with how this gets reached. is e.g. formatter going to always be self._formatter?

I think you're right.

then can we do without the formatter arg?

Sorry, I was incorrect. formatter might be a user-supplied callable in cases like df.to_html(formatters={"col": formatter}).
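A rough sketch of why the `formatter` argument cannot simply be replaced by the array's own default: a per-column callable supplied by the user has to win when present. `resolve_formatter` and `format_columns` are hypothetical helpers written for illustration, and `repr` stands in for whatever default the array would provide.

```python
from __future__ import annotations

from typing import Callable


def resolve_formatter(
    user_formatter: Callable | None, array_formatter: Callable
) -> Callable:
    # A user-supplied callable takes precedence over the array's default.
    return user_formatter if user_formatter is not None else array_formatter


def format_columns(
    columns: dict[str, list],
    formatters: dict[str, Callable] | None = None,
) -> dict[str, list[str]]:
    formatters = formatters or {}
    out: dict[str, list[str]] = {}
    for name, values in columns.items():
        # `repr` is an assumed default, standing in for the array's formatter.
        fmt = resolve_formatter(formatters.get(name), repr)
        out[name] = [fmt(v) for v in values]
    return out


rendered = format_columns(
    {"a": [1.5, 2.0], "b": ["x", "y"]},
    formatters={"a": lambda v: f"{v:.1f}%"},
)
```

Only column "a" uses the user's callable; column "b" falls back to the default.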

pandas/core/arrays/datetimes.py

pandas/tests/extension/test_format.py

pandas/core/arrays/categorical.py

pandas/core/arrays/base.py

jbrockmendel · 2021-11-24T21:47:42Z

pandas/core/arrays/base.py

+
+        from pandas.io.formats.format import format_array
+
+        values = extract_array(self, extract_numpy=True)


why would we need to extract_array? i guess for PandasArray?

That's my guess too.

can you check if it's really necessary and if so, add a comment as to why

pandas/tests/extension/test_common.py

jbrockmendel · 2021-11-24T21:55:44Z

Can we use this to render unnecessary any of the other formatting-related methods out there? in particular im thinking of

DTA/TDA/PA._format_native_types
DTI/TDI/PI._format_with_header (matches Index._format_with_header but with different na_rep and extra kwarg)
IntervalIndex._format_native_types (matches Index._format_native_types but with different default value for na_rep)
IntervalIndex._format_data (# TODO: integrate with categorical and make generic)

TomAugspurger · 2021-11-24T22:17:49Z

I'm not sure, but it's not obvious to me. _format_array is designed to be generic. I would assume those are more specific and rely on some structure that can't be assumed in _format_array.

pandas/core/arrays/categorical.py

jreback · 2021-11-25T16:54:11Z

pandas/core/arrays/datetimes.py

+        decimal: str = ".",
+        leading_space: bool | None = True,
+        quoting: int | None = None,
+    ) -> list[str]:


same comment this seems like adding a lot of boilerplate that could be handled in the base class no?

This is slightly different than the Categorical case. Categorical wants to change the values passed to the fmt_klass. This is actually changing the fmt_klass itself.

We could add some method to the interface to get the formatting class for an array. I don't really think that we want to publicly expose the ArrayFormatter interface though.

If it's purely about the lines of code here, we could have a "private" _format_class on our DatetimeArrays and check for that attribute, and use it in the base class.

# in ExtensionArray._format_array
if hasattr(self, "_format_class"):
    fmt_klass = self._format_class
else:
    fmt_klass = GenericArrayFormatter

I dunno. This is all kind of messy.
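For reference, the attribute-lookup idea can be sketched in plain Python as follows. All class names here are hypothetical and the formatter interface is reduced to a single `get_result` method; this is an illustration of the hook, not the pandas implementation.

```python
from __future__ import annotations


class GenericFormatter:
    """Hypothetical default formatter class."""

    def __init__(self, values: list) -> None:
        self.values = values

    def get_result(self) -> list[str]:
        return [str(v) for v in self.values]


class UpperFormatter(GenericFormatter):
    """Hypothetical specialised formatter class."""

    def get_result(self) -> list[str]:
        return [str(v).upper() for v in self.values]


class BaseArray:
    def __init__(self, data: list) -> None:
        self._data = list(data)

    def _format_array(self) -> list[str]:
        # Use the subclass-provided class if present, else the generic one.
        fmt_klass = getattr(self, "_format_class", GenericFormatter)
        return fmt_klass(self._data).get_result()


class ShoutingArray(BaseArray):
    # Opting in only requires a private class attribute, not an override.
    _format_class = UpperFormatter


plain = BaseArray(["a", 1])._format_array()
loud = ShoutingArray(["a", "b"])._format_array()
```

`getattr` with a default is equivalent to the `hasattr` check above and keeps the base-class method to two lines.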

jreback · 2021-11-28T20:32:14Z

pandas/core/arrays/base.py

+        self,
+        formatter: Callable | None,
+        *,
+        float_format: FloatFormatType,


should we take this opportunity to make a FormatOptionsType that has all of these defaults?
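One possible shape for such an options container, sketched as a frozen dataclass. The name `FormatOptions`, the `None` default for `float_format`, and the helper `format_floats` are assumptions for illustration; only the `decimal`, `leading_space`, and `quoting` defaults are taken from the diff above.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class FormatOptions:
    # The default for float_format is assumed; the diff does not show one.
    float_format: Callable[[float], str] | None = None
    decimal: str = "."
    leading_space: bool | None = True
    quoting: int | None = None


def format_floats(
    values: list[float], options: FormatOptions = FormatOptions()
) -> list[str]:
    """Hypothetical consumer that reads every setting from one object."""
    out = []
    for v in values:
        fmt = options.float_format if options.float_format is not None else repr
        text = fmt(v).replace(".", options.decimal)
        if options.leading_space:
            text = " " + text
        out.append(text)
    return out


defaults = FormatOptions()
european = replace(defaults, decimal=",", leading_space=False)

default_out = format_floats([1.5, 2.25])
european_out = format_floats([1.5, 2.25], european)
```

Bundling the options this way would let each `_format_array` signature shrink to `formatter` plus one options argument, at the cost of one more type to document.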

jorisvandenbossche

Can you explain how this relates to the existing _formatter mechanism?
(here in the PR for reviewing purposes, but I think it will also need to be explained in the implementer's notes in arrays/base.py)

github-actions · 2022-01-01T01:34:44Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

jreback · 2022-01-16T19:12:38Z

@TomAugspurger can you merge main

mroeschke · 2022-03-06T01:23:28Z

Thanks for the PR, but it appears to have gone stale. Feel free to reopen when you have time to revisit.

Tom Augspurger added 2 commits November 19, 2021 11:25

Refactor EA formatting

3885851

Moves responsibility for converting EAs to a List[str] from pandas.io.formats to a method on the EA.

added specific test

a7ec260

TomAugspurger force-pushed the fix/26837-format-array branch from f551ecb to a7ec260 Compare November 19, 2021 18:13

cleanup

6a03c9a

jbrockmendel reviewed Nov 20, 2021

jreback added the Output-Formatting and Refactor labels Nov 20, 2021

jreback requested changes Nov 20, 2021

Tom Augspurger added 7 commits November 20, 2021 14:12

fixup

f84867b

docstring

9bcd61e

Merge remote-tracking branch 'upstream/master' into fix/26837-format-array

83cfe06

added to API reference

e33b675

mypy fixup

42ea58b

fixed keyword only

2c8bceb

Merge remote-tracking branch 'upstream/master' into fix/26837-format-array

e1f2cb3

jorisvandenbossche added the ExtensionArray label Nov 22, 2021

code checks

8dcb3e5

jbrockmendel reviewed Nov 24, 2021

jbrockmendel reviewed Nov 24, 2021

fixup

04f590b

jreback requested changes Nov 25, 2021

Merge remote-tracking branch 'upstream/master' into fix/26837-format-array

fb063f5

jreback requested changes Nov 28, 2021

jorisvandenbossche reviewed Dec 1, 2021

github-actions bot added the Stale label Jan 1, 2022

mroeschke closed this Mar 6, 2022
