-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Fix/26837 format array #44527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix/26837 format array #44527
Conversation
Moves responsibility for converting EAs to a List[str] from pandas.io.formats to a method on the EA.
f551ecb
to
a7ec260
Compare
leading_space, | ||
quoting, | ||
) | ||
elif is_datetime64_dtype(values.dtype): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should check to see if these datetlike are actually hit anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yah. i expect that if you use isinstance(values, ExtensionArray)
instead of is_extension_array_dtype
above you can get the dt64 and td64 cases at the same time
@@ -681,6 +685,43 @@ def _format_native_types( | |||
self.asi8, tz=self.tz, format=fmt, na_rep=na_rep | |||
) | |||
|
|||
def _format_array( | |||
self, | |||
formatter: Callable | None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im not too familiar with how this gets reached. is e.g. formatter going to always be self._formatter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then can we do without the formatter arg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I was incorrect. formatter
might be a user-supplied callable in cases like df.to_html(formatter={"col": formatter})
.
|
||
from pandas.io.formats.format import format_array | ||
|
||
values = extract_array(self, extract_numpy=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would we need to extract_array? i guess for PandasArray?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's my guess too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you check if its really necessary and if so, add a comment as to why
Can we use this to render unnecessary any of the other formatting-related methods out there? in particular im thinking of
|
I'm not sure, but it's not obvious to me. |
decimal: str = ".", | ||
leading_space: bool | None = True, | ||
quoting: int | None = None, | ||
) -> list[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment this seems like adding a lot of boilerplate that could be handled in the base class no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly different than the Categorical case. Categorical wants to change the values
passed to the fmt_klass
. This is actually changing the fmt_klass
itself.
We could add some method to the interface to get the formatting class for an array. I don't really think that we want to publicly expose the ArrayFormatter interface publicly though.
If it's purely about the lines of code here, we could have a "private" _fmt_klass
on our DatetimeArrays and check for that attribute, and use it in the base class.
# in ExtensionArray._format_array
if hasattr(self, "_format_class"):
fmt_klass = self._format_class
else:
fmt_klass = GenericArrayFormatter
I dunno. This is all kind of messy.
self, | ||
formatter: Callable | None, | ||
*, | ||
float_format: FloatFormatType, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we take this opportunity to make a FormatOptionsType that has all of these defaults?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain how this related to the existing _formatter
mechanism?
(here in the PR for reviewing purpose, but it will also need to explained in the implementer's notes in arrays/base.py I think)
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
@TomAugspurger can you merge main |
Thanks for the PR, but it appears to have gone stale. Feel free to reopen when you have time to revisit. |
The basic strategy is to move
format_array
from pandas.io.formats.format toExtensionArray._format_array
. By default, we convert to ndarray and then use the existing formatting mechanism on the ndarray.