-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-87790: support thousands separators for formatting fractional part of floats #125304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…floats ```pycon >>> f"{123_456.123_456:_._f}" # Whole and fractional '123_456.123_456' >>> f"{123_456.123_456:_f}" # Integer component only '123_456.123456' >>> f"{123_456.123_456:._f}" # Fractional component only '123456.123_456' >>> f"{123_456.123_456:.4_f}" # with precision '123456.1_235' ```
b648865
to
5858d8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
(JFR, review request on d.p.o: https://discuss.python.org/t/71229) |
|
||
/* Determine the grouping, separator, and decimal point, if any. */ | ||
if (get_locale_info(format->type == 'n' ? LT_CURRENT_LOCALE : | ||
format->thousands_separators, | ||
format->frac_thousands_separator, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are we going to do with the locale? At the very least I would reject specifying the separator manually:
>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'af_za')
>>> print(f"{123_456.123_456:.12,n}")
123.456,123,456 # <- 2 decimal points
Would it be a breaking change to do this?
>>> print(f"{123_456.123_456:.12n}")
123.456,123.456
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just reject separators with the n
type.
Edit:
Would it be a breaking change to do this?
Yes, this will be a compatibility break.
IMO, without a legacy - it would be better to use thousands separators both for integer and fractional part. Then we could do same with n
type formatting.
Edit2:
Maybe we should try something more closer to the original implementation? I.e. with a distinct optional frac_grouping
option, which can have _
value. If present:
- for floating-point format types that means: "apply current settings for thousands separators both to integer and fractional part";
- for
n
type that means: "apply current locale settings for thousands separators both to integer and fractional part".
This is less flexible (you can't put separators to the fractional part only), but will more consistently interact with locale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not possible to use the underscore separator with n
format:
>>> format(1234, "_n")
ValueError: Cannot specify '_' with 'n'.
So I agree that we should reject separators with the n
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See invalid_thousands_separator_type() in parse_internal_render_format_spec().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vstinner, this check already added. Unfortunately, that means with n
type we can have locale-dependent separators only for integer part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, should we allow different separators? So far it's permitted:
>>> f"{122222.22222:_.7,f}"
'122_222.222,220,0'
Misc/NEWS.d/next/Core_and_Builtins/2024-10-11-10-41-05.gh-issue-87790.mlfEGl.rst
Show resolved
Hide resolved
|
||
/* Determine the grouping, separator, and decimal point, if any. */ | ||
if (get_locale_info(format->type == 'n' ? LT_CURRENT_LOCALE : | ||
format->thousands_separators, | ||
format->frac_thousands_separator, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not possible to use the underscore separator with n
format:
>>> format(1234, "_n")
ValueError: Cannot specify '_' with 'n'.
So I agree that we should reject separators with the n
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the multiple updates.
@serhiy-storchaka: Would you mind to review this change? |
Doc/library/string.rst
Outdated
@@ -312,7 +312,7 @@ non-empty format specification typically modifies the result. | |||
The general form of a *standard format specifier* is: | |||
|
|||
.. productionlist:: format-spec | |||
format_spec: [[`fill`]`align`][`sign`]["z"]["#"]["0"][`width`][`grouping_option`]["." `precision`][`type`] | |||
format_spec: [[`fill`]`align`][`sign`]["z"]["#"]["0"][`width`][`grouping_option`]["." `precision` [`grouping_option`]][`type`] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not accurate. More accurate would be ["." `precision` [`grouping_option`] | "." `grouping_option`]
or ["." (`precision` [`grouping_option`] | `grouping_option`)]
or ["." [`precision`][`grouping_option`]]
with addition that either precision or grouping_option should be specified after ".".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this:
.. productionlist:: format-spec
format_spec: [`options`][`width_and_precision`][`type`]
options: [[`fill`]`align`][`sign`]["z"]["#"]["0"]
fill: <any character>
align: "<" | ">" | "=" | "^"
sign: "+" | "-" | " "
width_and_precision: [`width_with_grouping`][`precision_with_grouping`]
width_with_grouping: [`width`][`grouping_option`]
precision_with_grouping: "." (`precision`[`grouping_option`] | `grouping_option`)
width: `~python-grammar:digit`+
grouping_option: "_" | ","
precision: `~python-grammar:digit`+
type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g"
: | "G" | "n" | "o" | "s" | "x" | "X" | "%"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant [...]
in precision_with_grouping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
pos++; | ||
} | ||
|
||
while (pos<end && Py_ISDIGIT(PyUnicode_READ(kind, data, pos))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be in the if block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, which if?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this the way we will go, the implementation LGTM.
@vstinner, @serhiy-storchaka should we ask someone else to review this pr? |
I merged your PR, thanks. |
…l part of floats (python#125304) ```pycon >>> f"{123_456.123_456:_._f}" # Whole and fractional '123_456.123_456' >>> f"{123_456.123_456:_f}" # Integer component only '123_456.123456' >>> f"{123_456.123_456:._f}" # Fractional component only '123456.123_456' >>> f"{123_456.123_456:.4_f}" # with precision '123456.1_235' ```
Notes for reviewers.
This is a simplest version, usingSecond commit puts separators, iterating from the least significand digit of fractional part._PyUnicode_InsertThousandsGrouping()
like for the integer part. But perhaps we should insert digits in the fractional part, starting from the least significand digit instead, i.e. last example will print123456.123_5
instead.._3f
- for grouping in three digits)._f
will be "use underscore's both in the integer and the fractional part".Also comma as a separator? (Perhaps, it's better to do in a separate pr.)📚 Documentation preview 📚: https://cpython-previews--125304.org.readthedocs.build/