Design document for percent formatting #1068

Open
wants to merge 25 commits into
base: main

aphillips
Member

@aphillips aphillips commented Apr 7, 2025

This document now includes the proposed design discussed in the (poorly documented, sparsely attended) 2025-05-19 call. The emerging consensus appears to be:

  • :number/:integer with option style=percent (this is the current design) with scaling
  • :unit with unit=percent is REQUIRED, but other units are not required; this function does NO scaling
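
For illustration only, the scaling difference between these two options can be seen in ECMA-402's `Intl.NumberFormat`, which has both a scaling percent style and a non-scaling percent unit. This is a TypeScript sketch of the analogous JavaScript behavior, not of MessageFormat 2 itself; the helper names `formatScaledPercent` and `formatUnscaledPercent` are hypothetical.

```typescript
// Analogue of ":number style=percent": the formatter multiplies by 100.
// (Hypothetical helper name, for illustration only.)
function formatScaledPercent(value: number, locale: string): string {
  return new Intl.NumberFormat(locale, { style: "percent" }).format(value);
}

// Analogue of ":unit unit=percent": the value is formatted as-is, no scaling.
// (Hypothetical helper name, for illustration only.)
function formatUnscaledPercent(value: number, locale: string): string {
  return new Intl.NumberFormat(locale, { style: "unit", unit: "percent" }).format(value);
}

// The same visible percentage needs different operands:
const scaled = formatScaledPercent(0.5, "en");    // operand is a fraction
const unscaled = formatUnscaledPercent(50, "en"); // operand is already a percentage
```

With these helpers, a caller who passes 0.5 to the non-scaling variant would get half a percent rather than fifty percent, which is the kind of confusion the scaling debate is about.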

The most recent commit posits that :number/:integer select after scaling because we don't support fraction selection currently.

Note

All previous conversations were marked resolved on purpose and not because their content was in any way deficient, off topic, or necessarily addressed. Please comment on the proposed design.

This document is focused for now on documenting the options.
@aphillips added the design (Design document or issues related to design), functions (Issue pertains to the default function set), and LDML48 (LDML48 Release) labels on Apr 7, 2025
Comment on lines 145 to 146
Note that the selector selects on the scaled value
(selectors currently cannot select fractional parts)
Collaborator

Highlighting for @sffc and @ryzokuken that this may require supporting something like style as an Intl.PluralRules option, or being ok with Intl.MessageFormat offering a capability beyond Intl.PluralRules.

Is plural category selection on percent-formatted values already well supported by the ICU libraries?

Collaborator

Also, if selection makes sense for :number style=percent, is there a reason to not allow selection for :unit unit=percent as well? And if the latter is fine, then why not allow selection for all of :unit.

My preference would be for us to not allow for selection with percent formatting at this time.

Member Author

> this may require supporting something like style as an Intl.PluralRules option

I don't think this necessarily follows? The function handler would scale the value before using plural rules, just as the function handler calls Intl.NumberFormat to format the number later.

> My preference would be for us to not allow for selection with percent formatting at this time.

This would require a separate placeholder just to perform message selection on what is fundamentally a number.
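
As a rough sketch of the handler behavior described in this comment, assuming a JavaScript implementation: the handler scales the operand itself before consulting `Intl.PluralRules`, and separately hands the unscaled operand to `Intl.NumberFormat`. The `percentHandler` name and its shape are hypothetical and not taken from any spec text.

```typescript
// Hypothetical handler for a scaling percent function (illustrative only).
function percentHandler(locale: string) {
  const plural = new Intl.PluralRules(locale);
  const formatter = new Intl.NumberFormat(locale, { style: "percent" });
  return {
    // Selection: scale first, then ask plural rules about the scaled value.
    select: (value: number): string => plural.select(value * 100),
    // Formatting: Intl.NumberFormat applies the x100 scaling on its own.
    format: (value: number): string => formatter.format(value),
  };
}

const handler = percentHandler("en");
const category = handler.select(0.01); // selects on 1, not on 0.01
const text = handler.format(0.01);
```

Because binary floating point can make `value * 100` slightly inexact for some inputs, a real handler would presumably need to decide how to round before selecting; this sketch does not address that.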

Collaborator

> > this may require supporting something like style as an Intl.PluralRules option
>
> I don't think this necessarily follows? The function handler would scale the value before using plural rules, just as the function handler calls Intl.NumberFormat to format the number later.

Quite recently in tc39/ecma402#989 we added notation as one of the Intl.PluralRules options. It would therefore be a bit surprising to effectively support style=percent for selection scaling in Intl.MessageFormat, but not for Intl.PluralRules.

> > My preference would be for us to not allow for selection with percent formatting at this time.
>
> This would require a separate placeholder just to perform message selection on what is fundamentally a number.

Does your position generalise to preferring selection to work on all :unit values?

Member Author

> Does your position generalise to preferring selection to work on all :unit values?

I'm not sure. I think it might, at least in terms of exact selection. Plural rules don't work well/aren't needed (what's actually wanted is more like inflection). But there are plenty of cases where one might want a specific value to produce a specific message, with 0 being the most common case.

A lot of unit selection cases actually look like what ChoiceFormat is good at: separating messages above/below a given threshold, e.g. switching from kilometers to meters (or miles to feet) in driving directions at some cutoff.
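
A minimal sketch of that threshold pattern, assuming a JavaScript environment with `Intl.NumberFormat` unit support. The function name `formatDistance`, the 1000 m cutoff, and the fraction-digit choices are illustrative assumptions, not part of any proposal here.

```typescript
// Hypothetical ChoiceFormat-style cutoff: below the threshold show meters,
// at or above it show kilometers. Cutoff and rounding are assumptions.
function formatDistance(meters: number, locale: string, cutoffMeters = 1000): string {
  const useKilometers = meters >= cutoffMeters;
  const formatter = new Intl.NumberFormat(locale, {
    style: "unit",
    unit: useKilometers ? "kilometer" : "meter",
    maximumFractionDigits: useKilometers ? 1 : 0,
  });
  return formatter.format(useKilometers ? meters / 1000 : meters);
}

const near = formatDistance(300, "en");  // formatted in meters
const far = formatDistance(2500, "en");  // formatted in kilometers
```

The selection here depends on a numeric range rather than on a plural category, which is why plural rules are a poor fit for it.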

Member

@sffc sffc left a comment

I'm not happy with the Proposed Design: Let's not add the style option back to :number. It was productive when we removed it, and I don't want to go back. Among other things, it causes problems when mixed with other options, has questionable plural selection behavior as @eemeli noted, and is inconsistent with Intl.NumberFormat, which uses style for currency and unit as well.

I also do not like requiring :unit unit=percent, because we want percent formatting to not pull in all unit formatting data, but this is unavoidable in this design for reasons the ICU4X TC has expressed previously.


## Constraints

_What prior decisions and existing conditions limit the possible design?_
Member

I think we should mention that since MF1 scales, we want MF2 to be able to scale?

Member Author

We don't necessarily want MF2 to scale. There is a hot debate about whether scaling or non-scaling is preferred.

@aphillips
Member Author

(chair hat OFF)

Adding style=percent to :number would not prevent us from having a convenience function :percent later (or now). This suggests that we prefer a "lumping" vs. "splitting" design for functions, with at least one function with "all the options", possibly surrounded by a host of convenience functions. This is to some degree what we have now, e.g.:

  • :number with :integer and possibly :percent and other later functions as convenience
  • :datetime with :date, :time (and as many as 16 more) functions as convenience

I'm not opposed to adding convenience functions. I think :integer was exactly the right thing to do for our users.

Alternatively we could regard :percent as a separate function handler (including with the special selection logic related to scaling) with targeted options. This suggests that we might add other specialized functions in the future, although it doesn't require it. We also have this as a model:

  • :number has :currency as a (draft) friend. :number cannot format currency values (it can format a number from a currency, but not as a currency)

I think :unit unit=percent is a red herring. It exists because carving out percent from CLDR units is janky (especially given that per-mille and per-myriad exist). My guess is that no one will use it when there's a big shiny function :percent available. Having two baroque means ({$p :number style=percent} and {$p :unit unit=percent}) for formatting percentages is just weird: we have a common-enough use case and two different, equally-inconvenient means of formatting it?

My preferred solution is:

  • add :percent
  • do NOT add :number style=percent
  • do NOT require :unit unit=percent (but permit it)

(chair hat ON)

I observe that we're not closing on a design. If we do not achieve consensus on a design in the next (2025-06-02) call, I will call for a ballot.

@eemeli
Collaborator

eemeli commented Jun 8, 2025

I talked with @sffc while we met in person this past week, and it came up that ICU4X would almost certainly want unit formatting to be separated by category. As in, when formatting a value that includes its own unit (say, kilometer), the expression would need to define at least length as the category of supported units. The rationale here is to limit the data loading that would be required for the formatter before it can tell exactly which unit it'll be formatting.

If that is a requirement that we accept, then it suggests to me that we ought to include the category in the function name, so we'd have e.g. :unit:length, :unit:volume and so on, rather than a catch-all :unit. With such an approach, we ought to consider a dedicated :unit:percent, and if so, promote its use rather than adding a style=percent on :number.

@aphillips
Member Author

@eemeli suggested:

> If that is a requirement that we accept, then it suggests to me that we ought to include the category in the function name, so we'd have e.g. :unit:length, :unit:volume and so on, rather than a catch-all :unit. With such an approach, we ought to consider a dedicated :unit:percent, and if so, promote its use rather than adding a style=percent on :number.

So :unit would be a namespace? We don't permit nested namespaces, so that would limit implementation-specific extension. Perhaps use an unreserved sigil as a separator instead, e.g. :unit-length, :unit-volume, :unit-percent. Only, once we do that, the unit- part starts to look superfluous. What's the difference between :unit-percent and :percent? Similarly, why type :unit-length instead of :length?

The "requirement" is really a dodge around creating separate functions for each unit or around creating a single function whose data loading depends on an option (or on the operand value). Most implementations bind the unit data late, but we should allow for ICU4X and its need/desire to bind the data early.

@macchiati
Member

I disagree completely.

By "category" I assume what is meant is "quantity" (from SI, with a few other special cases). See https://www.unicode.org/cldr/charts/48/supplemental/unit_conversions.html

There are downsides of the mentioned approach (unit formatting to be separated by category)

  • There are many quantities: the message writer has the burden of looking up the exact quantity being formatted, as well as the unit.
  • There are many possible units that don't have an SI quantity. For example, farad-per-square-second.
  • The contents of each "chunk" that a memory-constrained implementation (like ICU4X) needs for it to minimize loading may well not be aligned with quantities — so best left to that implementation.
  • For any particular unit being formatted, something like a quantity is fairly straightforward to look up, with a small amount of data.

So I don't think it is at all justified to jump through hoops by requiring categories.

Note: the heading at the top of the chart needs some tweaks. For example, it doesn't mention beaufort, which has a more complex conversion than just factor & offset.

@sffc
Member

sffc commented Jun 9, 2025

The current design of :unit is not implementable with ICU4X's data design for reasons laid out in #1006. We did not take too close of a look at :unit when writing that doc because :unit was not marked as being required, but it suffers from many of the same problems as, for example, u:locale. As a result, requiring "part" of :unit for percent formatting is not feasible.

@eemeli's suggestion of :unit:percent or :unit:length would mitigate this problem.

There are many quantities, but I would be happy enough splitting out the most important ones and throwing everything else into :unit:other or something.

@macchiati
Member

> The current design of :unit is not implementable with ICU4X's data design for reasons laid out in #1006.

I looked at #1006 and didn't find a discussion of unit. I am strongly opposed to requiring quantities with unit ids.

What might work for :unit is to require only a small subset of unit ids to be supported. Then ICU4X and other memory-limited implementations could choose to support only the required set. But that is really a separate issue.

aphillips and others added 2 commits June 9, 2025 06:38
Co-authored-by: Tim Chevalier <tjc@igalia.com>
Co-authored-by: Tim Chevalier <tjc@igalia.com>
@sffc
Member

sffc commented Jun 9, 2025

> > The current design of :unit is not implementable with ICU4X's data design for reasons laid out in #1006.
>
> I looked at #1006 and didn't find a discussion of unit.

We did not take too close of a look at :unit when writing that doc because :unit was not marked as being required, but it suffers from many of the same problems as, for example, u:locale.

@sffc
Member

sffc commented Jun 9, 2025

I'm spinning off the :unit discussion into #1079

Collaborator

@eemeli eemeli left a comment

As discussed on today's call, I've moved the proposed solution to be one of the alternatives under consideration.

The intent with this change is to allow us to land this PR, and to iterate further from there. As discussed today, adding a separate :percent function might be the least worst option here, and it would match what we're doing with unit and currency formatting.

Member

@sffc sffc left a comment

I like landing this with all the alternatives listed as alternatives and not taking a position on which one we want to pick.

Co-authored-by: Shane F. Carr <sffc@google.com>
@eemeli eemeli requested a review from sffc July 15, 2025 07:55
9 participants