Currency and unit conformance #1071

Open
wants to merge 1 commit into
base: main

aphillips
Member

Fixes #1009

States more clearly that currency codes are case-insensitive. We used a squidgy "SHOULD" before in spite of the ABNF provided.

Make units either CLDR identifiers or impl-defined types (SHOULD->MUST).

This is consistent with the stability policy (no new error is specified, no valid message becomes invalid, and :unit is DRAFT)

@aphillips added the `functions` (Issue pertains to the default function set) and `normative` (Issue affects normative text in the specification) labels on Apr 14, 2025
@@ -257,8 +257,7 @@ Using this _option_ in such a case results in a _Bad Option_ error.
 The value of the _operand_'s `currency` MUST be either a string containing a
 well-formed [Unicode Currency Identifier](https://unicode.org/reports/tr35/tr35.html#UnicodeCurrencyIdentifier)
 or an implementation-defined currency type.
-Although currency codes are expected to be uppercase,
-implementations SHOULD treat them in a case-insensitive manner.
+Currency codes are case-insensitive.
Collaborator

This leaves it unclear what the resolved value is. Could we be more explicit?

Suggested change
Currency codes are case-insensitive.
The resolved `currency` value is normalized to upper-case.

Member Author

As a reminder, even ASCII letters have locale-sensitive case folding exceptions (Turkic languages).

Do we need to specify that the resolved value is uppercased? Mostly we treat the operands as immutable. We don't have to require that implementations perform the case fold operation. It's a good idea to do it to save later processing, but we can't really test it at the MF level except to say that a value with usd is as functional as one with USD.
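As a rough illustration of the point about locale-sensitive casing, the sketch below compares currency codes using an ASCII-only uppercase mapping. The helper names `ascii_upper` and `same_currency` are hypothetical and are not taken from the spec or from any implementation; this is only one way an implementation might make `usd` as functional as `USD`.

```python
import string

# Translation table that maps only ASCII a-z to A-Z; every other
# character (including non-ASCII letters) is left untouched.
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def ascii_upper(code: str) -> str:
    """Uppercase ASCII letters only, independent of any locale."""
    return code.translate(_ASCII_UPPER)


def same_currency(a: str, b: str) -> bool:
    """Compare two currency codes without regard to ASCII case."""
    return ascii_upper(a) == ascii_upper(b)
```

Unlike `str.upper()`, which applies the full Unicode mappings (it turns U+0131, dotless i, into `I`), the table above leaves anything outside `a-z` alone, so a malformed code cannot accidentally compare equal to a valid one.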

Collaborator

I think we should specify the casing, so that it's clear in a message like

.input {$n :currency currency=Usd}
{{{$n :foo:bar}}}

what currency value the :foo:bar function handler will see.

Suggested change
Currency codes are case-insensitive.
The resolved `currency` value is locale-insensitively normalized to upper-case.

Member Author

I have a variety of quibbles with the suggestion, starting with not using the defined term resolved value. Also, we allow implementation-defined (currency) types in this option also. Since the literals must be ASCII letters, we can use Unicode's casefold algorithm. Perhaps:

Suggested change
Currency codes are case-insensitive.
Currency codes are case-insensitive.
If the value of the `currency` _option_ is a _literal_, the _resolved value_ MUST be case-folded to uppercase.

Collaborator

Why limit the case-folding to literal values only? Why should the resolved value be different for $a vs $b?

.local $cc = {eur}
.input {$a :currency currency=eur}
.input {$b :currency currency=$cc}
...

Member Author

The value would have been resolved before reaching the function. So it is either resolved as a literal or as an implementation-defined type. We can't uppercase a Currency object, but we can on a literal. Note that your examples are in the MF processor space, but this directive is inside the function handler.

Collaborator

We don't currently have any place in the spec that defines specific behaviour based on whether a value originates from a literal, and I would very much prefer not adding anything like that. The restriction on :number select makes sense and is only about the immediate option value in part to avoid needing to track the origins of variable values like this.

Given that we say that

> The resolved value of a text or a literal contains the character sequence of the text or literal after any character escape has been converted to the escaped character.

The definition here ought to limit its impact based not on whether the option's resolved value came from a literal, but on whether or not it contains a character sequence.

Collaborator

Suggested change
Currency codes are case-insensitive.
Currency codes are case-insensitive.
If the _resolved value_ of the `currency` option is a character sequence, then the character sequence MUST be case-folded to uppercase.

How about something like that? This still leaves it unclear whether the message formatter is supposed to do the case-folding before calling the currency function handler, or whether the spec is saying that the currency function handler MUST do the case-folding itself. (Presumably the latter is what's intended.)
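A minimal sketch of the distinction discussed in this thread, assuming the handler receives either a character sequence or an implementation-defined object. The `Currency` class and `effective_currency_code` function are hypothetical names used for illustration only. The branch depends on what the resolved value contains, not on whether it originated from a literal or a variable.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Currency:
    """Hypothetical stand-in for an implementation-defined currency type."""
    code: str


def effective_currency_code(value: Union[str, Currency]) -> str:
    """Return the code the handler works with internally.

    A character sequence is ASCII-uppercased; an implementation-defined
    object is taken as-is, since there is nothing to case-map.
    """
    if isinstance(value, str):
        # Only pure-ASCII strings are case-mapped; anything else is
        # returned unchanged and left for validation to reject.
        return value.upper() if value.isascii() else value
    return value.code
```

Under this sketch, `currency=eur` and `currency=$cc` (with `$cc` resolving to the character sequence `eur`) reach the handler as the same string and are treated identically.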

Member

Why can't we just require that the currency is upper-case? Then we don't need to worry about the case folding stuff.

Member

Also, "case-folded to uppercase" doesn't make sense. Unicode Case Mapping defines algorithms for Uppercase, Lowercase, Titlecase, and Case Fold. You don't "case fold to uppercase".

@mihnita
Collaborator

mihnita commented Apr 28, 2025

> the resolved value MUST be case-folded to uppercase.

I don't think we should say this.

Most of the resolving work is done by the MF proper.
And at some point the function is invoked.

In order for the "MF engine" to resolve currencies by upper-casing them, the engine would need to know about the :currency function and about the rules governing its options (in this case currency).

That is not clean design, and it is unnecessary.

The :currency function should recognize the currency codes regardless of case.

We can say that the parts returned by the :currency function (when we format to parts) are normalized to uppercase.
But not "resolving" in general.
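The split described here could be sketched as follows: the engine passes the option through untouched, and only the function handler knows that currency codes are matched without regard to case. The part shape (`type`/`value` dictionaries), the `currency_parts` name, and the tiny lookup table are all assumptions made for illustration; the spec does not define a parts format.

```python
from typing import Dict, List

# Tiny illustrative table; a real handler would consult CLDR data.
_KNOWN_CURRENCIES = {"USD", "EUR", "JPY"}


def currency_parts(amount: float, options: Dict[str, str]) -> List[Dict[str, str]]:
    """Hypothetical :currency handler producing formatted parts."""
    # Lookup key only (assumes an ASCII code); `options` is not modified.
    code = options["currency"].upper()
    if code not in _KNOWN_CURRENCIES:
        raise ValueError("Bad Option: unknown currency " + options["currency"])
    return [
        {"type": "currency", "value": code},  # normalized in the output
        {"type": "literal", "value": " "},
        {"type": "number", "value": f"{amount:.2f}"},
    ]
```

With `{"currency": "eur"}` the currency part carries `EUR`, while the options mapping the engine handed over still contains `eur`.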

Collaborator

@catamorphism catamorphism Apr 28, 2025

Suggested change
Currency codes are case-insensitive.
Currency codes are case-insensitive.
If the _resolved value_ of the `currency` option is a character sequence, then the character sequence MUST be treated within the function handler as if it had been converted to uppercase. The function handler MUST reproduce the original option value in the _resolved value_ that it returns (not the case-folded version). For example, the resolved value returned by the `currency` function handler for `{$a :currency currency=eur}` MUST have a `resolvedOptions()` method that returns an option mapping including a mapping from `currency` to the character sequence `"eur"` (not `"EUR"`.)

New suggestion based on today's discussion. A bit verbose, but feel free to bikeshed.

Member Author

This is a super-verbose way of saying: currency codes are case-insensitive? 😉

If we wanted to, we might say more directly that "currency codes MUST be processed in a case-insensitive manner"

I read the discussion in the notes. If I followed that correctly, the consensus is that the resolved value of an option wants to be about as immutable as the operand and we are only specifying what implementations of :currency/:unit must do with the contents of a currency option, no?

I would do that as:

Suggested change
Currency codes are case-insensitive.
Currency codes are case-insensitive.
If the _resolved value_ of the `currency` _option_ is a character sequence,
the implementation MUST treat it as if it had been
[ASCII uppercased](https://infra.spec.whatwg.org/#ascii-uppercase).
(The _resolved value_ of the _option_ is not modified.)

We definitely do not want to specify a "resolvedOptions() method" because we have gone out of our way not to specify the specific APIs or data structures.

@mihnita noted:

> We can say that the parts returned by the :currency function (when we format to parts) are normalized to uppercase.
> But not "resolving" in general.

Note that all of this processing takes place inside the function handler and not at the MF level. Since the group agrees that the result of case mapping should not be visible externally, I agree that we should stay away from resolved value altogether. If it were visible, then we'd need to say that.

Side note: in the DSO pull request, we say the implementation can replace too-large or too-small digit size option values. Do we mean "only internally to the function" there too? Or do those surface in the resolved options on the operand?
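To make the "treated as if it had been ASCII uppercased, but not modified" wording concrete, here is a small sketch under stated assumptions. The `Resolved` tuple and `currency_handler` function are hypothetical; the spec deliberately does not define such an API or data structure, and the output format is a placeholder.

```python
from types import MappingProxyType
from typing import Mapping, NamedTuple


class Resolved(NamedTuple):
    """Hypothetical handler result: formatted text plus the options passed in."""
    text: str
    options: Mapping[str, str]


def currency_handler(amount: float, options: Mapping[str, str]) -> Resolved:
    # Internal working copy of the code, uppercased for lookups
    # (assumes the code consists of ASCII letters).
    code = options["currency"].upper()
    text = f"{code} {amount:.2f}"
    # Expose the options exactly as received, behind a read-only view.
    return Resolved(text, MappingProxyType(dict(options)))
```

Here the case mapping is visible only in the formatted text; a downstream function inspecting the options would still see `eur` if that is what the message author wrote.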

Collaborator

> We definitely do not want to specify a "resolvedOptions() method" because we have gone out of our way not to specify the specific APIs or data structures.

That's a good point, and is a reminder about a bigger issue: we need to decide whether to make the resolved-value interface normative.

It's hard for me to see how to re-word this text given that it's non-normative. "The resolved value of the option is not modified" isn't sufficient, IMO, because that's always true (option values are immutable). What we want to say is that in the data structure produced by the function handler, the value of the currency option is the same as the value that was specified in the function expression. And without committing to what that data structure looks like, I don't know how to express that.

I think this will come up with specifications of other functions, not just currency.

Member

This gets back to a fundamental issue. When we compose functions, what will users expect? Take the following example:

.local $x = {$in :func1 opt1=oval1 opt2=oval2}
.local $y = {$in :func2            opt2=oval2x opt3=oval3}

There is a tension between simplicity/predictability and capability. Sometimes making something more capable thereby reduces the simplicity and predictability. We can do that, but only if the benefit outweighs the cost.

In my mind, the simplest and most predictable behavior would be that for all functions and options, the above behaves precisely like:

.local $x = {$in :func1 opt1=oval1 opt2=oval2}
.local $y = {$in :func2 opt1=oval1 opt2=oval2x opt3=oval3}

except that :func2 must ignore the implicit opt1=oval1 if it would be invalid for :func2. Invalidity could be caused by multiple reasons, of course.

  1. opt1 is not allowed by :func2 *, or
  2. opt1=oval1 is not allowed by :func2, or
  3. opt1=oval1 would be allowed by :func2, but conflicts with one or more of the options that are explicitly set (opt2=oval2x, opt3=oval3).

* We could (optionally) allow functions to specify some implicit function/option pairs that they would not allow explicitly.

Trivially, if :func2 = :func1, then opt1 is allowed unless it conflicts (3).

So then the question becomes: what are some powerful use cases that would justify exceptions to this policy?
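The proposed default could be sketched roughly as below. The function name `effective_options` and the two predicate parameters are hypothetical, introduced only to mirror the three invalidity cases listed above; nothing here is taken from the spec or an implementation.

```python
from typing import Callable, Dict

Options = Dict[str, str]


def effective_options(
    inherited: Options,
    explicit: Options,
    allows: Callable[[str, str], bool],
    conflicts: Callable[[str, str, Options], bool],
) -> Options:
    """Merge options carried by the operand with those set on the expression.

    `allows(name, value)` covers cases 1 and 2 from the list above;
    `conflicts(name, value, explicit)` covers case 3.
    """
    merged = dict(explicit)
    for name, value in inherited.items():
        if name in explicit:
            continue  # an explicit setting always wins
        if not allows(name, value):
            continue  # cases 1 and 2: silently ignored
        if conflicts(name, value, explicit):
            continue  # case 3: clashes with an explicit option
        merged[name] = value
    return merged


# The example from the comment: :func2 sees opt1 implicitly.
inherited = {"opt1": "oval1", "opt2": "oval2"}
explicit = {"opt2": "oval2x", "opt3": "oval3"}
result = effective_options(
    inherited, explicit, lambda n, v: True, lambda n, v, e: False
)
```

With permissive predicates, `result` contains `opt1=oval1`, `opt2=oval2x`, and `opt3=oval3`, matching the expanded form of the second declaration. A predicate that rejects `opt1` would simply leave it out rather than raise an error.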

Member Author

@catamorphism I think you just did express it 😉. I think we could say in the text about function resolution that the options passed are immutable by the handler. The resolved value might contain additional options. I'm wondering whether it might also modify an option (consider DSOs in the other PR!). I could see not allowing function-local modifications to show.

@macchiati we already allow functions to ignore options they don't care about. And we allow functions to disallow combos and values. It's okay for chained functions to result in Bad Option, even mid chain. That just makes sense. But the option isn't removed, just ignored (we already say that)

(Doing this on a phone, so handicapped in my response)

@eemeli
Collaborator

eemeli commented May 2, 2025

One aspect we have not discussed yet is prior art, and how similar APIs elsewhere behave. Do ICU APIs provide for the inspection of any option-like values in intermediate formattable values? If so, do they modify their input values at all?

In JS there's

const nf = new Intl.NumberFormat(undefined, { style: 'currency', currency: 'eur' })
nf.resolvedOptions().currency === 'EUR'

where the option values are parsed, filtered, and normalized.

This seems quite relevant to me, as the authors of new messages are likely to be developers, and their presumptions about MF2 will draw on their experiences with related i18n and L10n APIs.

Labels
functions (Issue pertains to the default function set), normative (Issue affects normative text in the specification)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEEDBACK] Conformance for currency and unit should be more strict
6 participants