proposal: replace first-match with best-match #351

Closed
aphillips opened this issue Feb 15, 2023 · 20 comments
Labels
Agenda+ (requested for upcoming teleconference), blocker-candidate (the submitter thinks this might be a block for the next release), design (design document or issues related to design), resolve-candidate (this issue appears to have been answered or resolved, and may be closed soon)

Comments

@aphillips
Member

Is your feature request related to a problem? Please describe.
I would like to reconsider the choice of "first-match" in pattern selection by proposing that we switch to "best-match". I have prepared an explainer.

Describe the solution you'd like
See explainer.

Describe why your solution should shape the standard
This is a core feature.

Additional context or examples
See explainer.

@aphillips aphillips added design Design document or issues related to design blocker-candidate The submitter thinks this might be a block for the next release Agenda+ Requested for upcoming teleconference labels Feb 15, 2023
@eemeli
Collaborator

eemeli commented Feb 16, 2023

As far as I can tell, the problem we are trying to solve here has a specific expression in the treatment of plural matching, i.e. whether to select 1 or one where applicable. As we don't want to explicitly special-case plurals into the spec, we do need a general-case solution which somehow solves the problem at least for plurals. However, before getting into the problem space further, a question:

Are we aware of any other selectors which might have different or further needs than plurals, or can we be satisfied that any solution which is determined to be sufficient for plurals is sufficient for all selectors?

@aphillips
Member Author

Good callout.

I suspect time-based selectors will be similar to plurals, particularly when messages need to decide between floating (aka "local") and incremental (aka "instant" or possibly "zoned" values) or other places where precision varies.

Similarly some of the Relative formatters break behavior above or below certain values. Person name formatting might also match patterns based on the fields in a name plus locale factors.

And maybe measurements more generally (such as breaking from mg => g => kg or from ounces => pounds)?

@macchiati
Member

macchiati commented Feb 27, 2023

I think the three possible approaches are (in brief):

  1. Column first — filter out the best keys for the first expression (eg all the '1' values), then the best for the second, etc. This basically reproduces the MF1.0 algorithm.
  2. First match wins. That puts the entire onus on the generator of the message. It is a very fragile approach, since it is very easy to get the order wrong.
  3. Weighted match wins. Each selector assigns a value from 0 to 1 (or could be integers 0..100). For each row, multiply the values together, and pick the row with the largest value. (Multiply, so that a zero in any key makes the whole row fail).
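
A minimal Java sketch of approach 3, for illustration only. The `Selector` interface, the `Variant` class and the toy `EXACT` scorer (exact key = 1.0, `*` = 0.1, anything else = 0.0) are assumptions made up for this example; nothing here is taken from the spec or from an existing implementation.

```java
import java.util.List;

/** Hypothetical sketch of "weighted match wins"; no name here comes from the spec. */
final class WeightedMatch {
    /** Scores one key against one value, from 0.0 (no match) to 1.0 (perfect). */
    interface Selector {
        double score(Object value, String key);
    }

    /** One row of the matrix: one key per selector, plus the pattern to emit. */
    static final class Variant {
        final List<String> keys;
        final String pattern;

        Variant(List<String> keys, String pattern) {
            this.keys = keys;
            this.pattern = pattern;
        }
    }

    /** Assumed toy scorer: an exact key scores 1.0, "*" scores 0.1, anything else 0.0. */
    static final Selector EXACT = (value, key) ->
            key.equals(String.valueOf(value)) ? 1.0 : key.equals("*") ? 0.1 : 0.0;

    /** Returns the pattern of the highest-scoring row, or null if every row scores zero. */
    static String select(List<Selector> selectors, List<?> values, List<Variant> variants) {
        String best = null;
        double bestScore = 0.0;
        for (Variant variant : variants) {
            double rowScore = 1.0;
            for (int i = 0; i < selectors.size(); i++) {
                // Multiplying means a zero for any key zeroes the whole row.
                rowScore *= selectors.get(i).score(values.get(i), variant.keys.get(i));
            }
            if (rowScore > bestScore) { // strict comparison: ties keep the earlier row
                bestScore = rowScore;
                best = variant.pattern;
            }
        }
        return best;
    }
}
```

With rows `* *`, `1 *` and `1 masculine`, the inputs `(1, masculine)` select the last row regardless of the order in which the rows are written, which is the property that distinguishes this approach from first-match.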

As I said when this came up, I think that a company like mine could work around the fragility of #2, but only by having rigorous control over the source message and the translation pipeline.

  • The source message would have to either fail or be normalized to be correct (probably by using #1 or #3 to reorder). That can happen when it is picked up for translation.
  • The translated message would need to have all NxM possibilities (because we can't predict in advance which variants would collapse together). It would also be normalized for the language.
  • Once translated, it might be possible to simplify, eg if a variant message V is the same as the * * * row message, and there wouldn't be any matches in between them.

BTW, some of your examples really aren't suitable for selectors; they instead need to be internal to a formatter. For people names, there are many different input parameters, and optionally multiple patterns per input parameter. Or take measurement units. They also need to be one level down, because what needs to be supplied is a

  • source unit of the given quantity (like 3 square meters),
  • and usage (like usage=room),
  • and the locale (like Japan)
  • and (ideally) the grammatical features needed (eg dative)

Then the appropriate unit and amount of that unit are formatted. That process isn't really appropriate or feasible for the selector mechanism. (I can flesh this out more if there are any questions.)

@mihnita
Collaborator

mihnita commented Mar 3, 2023

Weighted match wins. Each selector assigns a value from 0 to 1 (or could be integers 0..100). For each row, multiply the values together, and pick the row with the largest value. (Multiply, so that a zero in any key makes the whole row fail).

+100

The only improvement (?) I would make is to use the sum of squares of the scores.
That is because it matches the natural, intuitive definition of distance in the real world.

The distance from origin to a point:

  • in 1D space is x (same as √(x²))
  • in 2D is √(x² + y²)
  • in 3D is √(x² + y² + z²)
  • in n dimensions is √(d₁² + d₂² + ... + dₙ²)

In our case there is no need to extract the root, as we compare the scores.

I think the main benefit is that it matches the natural distance, so it is not that artificial, and is probably intuitive.
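
To make the difference between the two combining rules concrete, here is a small Java sketch. It assumes higher scores are better and that a zero score for any key still rejects the whole row; both are assumptions of this example, not something stated above or taken from the pseudocode on the PR. The class and method names are hypothetical.

```java
/** Hypothetical comparison of two ways to combine per-key scores into a row score. */
final class RowScoring {
    /** Product rule: a zero for any key zeroes the row. */
    static double product(double[] scores) {
        double total = 1.0;
        for (double s : scores) {
            total *= s;
        }
        return total;
    }

    /** Sum-of-squares rule; no square root is needed because rows are only compared. */
    static double sumOfSquares(double[] scores) {
        double total = 0.0;
        for (double s : scores) {
            if (s == 0.0) {
                return 0.0; // assumption of this sketch: a failed key still rejects the row
            }
            total += s * s;
        }
        return total;
    }

    static int bestByProduct(double[][] rows) {
        return best(rows, false);
    }

    static int bestBySquares(double[][] rows) {
        return best(rows, true);
    }

    /** Index of the highest-scoring row, or -1 if every row scores zero. */
    private static int best(double[][] rows, boolean squares) {
        int bestIndex = -1;
        double bestScore = 0.0;
        for (int i = 0; i < rows.length; i++) {
            double score = squares ? sumOfSquares(rows[i]) : product(rows[i]);
            if (score > bestScore) {
                bestScore = score;
                bestIndex = i;
            }
        }
        return bestIndex;
    }
}
```

The two rules can rank rows differently: for the rows `{1.0, 0.1}` and `{0.6, 0.6}`, the product prefers the second (0.36 vs. 0.1) while the sum of squares prefers the first (1.01 vs. 0.72). Any choice between them would need to account for that.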


I posted a more detailed comment, with pseudocode, on the PR (#358): https://github.com/unicode-org/message-format-wg/pull/358/files#r1125011916

@stasm
Collaborator

stasm commented Mar 13, 2023

I think I've had a bit of a revelation which I'd like to share before I go to sleep today. (We'll see whether I still think of it as a revelation tomorrow morning...)

With first-match, it's possible to arrange variants such that the meaning of the * is "any other". Yes, it requires work and can be error-prone or even impossible in some tools, but that's how I've always read it: match anything that didn't match up to this point.

match {$count :plural}
when 1 {A thing.}
when * {Many things, definitely not a single one.}
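
The "any other" reading of * under first-match can be sketched in Java as follows. This is an illustrative sketch only: the `Variant` class is hypothetical, and the selector values are assumed to have already been resolved to plain strings.

```java
import java.util.List;

/** Hypothetical sketch of first-match selection over already-resolved string values. */
final class FirstMatch {
    static final class Variant {
        final List<String> keys;
        final String pattern;

        Variant(List<String> keys, String pattern) {
            this.keys = keys;
            this.pattern = pattern;
        }
    }

    /** "*" accepts anything; every other key must equal the value exactly. */
    static boolean keyMatches(String key, String value) {
        return key.equals("*") || key.equals(value);
    }

    /** Returns the pattern of the first variant whose keys all match, or null if none do. */
    static String select(List<String> values, List<Variant> variants) {
        for (Variant variant : variants) {
            boolean allMatch = true;
            for (int i = 0; i < values.size(); i++) {
                if (!keyMatches(variant.keys.get(i), values.get(i))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return variant.pattern; // later variants are never consulted
            }
        }
        return null;
    }
}
```

Because the loop returns at the first hit, a `*` row only ever sees values that every earlier row rejected. Moving the `*` row to the top makes it win for every value, which is the fragility discussed earlier in the thread.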

With best-match, *'s meaning is changing to "any". It's not "any other"; just "any". And I think that's what's been bugging me.

If I imagine a message with the following variant:

when one masculine masculine {{$count} message from his brother.}

Such a variant may win over 1 * * or * masculine masculine, which may be problematic if * was meant for non-masculine or non-1 values.

Thus, #322 is closely related.

@stasm
Collaborator

stasm commented Mar 13, 2023

Such a variant may win over 1 * * or * masculine masculine, which may be problematic if * was meant for non-masculine or non-1 values.

Argh, I edited this example a few times before posting and ended up getting it wrong. I meant to say:

Such a variant may win over 1 * *, which may render 1 * * unreachable. Moreover, 1 * * may also lose to * masculine masculine, which may be problematic if * in * masculine masculine was meant for non-1 values.

@aphillips
Member Author

@stasm I guess it is kind of like that. The thing is: the value * doesn't mean "any other" to the selector. It always means "any", because the selector match isn't aware of other keys--only does "foo" match (some key value). It can only mean "any other" in the overall context of the match. That is, we'd generally prefer to find some other value before we match *, but we're happy to match to *. The selector is stateless when evaluating * (or any other key value).

The "value" of a * match can have greater or lesser quality depending on the selector in question: in a plural for the en locale, * matches the other category for values > 1 better than any of the keyword values (one, two, few, many, zero) do. In pl it never matches as anything better than "any other value", IIRC, and is somewhat superfluous.

Using your example:

when one masculine masculine {{$count} message from his {$male-relation}}
when 1   *         *         {One message from {$unknown-gender} {$unknown-relation}}
when *   masculine masculine {{$count} messages from his {$male-relation}}

For English, */masculine/masculine is clearly the plural variation of one/masculine/masculine and values like 2, 3, 11, 42 for $count match that line perfectly (in masc/masc cases). The */masculine/masculine message cannot beat both other lines in en for the value 1 in the first column. In "scored" matching it might beat 1/*/* (in fact, we want it to for the value 2), but not its more specific friend one (for the value 1). In ja, though, the one rule never fires and then you might get a different value out.

However, this is a perverse message because the matching is clearly incomplete. There are many cases where you can't get a reasonable value out of it. First match doesn't work either with only that set of messages!

I think the note I have about non-plural complex match cases is helpful here. I don't know what the gender determination means, so I can't write a "perfected matrix", but what I do know is that there is some (possibly locale-affected) set of values that the selector can produce. In fact, in your example the two masculine values are not the same thing and are probably from different selectors! The first one controls the use of the adjectival pronouns "his"/"her" (and probably "their") in English. The second one has to do with the gender of the noun (with only a few exceptions, English doesn't vary grammatical form for inserted nouns and the value * would probably apply).

I'd be careful about using speculative examples (I've mostly stuck to plural in examples) because we can get wrapped up in hard-to-understand hypotheticals. If we instead describe what a selector can do, we want to make a complete test case to describe what happens and where the failures are. To evaluate if 1/*/* is bad output (or if */masc/masc is), we need to know what the "missing" values in the matrix are. And as I said above, even first match returns wrong strings without a perfected matrix.


Finally, note that a list of complex selectors produces a huge matrix (98 entries in Polish for my example--more in Arabic or Russian!). I think Elango's callout on the call is a good one. Even if for survival purposes tools and developers always write the matrix in first-match order (to help them ensure that they don't miss a case!), the chances of making a mistake grow with complexity. Having the runtime fix the matrix for you is better than the penalty of it erroring on a mal-ordered matrix.

The penalty for getting the order wrong anywhere in a long matrix of entries in first-match might be severe and my intuition says that, for any matrix (complete or incomplete) having the runtime fix the order of the matrix does not produce a different result than having had the optimal order in the first place.

I think the onus on first-match would be to show a case where a non-optimally ordered matrix has value in excess of the burden on everyone else. Are there any use cases for non-optimally ordered matrices?

@stasm
Collaborator

stasm commented Mar 14, 2023

However, this is a perverse message because the matching is clearly incomplete. There are many cases where you can't get a reasonable value out of it. First match doesn't work either with only that set of messages!

You're right, the example is incomplete. In a real-life scenario it's unlikely that this message would omit other, better variants. But I think my point still stands -- even in best-match, we cannot analyze variants completely independently, because the presence of other variants defines what * means in any of them.

I'd be careful about using speculative examples (I've mostly stuck to plural in examples) because we can get wrapped up in hard-to-understand hypotheticals.

I think it would be easier to find more realistic examples in other languages, but for the sake of the discussion we stick to English, whose grammar rules don't require a lot of complexity that we're designing for.

And as I said above, even first match returns wrong strings without a perfected matrix.

Correct, the issue of the unclear meaning of * is common to all discussed strategies. I'm not trying to defend first-match here, but I'll note that in first-match at least we know that we need the full message anyways. In best-match, one of the arguments is that it's possible to consider individual variants independently of each other. My point is that because of the * that's not actually true.

Even if for survival purposes tools and developers always write the matrix in first-match order (to help them ensure that they don't miss a case!), the chances of making a mistake grow with complexity. Having the runtime fix the matrix for you is better than the penalty of it erroring on a mal-ordered matrix.

Ack, this is one of the benefits of best-match, and I agree with it. I'm trying to make best-match even more robust by fixing *.


The point about being able to consider individual variants independently of each other has been one of the most compelling arguments in favor of best-match for me. My mental model is that it would make it possible to just look at a single variant at a time and know how to translate it:

when one masculine masculine {{$count} message from his {$male-relation}}

This would be great for l10n workflows which only send a subset of translation units for localization, great for tooling and QA because the translator only needs to consider one masculine masculine, and great for leveraging and machine translation.

However, am I right to say that * makes this argument moot? Consider the following message:

match {$CURRENT_SYSTEM :equals}
when windows {Settings}
when * {Preferences}

If we look at each variant separately, we'll end up with:

when * {Preferences}

...which isn't enough of a "spec" for translators to know how to translate it. Only by looking at the other variant in the message can they know what * means.


This discussion makes me consider two other topics:

  • Complete key sets should be preferred over those with *. In English plurals, having two variants for one and other is better than one and *. That's #322 (Handling `other` vs. `*` for plural selectors) and is probably best discussed there.
  • We may need a new key syntax for alternatives, e.g. when (macos or gnome) {Preferences} or when [macos gnome] {Preferences}. This way we can satisfy exhaustiveness requirement without having to duplicate variants and without using *.

@aphillips
Member Author

@stasm asked:

However, am I right to say that * makes this argument moot?

I don't think so. I do think that the match statement is needed to understand what the * means--or, in fact, to understand what any key means. It's also true that you can't necessarily "see" what the other values were and that creating a translation might sometimes influence what the translator does.

But then, there are lots of message format patterns (or just plain strings) that lack sufficient context for a good translation by themselves.

Does it help if you mentally replace the term * with the word other?

We may need a new key syntax for alternatives, e.g. when (macos or gnome) {Preferences} or when [macos gnome] {Preferences}. This way we can satisfy exhaustiveness requirement without having to duplicate variants and without using *.

I think this would be scary. The source locale (i.e. the developer) might lump together items that need to be separate in another locale.


There is another reason why a *-like term is necessary or helpful. When an enumeration is expanded later or when the output for a given locale changes (Elango points out the change in plural rules for Romance languages in a recent CLDR release), it breaks any messages that do not have a default value to fall back on. Since the runtime is updated with the OS or browser, not necessarily the application, this can break messages installed in the field. Having * (or other or whatever) prevents this from being a hard failure.

@eemeli
Collaborator

eemeli commented Mar 14, 2023

However, am I right to say that * makes this argument moot?

I think so? In English this is easiest for me at least to see by starting with a degenerate case like

match {$count :number}
when * {{$count} thing(s)}

where initially the * covers both the one and other categories. Now, if a one variant is introduced and we only send that to be translated, we end up with

match {$count :number}
when one {{$count} thing}
when * {{$count} thing(s)}

and that's obviously wrong, because introducing the one has narrowed the meaning of the * to only other, so the updated message should really become

match {$count :number}
when one {{$count} thing}
when * {{$count} things}

As I mention, this is a bit of a degenerate case, but effectively something like this happens every time you introduce a new variant, independently of how the selection happens. If we're deeply aware of how selection happens, then we might be able to look at the variants and deduce which ones are potentially affected by an update. For instance, when introducing a 1 variant to our example,

match {$count :number}
when 1 {the thing}
when one {{$count} thing}
when * {{$count} things}

we might see that in many locales this narrows the meaning of one because 1 is a subset of it, but does not change the meaning of *. This would allow for leaving out the last variant from translation in some locales; it only needs to be updated in locales that don't include a one category.

@aphillips
Member Author

aphillips commented Mar 14, 2023

@eemeli I disagree with your logic somewhat?

Introducing a 1 key does not narrow the one case. It introduces a key with a better quality of match than one (in certain locales). And, in fact, when used correctly (...there is no guarantee that it will be used correctly...) the 1 case will actually say something specific to the value 1 vs. the keyword's use of a pattern string.

The existence of a 1 key does not change the meaning of the one key or the "quality" of its match. For that matter, the existence of the one rule doesn't change the meaning of the * (or other) key nor the "quality" of its match in that locale. The selector does not care or need to know if other when clauses exist and the contents of the pattern should be patterned appropriately as output for that key.

Here's a different illustration. Let's consider what I recommend for developers to write for a basic plural in the root locale:

match {$count :plural}  // not :number, btw
when 0 {You have {$count :number} things} // can be worded for zero differently
when one {You have {$count :number} thing} 
when * {You have {$count :number} things}

Now let's compile our application with that message--and no localizations--and run it in the pl-PL locale.

For the value count == 2, we generally say that the Polish plural rules "want" to match the keyword few. But there is no few key in the English (root) resource. The when 0 and when one keys return "no match" for that item. But * (here in its role as other) does match. So the message is functional.

Now we run in the ja-JP locale.

For the value count == 1, the Japanese plural rules "want" to match the keyword other (i.e. *) as Japanese does not have a grammatical plural. The resulting message is You have 1 things, which is grammatically incorrect in English, but you'll solve this with localization later. Again the one and * keys have not changed their stripes. It's the match quality that is different. * always matches in ja-JP and one never matches in that locale.
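
The pl-PL and ja-JP walk-throughs above can be reproduced with a small Java sketch. The plural rules below are hand-written for non-negative integers in three languages only (a simplification of the CLDR rules; a real implementation would use a library such as ICU), and the preference order "exact number, then plural category, then *" is an assumption of this example.

```java
import java.util.List;

/** Hypothetical sketch: which key of an English source message fires in other locales. */
final class PluralFallback {
    /** Integer-only plural categories for en, pl and ja (simplified from CLDR). */
    static String category(String lang, long n) {
        switch (lang) {
            case "en":
                return n == 1 ? "one" : "other";
            case "pl":
                if (n == 1) {
                    return "one";
                }
                long mod10 = n % 10;
                long mod100 = n % 100;
                if (mod10 >= 2 && mod10 <= 4 && !(mod100 >= 12 && mod100 <= 14)) {
                    return "few";
                }
                return "many"; // Polish uses "other" only for fractions
            default:
                return "other"; // ja: no grammatical plural
        }
    }

    /** Picks the key that fires: exact number first, then the category, then "*". */
    static String select(String lang, long n, List<String> keys) {
        String exact = Long.toString(n);
        if (keys.contains(exact)) {
            return exact;
        }
        String cat = category(lang, n);
        if (keys.contains(cat)) {
            return cat;
        }
        return keys.contains("*") ? "*" : null;
    }
}
```

With the keys `0`, `one` and `*` from the message above, `select("pl", 2, keys)` falls through to `*` because there is no few key, and `select("ja", 1, keys)` also lands on `*` because the one category never fires in Japanese.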

We do write different keys when we localize to different locales to handle these cases. And we do introduce special cases like when 1 or when 2 to handle business-logic driven selection that overrides and comes before the plural rule (when 1 {This is your last chance before you are locked out of the system}).

Using a different example from my not-plural examples. I can write a case like:

match {$item :match_category}
when computers {{$item} is on sale in computers}
when electronics {{$item} is on sale in electronics}
when * {{$item} is on sale}

... where computers is a sub-category of electronics which, in a way, is a subcategory of *. All three when clauses match a new laptop, whereas only electronics and * match a new TV. And the laptop might even match a further subcategory of laptop computers. A product manager can come along at any time and add a new when laptop {{$item} is on sale in laptops} message.

What I'm trying to say is: the when clause does not do the matching. The selector does. And the selector cannot see other when clauses or even other selectors in the same tuple. It can only say: given $item == (a laptop) and key == electronics do you match (or "how well do you match")?
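
A stateless selector of this kind might look like the following Java sketch. The category tree, the halving of the score per level and the 0.1 score for * are all assumptions invented for this example. The point is only that `score` sees a single value and a single key and never sees the other when clauses.

```java
import java.util.Map;

/** Hypothetical stateless category selector: scores one (value, key) pair at a time. */
final class CategoryScore {
    /** Assumed category tree, child to parent. */
    static final Map<String, String> PARENT = Map.of(
            "laptops", "computers",
            "computers", "electronics",
            "televisions", "electronics");

    /** Returns how well the item's category matches the key, from 0.0 to 1.0. */
    static double score(String itemCategory, String key) {
        if (key.equals("*")) {
            return 0.1; // assumed: "any" always matches, but only weakly
        }
        double quality = 1.0;
        // Walk up the tree; each step away from the item's own category halves the quality.
        for (String c = itemCategory; c != null; c = PARENT.get(c)) {
            if (c.equals(key)) {
                return quality;
            }
            quality /= 2;
        }
        return 0.0; // the key is not an ancestor of this category
    }
}
```

For a laptop, the keys laptops, computers, electronics and * score 1.0, 0.5, 0.25 and 0.1 respectively; for a TV, computers scores 0.0 while electronics and * still match. In a deeper tree the halving would eventually drop below the score of *, so a real scoring scheme would need a floor.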

IOW, @stasm is correct that * means any. The interpretation that it meant any other is only a post facto interpretation of the whole message. And, as we've seen in a few places, sometimes the quality of * as a match is higher than just meaning any. That is, sometimes it is the Best Match 😉...

@eemeli
Collaborator

eemeli commented Mar 15, 2023

@aphillips:
The existence of a 1 key does not change the meaning of the one key or the "quality" of its match. For that matter, the existence of the one rule doesn't change the meaning of the * (or other) key nor the "quality" of its match in that locale. The selector does not care or need to know if other when clauses exist and the contents of the pattern should be patterned appropriately as output for that key.

Here's a different illustration. Let's consider what I recommend for developers to write for a basic plural in the root locale:

match {$count :plural}  // not :number, btw
when 0 {You have {$count :number} things} // can be worded for zero differently
when one {You have {$count :number} thing} 
when * {You have {$count :number} things}

Isn't this self-contradictory? If the * variant were to "not care or need to know if other when clauses exist", then its contents could not make the assumption that the plural category of {$count :number} is other, as is done in your example.

@stasm
Collaborator

stasm commented Mar 15, 2023

For the value count == 2, we generally say that the Polish plural rules "want" to match the keyword few. But there is no few key in the English (root) resource.

(This might be a topic for a different discussion.) While I understand what you're saying here, @aphillips, I also question that in this example we should consider calling Polish formatters in English messages. If there's no translation and the UI falls back to the source language, messages should use the source language's formatters and matchers.

@stasm
Collaborator

stasm commented Mar 15, 2023

Isn't this self-contradictory? If the * variant were to "not care or need to know if other when clauses exist", then its contents could not make the assumption that the plural category of {$count :number} is other, as is done in your example.

I agree with @eemeli here. If * means "any", that message should actually read as follows, because * will match, well, any value of $count.

match {$count :plural}  // not :number, btw
when 0 {You have {$count :number} things} // can be worded for zero differently
when one {You have {$count :number} thing} 
when other {You have {$count :number} things}
when * {You have {$count :number} thing(s)}

@stasm
Collaborator

stasm commented Mar 15, 2023

(And then, because we should be able to detect that one and other satisfy the exhaustiveness criterion for English, it should also be possible to drop the * case entirely. Which brings us back to #322.)

@stasm
Collaborator

stasm commented Mar 15, 2023

There is another reason why a *-like term is necessary or helpful. When an enumeration is expanded later or when the output for a given locale changes (Elango points out the change in plural rules for Romance languages in a recent CLDR release), it breaks any messages that do not have a default value to fall back on. Since the runtime is updated with the OS or browser, not necessarily the application, this can break messages installed in the field. Having * (or other or whatever) prevents this from being a hard failure.

I know and understand that this has happened and can happen again. Are we saying that graceful handling of such changes is a hard requirement for MF2?

@stasm
Collaborator

stasm commented Mar 15, 2023

(I apologize for the volley of comments this morning.) Let me take a step back and try to rephrase the problem.

When * is used to mean "any", best-match can reliably produce good results. In the item-category example, it's correct to score * as a small positive match. The translation is correct, even if there are other variants offering better translations which should be scored higher.

match {$item :match_category}
when computers {{$item} is on sale in computers}
when electronics {{$item} is on sale in electronics}
when * {{$item} is on sale}

However, when * is used to mean "any other", best-match can produce not only suboptimal results, but plainly wrong ones, like in the count-things example:

match {$count :plural} 
when 0 {You have {$count :number} things}
when one {You have {$count :number} thing} 
when * {You have {$count :number} things}

For $count = 1, the best-match strategy will incorrectly score the * variant as "good enough". Perhaps not 100% good, but, let's say, 10% good. But that's not right. things is not the right grammatical form when the count is 1.

Of course, in this particular example, the issue is mitigated by the presence of the one variant. Scoring an incorrect variant as "10% good" is fine as long as there also exists a correct variant which gets a higher score.

I'm concerned, however, that in multi-selector messages this problem will manifest more often, and will lead to incorrect translations. That's because * will act as a filter for accepting variants which end up with higher scores from other selectors. To prevent * masculine masculine from winning for $count = 1 we must somehow enforce that one masculine masculine also exists in the same message.

How do we do it?


This problem also applies to first-match and incomplete messages, but it can be mitigated by reordering variants. Reordering effectively changes the meaning of * from "any" to "any other".

@aphillips
Member Author

@stasm
I know and understand that this has happened and can happen again. Are we saying that graceful handling of such changes is a hard requirement for MF2?

No, not really. "Graceful" is in the eye of the beholder. What I'm saying is more like: "if you have a * you won't have a hole in your match"

@stasm
(This might be a topic for a different discussion.) While I understand what you're saying here, @aphillips, I also question that in this example we should consider calling Polish formatters in English messages. If there's no translation and the UI falls back to the source language, messages should use the source language's formatters and matchers.

This isn't how I18N APIs work! 😀 Localizations often fall back (particularly during development, before the translations are available), but that doesn't mean that you want to lose the formatter or selector behavior. The solution for having English is to provide the localization, but you want the functionality to match the locale expected. Otherwise, for example, you can't test using pseudo or use features that depend on the locale.

I often write I18N demos with the equivalent of:

match {$count :plural}
when zero {You have (zero) {$count :number} things}
when one {You have (one) {$count :number} thing}
when two {You have (two) {$count :number} things}
when few {You have (few) {$count :number} things}
when many {You have (many) {$count :number} things}
when * {You have (other) {$count :number} things}

And then code like:

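// fragment: assumes an ICU-style MessageFormat whose format() accepts a Map, plus Guava's ImmutableMap
// the loop over the locales is usually replaced by a list box
for (Locale locale : Locale.getAvailableLocales()) {
   MessageFormat fmt = new MessageFormat(abovePattern, locale);
   for (int count = 0; count < 30; count++) {
       Map<String, Object> args = ImmutableMap.of("count", count);
       // print the count next to the message selected for it in this locale
       System.out.println(String.format("%d: %s", count, fmt.format(args)));
   }
}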

When one sees my zero/one/two example above, the temptation is to think that * is unreachable (that would be "any-other"). But the selector doesn't see the list--it only sees each key, one at a time. It can pick * as the match in English for values 0, 2, 3... "max int". In Japanese it is picked for all integers. In Polish it is picked for no integers. But in every locale it matches every integer... just a little bit in some and quite a bit in others.

At Amazon I had the equivalent of the above localized. You could run it on our demo portal with the portal in any language doing selection in another locale (it required a bit of work to get the right outcomes with the translators). Maybe @zbraniecki can take a screenshot of the demo if it is still there 😺.

@stasm
things is not the right grammatical form when the count is 1.

Yes, that's correct. At some point this is unavoidable. Somewhere in this thread there's an example of trying to make the following generic in English:

when * {You have {$count :number} thing(s)}

This string is impossible to translate into many languages for the same reason we tell developers to use a plural format in the first place. You can't fix (e.g.) Polish orthography to be generic in the same way you can English.

Similarly, this is a bad source string:

when one {You have one item}

... because, while that is correct English, that is making the assumption that the locale that selected this message was an English locale. These are the same problem. The solution to the count=1/You have 1 things problem is to provide e.g. a Japanese (or whatever) translation already! 😉

Anyway, @stasm goes on to note:

I'm concerned, however, that in multi-selector messages this problem will manifest more often, and will lead to incorrect translations. That's because * will act as a filter for accepting variants which end up with higher scores from other selectors. To prevent * masculine masculine from winning for $count = 1 we must somehow enforce that one masculine masculine also exists in the same message.

How do we do it?

Yes, absolutely the problem will manifest more often with a matrix selector. As noted, some of these matrices will be hairy (98 entries in Polish for the three plural selector example with only one special case).

I don't agree that * acts as a filter so much as it makes some messages "candidates" that would be eliminated in the "any other" interpretation. This can be avoided as a problem by ensuring that you have the complete matrix for the target locale in the localized resource. If one doesn't complete the matrix, one will definitely have some ungrammatical messages leak out. This is not a bug in MessageFormat (it is a bug in one's message).

We also cannot prevent developers from writing the bad "You have one item" message. We can only educate them.

@aphillips aphillips added the resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. label Jun 16, 2023
@aphillips
Member Author

Closing per 2023-06-19 telecon discussion


5 participants