Add use case for `source` expression attribute #772

eemeli · 2024-04-21T15:38:55Z

While working on moz.l10n, a new Python localization library that uses the MF2 message and resource data model to represent messages from a number of different current syntaxes, I've come across at least the following use cases for expression attributes:

- In addition to supporting a limited set of HTML elements, Android String Resources may use <xliff:g> to wrap nontranslatable content. This is best represented in MF2 with a @translate=no attribute.
- Web extension messages.json files allow for named placeholders that are mapped to indexed arguments. These may include an example, which is best represented in MF2 as an @example=... attribute.
- Apple's Xcode supports localization of plural messages via .stringsdict XML files, which encode the plural variable's name as a NSStringLocalizedFormatKey value, where it appears as e.g. %#@countOfFoo@ or similar. To display only the relevant "countOfFoo" name of this variable to localizers as context, it's best to use a @source=... attribute on the selector.

The first two use cases are already documented, but the last one is not; it's added by this PR.

The overall use case of the underlying work is to make use of the MF2 data model to provide a unified representation of messages in many different syntaxes, so that e.g. validation and a UI for plural message editing can be applied to all formats, rather than needing separate parsing and handling for each.

aphillips · 2024-04-21T17:17:36Z

Android String Resources may use xliff:g to wrap nontranslatable content. This is best represented in MF2 with a @translate=no attribute.

An argument can be made that this is a job for markup, since, after all, an XLIFF processor might want to directly consume the g.

Apple's Xcode supports localization of plural messages

Side thought: we probably want to chat with the Apple, Android, and MSFT folks about adopting MF2 into some of their resource/API syntaxes.

it's best to use a @source=... attribute on the selector

I'm not sure I understand the @source annotation you're proposing. Why wouldn't the caller just assign the value to a named argument to MF2 in the setup to calling the formatter? Why does the translator need to know the original name?

Apple's doc furnishes this example of the format you're talking about:

<plist version="1.0">
    <dict>
        <key>%d home(s) found</key>
        <dict>
            <key>NSStringLocalizedFormatKey</key>
            <string>%#@homes@</string>
            <key>homes</key>
            <dict>
                <key>NSStringFormatSpecTypeKey</key>
                <string>NSStringPluralRuleType</string>
                <key>NSStringFormatValueTypeKey</key>
                <string>d</string>
                <key>zero</key>
                <string>No homes found</string>
                <key>one</key>
                <string>%d home found</string>
                <key>other</key>
                <string>%d homes found</string>
            </dict>
        </dict>
    </dict>
</plist>

Isn't this represented in MF2 as:

.input {$homes :integer}
.match {$homes}
0 {{No homes found}}
one {{{$homes} home found}}
* {{{$homes} homes found}}

The %#@homes@ is needed to bind homes to the sprintf-style positional arguments (%d in the example). Presumably MF2 already does this by name.
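
As a rough illustration of the mapping discussed above, here is a minimal Python sketch that reads the stringsdict example with the standard library's `plistlib` and prints an MF2 message. The function name `stringsdict_entry_to_mf2` is hypothetical, and the sketch assumes the format key is exactly `%#@name@` with a single plural variable. It also keeps `zero` as a key instead of deciding whether it should become the literal `0` used in the message above.

```python
import plistlib

# The stringsdict example from Apple's documentation, as quoted above.
STRINGSDICT = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>%d home(s) found</key>
    <dict>
        <key>NSStringLocalizedFormatKey</key>
        <string>%#@homes@</string>
        <key>homes</key>
        <dict>
            <key>NSStringFormatSpecTypeKey</key>
            <string>NSStringPluralRuleType</string>
            <key>NSStringFormatValueTypeKey</key>
            <string>d</string>
            <key>zero</key>
            <string>No homes found</string>
            <key>one</key>
            <string>%d home found</string>
            <key>other</key>
            <string>%d homes found</string>
        </dict>
    </dict>
</dict>
</plist>
"""

PLURAL_CATEGORIES = ("zero", "one", "two", "few", "many", "other")


def stringsdict_entry_to_mf2(entry: dict) -> str:
    """Hypothetical helper: turn one simple stringsdict entry into MF2 source."""
    format_key = entry["NSStringLocalizedFormatKey"]
    # Assumption: the whole format key is a single "%#@name@" token.
    if not (format_key.startswith("%#@") and format_key.endswith("@")):
        raise ValueError(f"unsupported format key: {format_key!r}")
    name = format_key[3:-1]
    rules = entry[name]
    # "%d" in the variants refers to the plural variable itself.
    placeholder = "%" + rules["NSStringFormatValueTypeKey"]
    lines = [f".input {{${name} :integer}}", f".match {{${name}}}"]
    for category in PLURAL_CATEGORIES:
        if category in rules:
            key = "*" if category == "other" else category
            pattern = rules[category].replace(placeholder, f"{{${name}}}")
            lines.append(f"{key} {{{{{pattern}}}}}")
    return "\n".join(lines)


entry = plistlib.loads(STRINGSDICT)["%d home(s) found"]
print(stringsdict_entry_to_mf2(entry))
```

Note that nothing in the generated message records the original `%#@homes@` string, which is the gap the proposed attribute is meant to fill.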

eemeli · 2024-04-21T21:28:27Z

Android String Resources may use xliff:g to wrap nontranslatable content. This is best represented in MF2 with a @translate=no attribute.

An argument can be made that this is a job for markup, since, after all, an XLIFF processor might want to directly consume the g.

I'm building a workflow where the source content can be parsed into an MF2 data model, modified, and then reserialised in the original format. So there isn't necessarily any XLIFF processor involved here, and even if there were, the use of <xliff:g> is completely custom in the Android format, and does not match with the "generic group placeholder" meaning that the XLIFF spec places on it. Hence representing the intent of the original syntax with an attribute, rather than modelling the input exactly.

it's best to use a @source=... attribute on the selector

I'm not sure I understand the @source annotation you're proposing. Why wouldn't the caller just assign the value to a named argument to MF2 in the setup to calling the formatter? Why does the translator need to know the original name?

In this case, there is no formatter involved in the workflow, so the source needs to be retained to allow for a later serialisation in the format that the iOS or macOS formatter will be able to process. For the translator, the name of the variable can be an informative part of the message's context, and it's much clearer when lifted out of its syntax trappings.

Isn't this represented in MF2 as:
.input {$homes :integer}
.match {$homes}
0 {{No homes found}}
one {{{$homes} home found}}
* {{{$homes} homes found}}
The %#@homes@ is needed to bind homes to the sprintf-style positional arguments (%d in the example). Presumably MF2 already does this by name.

Yes, and in the MF2 representation the %#@homes@ string is needed to reliably transform the MF2 back into the corresponding stringsdict value. Sometimes it also carries a positional indicator, and other content; it's not always a %#@ prefix and @ suffix to the variable name.
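
To make the round-trip idea concrete, here is a small Python sketch of pulling the variable name out of the format key while keeping the raw token in a @source attribute. It is an illustration only, not the moz.l10n implementation: the regular expression (an optional `N$` positional indicator between `%` and `#@`) and the names `FORMAT_VARIABLE`, `quote_literal`, and `input_declaration` are assumptions, and real stringsdict values may carry other content that this pattern does not cover.

```python
import re

# Assumption: a token looks like "%#@name@", optionally with a positional
# indicator such as "%1$#@name@". Other variations are not handled here.
FORMAT_VARIABLE = re.compile(r"%(?:(?P<position>\d+)\$)?#@(?P<name>[^@]+)@")


def quote_literal(value: str) -> str:
    """Write a value as an MF2 quoted literal, escaping backslash and pipe."""
    escaped = value.replace("\\", "\\\\").replace("|", "\\|")
    return f"|{escaped}|"


def input_declaration(format_key: str) -> str:
    """Hypothetical helper: build an .input declaration that keeps the raw token."""
    match = FORMAT_VARIABLE.search(format_key)
    if match is None:
        raise ValueError(f"no plural variable found in {format_key!r}")
    name = match.group("name")
    source = quote_literal(match.group(0))
    return ".input {$" + name + " :integer @source=" + source + "}"


print(input_declaration("%#@homes@"))
print(input_declaration("%1$#@homes@"))
```

A serialiser going back to stringsdict could then read the attribute value verbatim instead of trying to rebuild the token from the variable name.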

aphillips · 2024-04-21T23:08:11Z

For the translator, the name of the variable can be an informative part of the message's context, and it's much clearer when lifted out of its syntax trappings.

Agreed, but one could extract (and/or decorate) the name to generate the expression operand. I understand that the NSStringLocalizedFormatKey is actually a construct for enumerating what we'd call operands and aligning them with classical "placeholders". You have to parse that string in your implementation, IIUC (not having worked with it, only having glanced at the documentation).

Yes, and in the MF2 representation the %#@homes@ string is needed to reliably transform the MF2 back into the corresponding stringsdict value. Sometimes it also carries a positional indicator, and other content; it's not always a %#@ prefix and @ suffix to the variable name.

👍

So there isn't necessarily any XLIFF processor involved here, and even if there were, the use of xliff:g is completely custom in the Android format, and does not match with the "generic group placeholder" meaning that the XLIFF spec places on it. Hence representing the intent of the original syntax with an attribute, rather than modelling the input exactly.

Understood, but there is Android's processor and this does still look like markup in that context. FWIW, XLIFF elements are implemented in many different ways by different tools. So there are many dialects already.

Overall, what you're doing can obviously work. I'm just curious whether we already provide the necessary constructs.

Thought: does this suggest the need for namespaced or custom attributes? @source is fine, but maybe @moz:source would avoid conflicts with other interpretations in tooling downstream?

eemeli · 2024-04-22T06:49:53Z

Agreed, but one could extract (and/or decorate) the name to generate the expression operand. I understand that the NSStringLocalizedFormatKey is actually a construct for enumerating what we'd call operands and aligning them with classical "placeholders". You have to parse that string in your implementation, IIUC (not having worked with it, only having glanced at the documentation).

Eh, or I can just extract the relevant-to-translators bit out of it (the variable name), and leave the rest as line noise that I hide away. The "IIUC" bit that you mention is hard here, because this syntax isn't well documented, and I'm not myself 100% confident I've understood all of it.

Understood, but there is Android's processor and this does still look like markup in that context.

Yes, and in some cases like

<xliff:g><b>foo</b></xliff:g>

I do need to leave it in as markup like

{#xliff:g @translate=no}{#b}foo{/b}{/xliff:g @translate=no}

but that's less useful and less friendly to a translator or tooling than e.g. representing

<xliff:g id="user" example="Bob">%1$s</xliff:g>

as

{$user :xliff:g example=Bob @translate=no @source=|%1$s|}
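
For illustration, a minimal Python sketch of that second mapping, using only the standard library. The function names `literal` and `xliff_g_to_mf2` are hypothetical, the rule for when a literal can stay unquoted is a simplification, and only a text-only <xliff:g> with an id is handled. The nested-markup case above is deliberately left out.

```python
import re
import xml.etree.ElementTree as ET

# Namespace that Android resource files bind to the "xliff" prefix.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Simplified assumption about which values can be written unquoted in MF2.
UNQUOTED = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def literal(value: str) -> str:
    """Return an MF2 literal, quoting and escaping only when needed."""
    if UNQUOTED.fullmatch(value):
        return value
    escaped = value.replace("\\", "\\\\").replace("|", "\\|")
    return f"|{escaped}|"


def xliff_g_to_mf2(fragment: str) -> str:
    """Hypothetical helper: map a simple <xliff:g> element to an MF2 placeholder."""
    # Wrap the fragment so that the "xliff" prefix is declared for the parser.
    wrapper = ET.fromstring(f'<string xmlns:xliff="{XLIFF_NS}">{fragment}</string>')
    g = wrapper.find(f"{{{XLIFF_NS}}}g")
    if g is None or len(g) > 0 or g.get("id") is None:
        raise ValueError("only a text-only <xliff:g> with an id is handled here")
    parts = ["$" + g.get("id"), ":xliff:g"]
    # Remaining XML attributes become options on the expression.
    parts += [f"{key}={literal(value)}" for key, value in g.attrib.items() if key != "id"]
    # The wrapped text is kept verbatim so the original can be re-serialised.
    parts += ["@translate=no", "@source=" + literal(g.text or "")]
    return "{" + " ".join(parts) + "}"


print(xliff_g_to_mf2('<xliff:g id="user" example="Bob">%1$s</xliff:g>'))
```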

Thought: does this suggest the need for namespaced or custom attributes? @source is fine, but maybe @moz:source would avoid conflicts with other interpretations in tooling downstream?

That's actually a big part of why I opened this PR. If we find agreement on what a @source attribute is supposed to mean, then I don't need to use a namespaced one.

mihnita · 2024-04-22T16:43:13Z

Note that the way <g> is used in the Android files is bad.

It is meant to declare the text between <g>...</g> as non-localizable.
But in XLIFF the content between the tags is very much localizable.
The <g> is intended to be used for things like <b>, <i>, and so on.

I thought that "do not translate" is already representable in MF2 as "...{|don't translate this|}..."

eemeli added 3 commits April 21, 2024 18:18

Add use case for "source" attribute

5150a97

Add self-referential PR link

b46af77

Fix typo

e170c00

aphillips added the syntax, design, and LDML46 labels Apr 21, 2024

aphillips merged commit a037ba7 into main Apr 22, 2024
1 check passed

aphillips deleted the source-attribute branch April 22, 2024 16:41

ZL91 approved these changes May 11, 2024


eemeli mentioned this pull request May 14, 2024

[DESIGN] Add user stories / build-out of the expression attributes #792

Merged
