Define @attributes on expressions #450

eemeli · 2023-08-14T21:43:08Z

Fixes #426

As discussed during today's call, we have use cases for attributes that pertain to an expression, rather than acting as function options. The examples presented in #426 refer just to the locale of an expression, but other valuable attributes include the translatability of an expression, and attributes such as canCopy, canDelete, canOverlap, and others defined in the XLIFF 2.1 spec. Many of these do not have an impact on message formatting, but can be very important for translation work.

This PR proposes the following syntax for this, using the examples included in the PR:

A message in English including a French literal that should not be translated:
{In French, "{|bonjour| @locale=fr @translate=no}" is a greeting}
A message with a $date variable formatted with the :datetime function using the root "und" locale:
{Today is {$date :datetime weekday=long @locale=und}.}
A message with a $start variable that should in translations be kept as the first element of the message.
{{$start @can-reorder=no} is the beginning.}

Effectively, after the function options, expression attributes may be defined with the same option syntax but with @ as a prefix for each name. The following attributes are explicitly defined:

@locale: a sequence of IETF BCP 47 language tags
@translate: a literal value of either yes or no.

All other custom attributes must include a hyphen - in their name (such as @can-reorder above); everything else is ignored. Mismatching names can't be treated as a syntax error, because that would limit our later ability in a 2.1 spec version to define more attributes.

This PR also unreserves the @ character and updates the data model descriptions as necessary.

macchiati · 2023-08-14T21:58:30Z

The @can-reorder=no doesn't look very useful. Not a blocker for this PR, but it should have a better example in the future.

It isn't clear what it would mean, but I'm guessing:

in the translation,

all placeholders that appear before this placeholder appear before this placeholder in the original.
all placeholders that appear after this placeholder appear after this placeholder in the original.

So if I had some paired placeholders like the following (pseudocode, cr = can-reorder), the translator couldn't reorder the text as needed for their language

.x. {bold-start @cr=no} .y. {bold-end @cr=no} .z.  {italic-start @cr=no} .w. {italic-end @cr=no} .u.

The translator couldn't reorder the parts, eg

.X. {italic-start @cr=no} .W. {italic-end @cr=no} .U. {bold-start @cr=no} .Y. {bold-end @cr=no} .Z.

aphillips

Good start.

Be careful of whether it is @language or @Locale.

I think this is too opinionated about usage and needs to be slightly genericized.

See comments about namespacing. I think checking the interior of the name is too much work? I also think that we should explicitly define what implementations are allowed to define for themselves.

One normative thing is that unrecognized items are ignored, but does that mean that they are not passed downstream? Are they deleted from the data model, for example? We know how to process them, just not what they mean...

aphillips · 2023-08-14T21:49:33Z

spec/formatting.md

@@ -105,6 +105,9 @@ one of the following is used to resolve the value of the _expression_:
 - Else, the _expression_ has a _reserved_ _annotation_,
  an Unsupported Expression error is emitted and a fallback value is used as its value.

+_Expression attribute resolution_ defines the handling of _expression attributes_
+that have an impact on _message_ formatting.


I think this part of the sentence doesn't belong. The attributes need to be resolved if they are present. Whether they have an impact on formatting is up to the functions or processing involved in the formatting itself. Also, it should be noted that the attributes almost certainly "pass down" to parts during formatting (so that the caller can access them, if needed).

I'm not so sure about attributes getting passed as such to formatting or selection functions. If that makes sense, then why not pass them as function options?

Locale is special because

It's already handled as special by existing formatters, and

it may get a separate field in the formatted-parts output.

I can see how e.g. timezone could be handled similarly specially by setting up some aspect of the environment before a formatter call rather than considering it as a function option, but for that I'd expect it to be a thing that the implementation does, and which affects the formatter only indirectly.

The attributes need to be resolved if they are present.

That's not really in line with what we're saying about other ignored code. For example, we allow an implementation to ignore errors in expressions that aren't needed for the current formatting. Similarly, we should ignore errors in attributes that don't impact the formatting.

The attributes need to be resolved if they are present.

That's not really in line with what we're saying about other ignored code. For example, we allow an implementation to ignore errors in expressions that aren't needed for the current formatting. Similarly, we should ignore errors in attributes that don't impact the formatting.

The attributes need to be parsed, even if they are not recognized. The parser does not necessarily know what is supported downstream or in an implementation.

Similarly, we should ignore errors in attributes that don't impact the formatting.

Depends on what "error" means. If there is a syntax error, we can't ignore that, no? (some of this is about what "we" means here: It is up to the function to decide if an attribute is "bad" or just to ignore it. "Our" job is to get the options and variables to the function.)

spec/formatting.md

aphillips · 2023-08-14T21:56:37Z

spec/formatting.md

+     bind the _name_ of the _option_ to the resolved value in the mapping.
+   - Otherwise, do not bind the _name_ of the _option_ to any value in the mapping.
+4. Determine the _expression_ _locale_.
+   If the _expression_ has a valid `@locale` _expression attribute_,


Define "valid" here. Is @locale=|hello world| valid? What about @locale=dead-beef (sorta BCP47-like)?

That's defined in the expression attribute resolution section.

The problem here is that "valid" isn't clearly attached to @Locale or expression attribute.

If this were written upside down, it might be clearer, e.g.

Let expressionLocale be the message locale from the formatting context.
If the expression contains a well-formed @locale expression attribute, let expressionLocale be that value.

aphillips · 2023-08-14T22:01:44Z

spec/formatting.md

+   Otherwise, look up the _message_ _locale_ from the _formatting context_.
+5. Call the function implementation with the following arguments:
+
+   - The _expression_ _locale_.


I would expose the expression attributes here also. Locale may not be the only one--I certainly have timezone in mind and other attributes may be provided by the implementation or our specification that affect the formatter.

At least for the JS Intl.DateTimeFormat, timeZone is a regular option. Should we not handle it as such?

For other attributes, there is this below:

Implementations MAY assign meaning during formatting to other expression attributes

Should the text here make a reference to that as well, or is that implicit permission sufficiently clear?

For other attributes, there is this below:

Implementations MAY assign meaning during formatting to other expression attributes

Should the text here make a reference to that as well, or is that implicit permission sufficiently clear?

You're permitting implementations to assign meaning, but here I'm suggesting that functions can assign meaning to attribute values, including those unknown to the implementation. They can only do this if they have access to the values, of course.

Why would such a value be defined in the syntax as an expression attribute rather than as a function option?

spec/syntax.md

aphillips · 2023-08-14T22:38:17Z

spec/syntax.md

+To ensure future extensibility of this specification,
+all other _expression attributes_ with a _name_ not containing `-` MUST be ignored.


How about just allowing unrecognized attributes to be ignored?

Because then we could not later assign a meaning to them without potentially breaking user code.

Reserving a namespace is a separate thing from ignoring unrecognized values. These can work in tandem. An MF2.0 implementation would ignore the unrecognized value @foo that MF2.1 unreserves, as well as @_bar that some other proprietary implementation is using.

Alternatively (and as I suggested elsewhere), as long as the values are syntactically correct, the implementation can just parse the values without imputing meaning to them and pass them. It only has to apply validation rules to attributes it "knows" (and do we define an error for that???)

Consider the following hypothetical, in a world where we allow implementations to assign their own meaning to any attributes not explicitly defined in the spec like @locale and @translate:

Let's say that an implementation defines @fallback to provide a value that should be used for a placeholder, should its formatting fail for whatever reason. This doesn't really make sense as a function option, because it's covering the behaviour of what happens if the function throws an error. Other implementations could just ignore this attribute, and use the spec-defined fallback representation.

Then let's say that in a 2.1 version of the MF spec we'd like to define @fallback to mean a fallback locale to use, if the function doesn't support the message locale, or the one set by @locale. This hypothetical attribute would have the same semantics as @locale, but it would append values to the language priority list rather than replacing it.

But in this world we could not do that without potentially breaking userspace, because an implementation as defined above could exist, along with a pre-existing body of messages using @fallback with a completely different meaning. In this case, the meanings of "fallback" are quite different, but even should they align, it's very likely that the detailed implementation and requirements could differ for the attributes. In other words, we would not be able to define any future attributes in the spec without a danger of breaking userspace.

spec/syntax.md

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

aphillips · 2023-08-15T14:15:57Z

spec/data-model/README.md

+The `attributes` of an `Expression` primarily represent translator hints,
+but could include ones such as `locale` that do have an impact on _message_ formatting.


I remarked on this before: this is too opinionated. Just because you are thinking of translator hints at this moment does not make them them the sole or even primary reason for this feature.

I suspect we should take a moment to agree on requirements/design before adding a feature like this: we should work as efficiently as possible, but not headlong.

aphillips · 2023-08-15T14:21:34Z

spec/formatting.md

+     bind the _name_ of the _option_ to the resolved value in the mapping.
+   - Otherwise, do not bind the _name_ of the _option_ to any value in the mapping.
+4. Determine the _expression_ _locale_.
+   If the _expression_ has a valid `@locale` _expression attribute_,


The problem here is that "valid" isn't clearly attached to @Locale or expression attribute.

If this were written upside down, it might be clearer, e.g.

Let expressionLocale be the message locale from the formatting context.
If the expression contains a well-formed @locale expression attribute, let expressionLocale be that value.

spec/formatting.md

aphillips · 2023-08-15T14:47:50Z

spec/formatting.md

+If a `@locale` has an invalid value,
+emit a Resolution error and ignore the _expression attribute_ in further processing.


We also need to be careful here. Does "invalid" include the language tag's validity? That is, are we requiring that the implementation check the tag's contents? I think that's a job for e.g. Intl.Locale.

If you made the change I suggested on the previous paragraph, some of this paragraph would become irrelevant (we'd have already said that the value was a sequence of language tags). Then this paragraph could say:

Suggested change

If a `@locale` has an invalid value,

emit a Resolution error and ignore the _expression attribute_ in further processing.

When `@locale` is assigned an empty value or the resolved value consists of anything other than a list of well-formed language tags, emit a Resolution error and omit the _expression attribute_ from further processing.

Then we probably need to add somewhere:

To resolve the value of an @locale...
If the value is empty, emit an error (I think?)
If a literal, split on comma, trim, check for well-formed tag
If a variable:
- if an array...
- else if a literal...

aphillips · 2023-08-15T14:52:42Z

spec/formatting.md

+If a `@locale` has an invalid value,
+emit a Resolution error and ignore the _expression attribute_ in further processing.
+
+Implementations MAY ignore all other _expression attributes_ during formatting.


I suspect this is wrong. Implementations should not be required to support attributes that they don't recognize or impute meaning. So they are "ignoring" them from that perspective. But they have already parsed them. If I implement @_foo in my functions and impute meaning to whatever I assign to @_foo at the expression level, then I'd like the implementation I'm running on to pass the value to me (pretty please), even though the impl doesn't know what it means. Otherwise "private use" expression attributes can't work (as anything other than translator instructions)

spec/syntax.md

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

mihnita · 2023-08-21T16:38:51Z

{{$start @can-reorder=no} is the beginning.}

This is not the intended XLIFF usage.
The use case for reorder is not allowing this kind of translation:

src: ...{ph1} .... {ph2}....
trg: ...{ph2} .... {ph1}....

The reason this makes sense in XLIFF is that it is an interchange format.
It is used to extract strings from other formats and represent them in a unified manner.
So "no reorder" would be used if the string in the original format was something like ".... %d ... %s ...".
(yes, bad i18n, but xliff should be able to represent that)

But I don't think it has a use case for MF2.

eemeli · 2023-08-21T17:15:51Z

Based on discussion during today's call, I'll be submitting a design document that this PR may be solving.

eemeli · 2024-01-12T20:03:02Z

Closing in favour of #592.

Define @attributes on expressions

8a1e3ac

eemeli added the syntax Issues related with syntax or ABNF label Aug 14, 2023

eemeli requested review from aphillips, stasm and mihnita August 14, 2023 21:43

aphillips requested changes Aug 14, 2023

View reviewed changes

aphillips added the formatting Issue pertains to the formatting section of the spec label Aug 14, 2023

eemeli commented Aug 15, 2023

View reviewed changes

spec/syntax.md Outdated Show resolved Hide resolved

eemeli commented Aug 15, 2023

View reviewed changes

spec/syntax.md Outdated Show resolved Hide resolved

Apply suggestions from code review

6f0a5fc

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

eemeli requested a review from aphillips August 15, 2023 01:22

This was referenced Aug 15, 2023

Constraints on @locale values #452

Closed

What is the namespace for expression attributes ? #453

Closed

aphillips requested changes Aug 15, 2023

View reviewed changes

Apply suggestions from code review

6e1e41b

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

aphillips mentioned this pull request Aug 19, 2023

Design Principle: Computational vs. Manual #60

Closed

eemeli marked this pull request as draft August 21, 2023 17:14

eemeli mentioned this pull request Aug 27, 2023

Add design doc for expression attributes #458

Merged

eemeli mentioned this pull request Oct 24, 2023

Make option values optional, defaulting to true #502

Closed

Merge branch 'main' into expression-attributes

9d0074f

aphillips mentioned this pull request Jan 4, 2024

Question about grammatical case #578

Open

stasm mentioned this pull request Jan 8, 2024

Allow options on closing markup placeholders #582

Merged

eemeli mentioned this pull request Jan 12, 2024

Add @attributes as reserved #592

Merged

eemeli closed this Jan 12, 2024

eemeli deleted the expression-attributes branch January 12, 2024 20:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Define @attributes on expressions #450

Define @attributes on expressions #450

eemeli commented Aug 14, 2023 •

edited

Loading

macchiati commented Aug 14, 2023

aphillips left a comment

aphillips Aug 14, 2023 •

edited

Loading

eemeli Aug 15, 2023

aphillips Aug 21, 2023

aphillips Aug 14, 2023

eemeli Aug 15, 2023

aphillips Aug 15, 2023

aphillips Aug 14, 2023

eemeli Aug 15, 2023

aphillips Aug 15, 2023 •

edited

Loading

eemeli Aug 15, 2023

aphillips Aug 14, 2023

eemeli Aug 15, 2023

aphillips Aug 15, 2023

eemeli Aug 15, 2023

aphillips Aug 15, 2023

aphillips Aug 15, 2023

aphillips Aug 15, 2023

aphillips Aug 15, 2023 •

edited

Loading

mihnita commented Aug 21, 2023

eemeli commented Aug 21, 2023 •

edited

Loading

eemeli commented Jan 12, 2024

		To ensure future extensibility of this specification,
		all other _expression attributes_ with a _name_ not containing `-` MUST be ignored.

		The `attributes` of an `Expression` primarily represent translator hints,
		but could include ones such as `locale` that do have an impact on _message_ formatting.

		If a `@locale` has an invalid value,
		emit a Resolution error and ignore the _expression attribute_ in further processing.

	If a `@locale` has an invalid value,
	emit a Resolution error and ignore the _expression attribute_ in further processing.
	When `@locale` is assigned an empty value or the resolved value consists of anything other than a list of well-formed language tags, emit a Resolution error and omit the _expression attribute_ from further processing.

Define @attributes on expressions #450

Define @attributes on expressions #450

Conversation

eemeli commented Aug 14, 2023 • edited Loading

macchiati commented Aug 14, 2023

aphillips left a comment

Choose a reason for hiding this comment

aphillips Aug 14, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aphillips Aug 15, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aphillips Aug 15, 2023 • edited Loading

Choose a reason for hiding this comment

mihnita commented Aug 21, 2023

eemeli commented Aug 21, 2023 • edited Loading

eemeli commented Jan 12, 2024

eemeli commented Aug 14, 2023 •

edited

Loading

aphillips Aug 14, 2023 •

edited

Loading

aphillips Aug 15, 2023 •

edited

Loading

aphillips Aug 15, 2023 •

edited

Loading

eemeli commented Aug 21, 2023 •

edited

Loading