Placeholders: What sigil(s) indicate them? #269

gibson042 · 2022-05-16T14:28:06Z

develop syntax uses {…}, but the broader software ecosystem seems to have settled on a paradigm in which interpolated parts of text are indicated by either ${…} (JavaScript template literals and Mako templates) or {{…}} (Mustache, Jinja(2) and Angular).

Unless there is a clear pre-existing convention for single braces within the context of internationalization, I think it would be wise to conform with that external paradigm rather than diverging from it.


stasm · 2022-05-16T16:49:17Z

I wonder if these same arguments would be good reasons to avoid picking ${...} or {{...}} for MF placeholders, in order to avoid friction and conflicts with the programming languages in which MF strings will be embedded. #263 has the same motivation, but for literals.

mihnita · 2022-05-16T17:26:33Z

Almost every combination is used somewhere :-)
https://en.wikipedia.org/wiki/String_interpolation

But one might make the argument that we should do something that is slightly different than others.
Reason 1: detection. So that when one finds a string in a generic localization file (.properties, .json, .xml) there is a way to tell that this is MF2
Reason 2: avoid conflict. There is a risk that the underlying platform (python, whatever) might interpret these placeholders before they get to MF2
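
A small sketch of Reason 2 in Python, where the host language's own formatting facilities already claim both {…} and ${…}. The message strings are made up for illustration and are not MF2 syntax; the point is only that the platform can consume a placeholder before a later formatter ever sees it.

```python
from string import Template

# str.format treats single braces as its own replacement fields,
# so a {name} placeholder is consumed before any later formatter runs.
early = "Hello, {name}!".format(name="World")
print(early)  # Hello, World!

# string.Template does the same for ${...}.
also_early = Template("Hello, ${name}!").substitute(name="World")
print(also_early)  # Hello, World!

# Doubling the braces is how str.format lets a literal brace pass through.
passed_through = "Hello, {{name}}!".format()
print(passed_through)  # Hello, {name}!
```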

gibson042 · 2022-05-16T18:08:35Z

I don't think either of those reasons holds up. The friction/conflict justification applies equally to any kind of string interpolation, but that has not prevented modern templating systems from settling in large part on two common patterns (and even there, with one appearing to be approaching dominance). As for the detection justification, the signal seems too weak to be useful: unless it is absolutely explicit (e.g., a "MessageFormat2:" prefix), there will be a large number of false positives (e.g., "Lacking enclosing brackets, {$this} is not a Message.") and a moderate number of false negatives (since not every message will even have patterns and/or placeables).

My opinion is that being different from the broader ecosystem for the sake of being different is harmful rather than helpful. However, that does not preclude being different for a supportable reason such as "conforming with a clear pre-existing convention in the narrower scope of internationalization".
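
To make the detection concern concrete, here is a minimal sketch of a sigil-based heuristic. The function name looks_like_mf2 and the regular expression are hypothetical, chosen only to illustrate the false positive and false negative cases described above; no real tool is implied.

```python
import re

# Hypothetical heuristic: "looks like MF2" if the text contains {$name}.
PLACEHOLDER = re.compile(r"\{\$\w+\}")

def looks_like_mf2(text):
    return PLACEHOLDER.search(text) is not None

# False positive: ordinary prose that happens to contain the pattern.
print(looks_like_mf2("Lacking enclosing brackets, {$this} is not a Message."))  # True

# False negative: a (hypothetical) message with no placeholders at all.
print(looks_like_mf2("Save changes"))  # False
```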

eemeli · 2022-05-20T11:45:27Z

One option here might be to do something close to what Jinja does:

> There are a few kinds of delimiters. The default Jinja delimiters are configured as follows:
>
> - {% ... %} for Statements
> - {{ ... }} for Expressions to print to the template output
> - {# ... #} for Comments not included in the template output

Specifically, we could consider the start of a placeholder to always be two characters, for example: {$ ... }, {: ... }, {/ ... }, {[ ... ]}, {{ ... }}. In the current syntax, this would require two changes:

Disallow whitespace between the initial { and any subsequent sigil.
Reconsider the markup element syntax, e.g. using {+link} ... {-link} (and if MarkupEmpty is added, {+-link} or {+link-}).

With that change, a { followed by a word character would not need to be considered as syntax. In addition to any sigils that are chosen for initial use, others would need to be reserved for later expansion.
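
A rough sketch of the two-character opener idea, assuming a sigil set of $, :, + and - (the thread does not fix one). It only locates openers; it does not parse placeholder contents.

```python
import re

# Hypothetical sigil set; the proposal leaves the final choice open.
SIGILS = "$:+-"

# A placeholder starts with "{" immediately followed by a sigil.
OPENER = re.compile(r"\{[" + re.escape(SIGILS) + r"]")

def placeholder_starts(text):
    """Return the offsets where a two-character opener begins."""
    return [m.start() for m in OPENER.finditer(text)]

print(placeholder_starts("Hi {$name}, see {+link}here{-link}"))  # [3, 16, 27]

# "{x" and "{ $" are not openers under this rule.
print(placeholder_starts("a set {x, y} and { $z }"))  # []
```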

markusicu · 2022-06-07T23:56:21Z

> With that change, a { followed by a word character would not need to be considered as syntax. In addition to any sigils that are chosen for initial use, others would need to be reserved for later expansion.

I am skeptical about allowing { followed by a non-syntax character just being a literal character. I did that in ICU MessageFormat with the ASCII apostrophe, because it's the best I could do to make normal text mostly work (previously a pair of apostrophes always enclosed literal text, as a terrible kind of escaping syntax), but it still confuses developers.

> Disallow whitespace between the initial { and any subsequent sigil.

This I like. I generally favor not allowing white space in more places than necessary.

It means that attempting to use unescaped curly braces as literal text yields a fail-fast syntax error. I don't think we need to complicate the syntax beyond that.
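
The fail-fast behavior could look roughly like the following. The sigil set and the backslash escape are assumptions made for this sketch, and only the opening brace is checked.

```python
SIGILS = "$:+-"  # hypothetical sigil set

def check_braces(text):
    """Raise ValueError at the first '{' that is neither escaped nor followed by a sigil."""
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":  # assumed escape: a backslash protects the next character
            i += 2
            continue
        if ch == "{":
            nxt = text[i + 1 : i + 2]
            if nxt == "" or nxt not in SIGILS:
                raise ValueError(f"unescaped '{{' at offset {i}")
        i += 1

check_braces("Hi {$name}")       # passes silently
check_braces("a set \\{x, y}")   # passes: the brace is escaped

try:
    check_braces("a set {x, y}")
except ValueError as err:
    print(err)  # unescaped '{' at offset 6
```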

macchiati · 2022-06-08T04:32:49Z

+1


aphillips · 2022-06-08T14:10:37Z

I like @eemeli's suggestion of Jinja-like syntax. I think it's easier to have a consistent outer marker ({/}) and then use an inner marker (sigil) to indicate type. This reduces the characters that require escaping to just the curly brackets (or at least the opening bracket).

mihnita · 2022-06-09T17:47:19Z

> Disallow whitespace between the initial { and any subsequent sigil.

I think I am OK with this, but we should make sure we describe it properly (to be sure we are talking about the same thing).

For me {% and {$ and {# etc are not { + sigil.
They are complete, standalone tokens (in parser terms).

So yes, they don't allow spaces.
In the same way, a bit shift a >> 2 in C (and other languages) is not described in terms like "two greater-than signs, but disallow whitespace in between". It is one single thing, the "shift" operator.
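
In lexer terms this might be sketched as below, with each opener listed as one whole token. The token names and the token set are hypothetical.

```python
import re

# Each opener is a single token, listed whole, like ">>" in a C lexer.
TOKEN_SPEC = [
    ("OPEN_VAR", r"\{\$"),
    ("OPEN_FUNC", r"\{:"),
    ("CLOSE", r"\}"),
    ("NAME", r"\w+"),
    ("WS", r"\s+"),
    ("OTHER", r"."),
]
TOKEN = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]

print(tokenize("{$name}"))
# [('OPEN_VAR', '{$'), ('NAME', 'name'), ('CLOSE', '}')]

# With a space in between, no opener token is produced at all.
print(tokenize("{ $name}"))
# [('OTHER', '{'), ('WS', ' '), ('OTHER', '$'), ('NAME', 'name'), ('CLOSE', '}')]
```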

mihnita · 2022-06-09T17:52:11Z

And I am with Markus in being skeptical about allowing an unescaped { to be literal text when it is followed by a non-syntax character.
This seems convenient, but adds friction because it is inconsistent.
Now instead of "always escape {" the rule becomes a lot more complicated (escape { if X, but there is no need to escape if Y)
We save typing (one character) at the price of adding extra rules that we now need to understand and remember.

Same as the ; in JavaScript.
I am happy with "always use it", and I am happy with "never use it"
But it should not be "most of the time don't use it, but be careful that if you don't use it in situation A, B, C then it's a problem".

eemeli added the syntax Issues related with syntax or ABNF label May 16, 2022

eemeli mentioned this issue May 20, 2022

Add self-closing MarkupEmpty element #273

Closed

eemeli changed the title ~~Placeables: What sigil(s) indicate them?~~ Placeholders: What sigil(s) indicate them? Jun 9, 2022

eemeli mentioned this issue Jun 9, 2022

Add +start and -end sigils for markup elements #283

Merged

romulocintra closed this as completed Jul 18, 2022

gibson042 mentioned this issue Feb 19, 2023

Pick a delimiter for literals other than the double quote #263

Closed

echeran mentioned this issue Feb 24, 2023

Clarify that standalone markup is permitted. #356

Closed

stasm mentioned this issue Sep 23, 2023

Add message parse mode (code vs text) design doc #474

Merged
