Use keywords in syntax #287

Merged
merged 2 commits into from
Jun 28, 2022

eemeli
Collaborator

@eemeli eemeli commented Jun 22, 2022

This proposal goes all-in on considering MF2 syntax as "code", and adds three keywords. As something of a side effect, this requires dropping the Plain construction.

let

Used to define the value of a local variable:

let $countInt = {$count :number maximumFractionDigits=0}

The syntax here is the same as previously in the preamble, with the addition of the let keyword. The keyword is not strictly necessary, but it makes it easier to see that an assignment is taking place.

match

Defines the selectors for a multivariant message:

match {$countInt} {$user :gender}

Using match aligns with the pattern-matching syntaxes of other languages, which map more closely than switch to what we're doing, in particular when selecting with more than one selector. Using select here wouldn't really work semantically.

when

Introduces a variant:

when 1 masc [...]
when many fem [...]
when * * [...]

Another possible alternative here would be case, but that somewhat overloads a term that we commonly use to refer to grammatical case.

Put together, these allow a message that the current syntax represents as

$countInt = {$count :number maximumFractionDigits=0}
{$user :gender}
1 masc [...]
many fem [...]
* * [...]

to instead read as:

let $countInt = {$count :number maximumFractionDigits=0}
match {$countInt} {$user :gender}
when 1 masc [...]
when many fem [...]
when * * [...]

While the latter clearly has more characters, it's much easier to figure out from it what's happening.
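
To make the roles of match and when concrete, here is a minimal Python sketch of how a set of variants could be modelled and chosen. It is an illustration only: the `Variant` class, the `select_variant` function, the placeholder pattern strings, and the simple first-match rule are all assumptions made for this example, not the selection algorithm defined by the spec. It also assumes the selectors have already been resolved to category strings such as "1", "many" or "masc".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    keys: tuple[str, ...]  # one key per selector; "*" is the catch-all key
    pattern: str           # placeholder text; the PR elides patterns as [...]


def select_variant(selector_values, variants):
    """Return the first variant whose keys all match (assumed first-match rule)."""
    for variant in variants:
        if len(variant.keys) != len(selector_values):
            raise ValueError("each variant needs one key per selector")
        # A key matches if it is the catch-all "*" or equals the selector value.
        if all(key in ("*", value) for key, value in zip(variant.keys, selector_values)):
            return variant
    raise LookupError("no variant matched; a final `* *` variant avoids this")


# Mirrors: when 1 masc [...] / when many fem [...] / when * * [...]
VARIANTS = [
    Variant(("1", "masc"), "<pattern A>"),
    Variant(("many", "fem"), "<pattern B>"),
    Variant(("*", "*"), "<pattern C>"),
]

print(select_variant(("1", "masc"), VARIANTS).pattern)    # <pattern A>
print(select_variant(("few", "masc"), VARIANTS).pattern)  # <pattern C>
```

The second call falls through to the `* *` variant, which is why a catch-all variant is listed last in the examples above.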

Dropping Plain

Because this PR makes it possible for a message to start with let or match, it becomes practically impossible to keep supporting the undelimited "plain" form of simple messages while keeping the language LL(1). Plain is therefore dropped.

Practically speaking, it should still be possible to write an efficient detector for plain vs. MF2 messages, should it be desirable in some spec or environment built on top of the MF2 spec (such as the message resource syntax):

  1. If a message starts with let, match or [, it may be MF2. If it is, in the first two cases, it will always contain a { within the first few tokens.
  2. Plain messages may not start with a [ or contain a {.
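
The two rules above could be sketched roughly as follows. This is a hedged illustration, not part of the spec: the function name `classify_message` and its three return labels are invented here, leading whitespace is assumed to be ignorable, and the check for `{` looks at the whole message rather than only "the first few tokens".

```python
import re

# Assumption: a keyword counts only when followed by a word boundary,
# so "letter" does not look like a `let` declaration.
_KEYWORD_START = re.compile(r"(let|match)\b")


def classify_message(source: str) -> str:
    """Classify a message as 'mf2-candidate', 'plain', or 'neither'."""
    text = source.lstrip()
    has_brace = "{" in text
    # Rule 1: a leading "[" or a leading keyword plus a "{" may be MF2.
    if text.startswith("["):
        return "mf2-candidate"
    if _KEYWORD_START.match(text) and has_brace:
        return "mf2-candidate"
    # Rule 2: plain messages neither start with "[" nor contain "{".
    if not has_brace:
        return "plain"
    return "neither"


print(classify_message("[Hello]"))   # mf2-candidate
print(classify_message("Hello"))     # plain
print(classify_message("let's go"))  # plain
```

A message such as `let's go` starts with the letters of a keyword but contains no `{`, so it is still classified as plain, which is the case the first rule is meant to handle.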

As one benefit, dropping Plain means that MF2 will not provide two different ways to represent the same simple message ([Hello] vs. Hello).

Never mind the delimiters

This PR uses the current [...] syntax in its examples as that's what's currently on the develop branch. It is not meant to offer any opinion on #255 and the option of using {...} instead. However, it is relevant to note that this PR does effectively make it necessary to have some delimiters for message contents, concluding the discussion in #275.

Still not Turing-complete

Given that this change makes MF2 look more like code, it's good to note that we're still at most defining a deterministic finite-state machine. Message processing does not have access to anything like a stack, and cannot modify its own state; formatting calls are independent of each other. This holds even with custom formatting functions, as long as they are required to be pure functions.

@eemeli eemeli added the syntax Issues related with syntax or ABNF label Jun 22, 2022
@eemeli eemeli requested review from stasm and mihnita June 22, 2022 12:48
1 [You have one notification.]
* [You have {$count} notifications.]
match {$count :number}
when 1 [You have one notification.]
Member

I really don't care for the use of when proposed here. I don't see what value it adds. All of the selectors are clearly part of the match statement. The when is just extra noise which actually obscures the values.

Note that I still favor delimiting the selectors and the literal. The selector delimiter functions the same way that when does: it identifies that there is a set of values in the match.

The single-line example (below, line 143) could then be like:

match {:platform} [windows] {Settings} [*] {Preferences}

Or (without delims):

match {:platform} windows {Settings} * {Preferences}

Collaborator

I see two benefits to when:

  1. It allows us to add more keywords in the future that come after the last variant, e.g. meta or attribute. Usually, programming languages solve this by putting the body of the statement inside a curly-brace-delimited block, but I think we said we didn't want to risk people forgetting the closing brace.

  2. (This one is more subjective) It creates a visual correspondence between the match and each variant, in particular in case of multiline messages.

Collaborator

Note that I still favor delimiting the selectors and the literal. The selector delimiter functions the same way that when does: it identifies that there is a set of values in the match.

Ack here as well. I'm not opposed to delimiting variant keys, but it's true that I'm still hoping that we can use square brackets [...] for delimiting patterns (#255).

Technically, I think just relying on order here is good enough: match <selector list> <variant list> <pattern value> <variant list> <pattern value> is OK from the parser's point of view, unless we anticipate some other productions to follow all variants in the future (see point 1 in my comment just above).
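
As a rough illustration of why order alone can be enough, the Python sketch below parses the delimiter-less single-line form `match {:platform} windows {Settings} * {Preferences}` without any when keyword. The `parse_match` function and its token regex are hypothetical and heavily simplified: they assume patterns are wrapped in `{...}` with no nested braces, and they are not the grammar proposed in this PR.

```python
import re

# A token is either a brace-delimited group (no nesting) or a bare word.
TOKEN = re.compile(r"\{[^{}]*\}|[^\s{}]+")


def parse_match(source: str):
    """Split a match statement into selectors and (keys, pattern) variants."""
    tokens = TOKEN.findall(source)
    if not tokens or tokens[0] != "match":
        raise ValueError("expected the message to start with 'match'")
    # Selector list: every brace group directly after `match`.
    index = 1
    selectors = []
    while index < len(tokens) and tokens[index].startswith("{"):
        selectors.append(tokens[index])
        index += 1
    # Variants: bare keys accumulate until a brace group closes the variant.
    variants, keys = [], []
    for token in tokens[index:]:
        if token.startswith("{"):
            variants.append((tuple(keys), token[1:-1]))
            keys = []
        else:
            keys.append(token)
    if keys:
        raise ValueError("variant keys without a following pattern")
    return selectors, variants


print(parse_match("match {:platform} windows {Settings} * {Preferences}"))
```

Anything that followed the last variant would need its own leading keyword or delimiter to be distinguishable from another list of keys, which is the concern raised in point 1 above.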

1. The syntax should define as few special characters and sigils as possible.

## Overview & Examples

### Simple Messages

A simple message without any variables does not need any syntax:
All messages, including simple ones, need `[…]` delimiters:
Member

I still think {} brackets are better than [] ones because of the need to quote inside the literal.

Collaborator

Ack. The wording here is just being consistent with the rest of the doc in its current state. I know that #255 is still very much open.

PlainChar ::= AnyChar - ('{' | '}')
PlainStart ::= PlainChar - ('[' | '$' | WhiteSpace)
PlainEnd ::= PlainChar - WhiteSpace
Declaration ::= 'let' WhiteSpace Variable '=' '{' Expression '}'
Collaborator

@stasm stasm Jun 27, 2022

let is clear enough, but I think there's an opportunity here to use a keyword that directly relates to the name that we choose for these let bindings (#248). This would be self-explanatory and easier to learn and search for. For example:

local $foo = {$count :number}   // ...and call them "locals"
alias $foo = {$count :number}   // ...and call them "aliases"
macro $foo = {$count :number}   // ...and call them "macros"
decl $foo = {$count :number}    // ...and call them "declarations"
expr $foo = {$count :number}    // ...and call them "named expressions"

Collaborator

I think let is fine, as it says "immutability" (if you come from JS, if not, then "tough luck")

From your list I also like macro and alias, as they suggest "things you declare for convenience, something shorter, that you use so as not to repeat something long again and again".

The rest (decl and expr) are too generic.

And I think that macro and alias are more familiar than let, for a lot more people than JS devs.

Collaborator

Since this PR is now closed, let's use #289 to continue this topic.

Collaborator

@mihnita mihnita left a comment

Approved, with the understanding that this is not final, and to help with the tech preview.

I still have doubts about '[' vs '{'

And I'm intrigued by the idea to use keywords.
We should see what users say.

But I think it is an improvement because it makes everything start in code mode (no Plain)

@romulocintra
Collaborator

Consensus at the MFWG plenary; waiting for @aphillips for final approval, then merge. This is independent of delimiters around variant keys and selectors; this PR is only about adding keywords.

Member

@aphillips aphillips left a comment

Based on discussion here and elsewhere I'm good to proceed with these changes.

eemeli and others added 2 commits June 28, 2022 09:31
Co-authored-by: Stanisław Małolepszy <sta@malolepszy.org>
Co-authored-by: Mihai Nita <mnita@google.com>
@eemeli eemeli merged commit cb02ef0 into unicode-org:develop Jun 28, 2022
@eemeli eemeli deleted the split-preamble branch June 28, 2022 08:20
echeran pushed a commit that referenced this pull request Sep 20, 2022