(Design) Code Mode Introducer choice #521
@@ -0,0 +1,253 @@
# Design Proposal: Choosing a Code Mode Introducer

Status: **Proposed**

<details>
<summary>Metadata</summary>
<dl>
<dt>Contributors</dt>
<dd>@aphillips</dd>
<dt>First proposed</dt>
<dd>2023-11-10</dd>
<dt>Pull Requests</dt>
<dd>#000</dd>
</dl>
</details>

## Objective

_What is this proposal trying to achieve?_

It must be possible to reliably parse messages.

Our current syntax features unquoted patterns for simple messages
and unquoted code tokens with quoted patterns for complex messages.
Determining whether a message will have code tokens requires some
special character sequence, either part of the code itself or
prepended to the message.

Review comment on lines +25 to +27 (suggested change):
(Strictly speaking, it doesn't require that, since you could just scan for the first code token -- but that's undesirable since it requires arbitrary lookahead.)

This proposal examines the options for determining code mode.

## Background

_What context is helpful to understand this proposal?_

## Use-Cases

_What use-cases do we see? Ideally, quote concrete examples._

As a developer, I want to create messages with the minimal amount of special syntax.
I don't want to have to type additional characters that add no value.
I want the syntax to be logical and as consistent as possible.

As a translator, I don't want to have to learn special syntax to support features such as declarations.

As a user, I want my messages to be robust.

Review comment: nit: Define what "robust" means. Does it mean that the visual diff between the simple quoted pattern & a non-simple pattern is minimal? Or does it mean that there is an unambiguous 1:1 correspondence between a message (including its patterns and behavior logic) and its syntactical text representation? Or something else?

Minor edits and changes should not result in syntax errors.

As a user, I want to be able to see which messages are complex at a glance
and to parse messages into their component parts visually as easily as possible.

## Requirements

_What properties does the solution have to manifest to enable the use-cases above?_

## Constraints

_What prior decisions and existing conditions limit the possible design?_

Some of the options use a new sigil as part of the introducer.
For various reasons, `#` has been used recently as a placeholder for this sigil.
There are concerns that this character is not suitable, since it is used as a comment
introducer in a number of formats.
See for example [#520](https://github.com/unicode-org/message-format-wg/issues/520).
The actual sigil used needs to be an ASCII character in the reserved or private use
set (with syntax adjustments if we use up a private-use one).
Most of the options below have been changed to use `^`, using
Apple's experimental syntax as a model for sigil choice.

Review comment: Not sure what Apple's experimental syntax is -- can you add a link?

Reply: I got this from @grhoten's presentation at the Unicode Technical Workshop this past week. I don't have a link handy, but one will be in the offing I think.

Reply: I remember watching https://developer.apple.com/videos/play/wwdc2023/10153/ which demoed a syntax which used

An introducer sigil should be as rare as possible in normal text.
This weighs against common punctuation marks such as `&`, `%`, `!`, and `?`.

```abnf
reserved-start = "!" / "@" / "#" / "%" / "*" / "<" / ">" / "/" / "?" / "~"
private-start = "^" / "&"
```
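
As a rough illustration of the constraint above, the two ABNF rules can be written out as Python sets and used to screen a candidate sigil. The function name `classify_sigil` and the `COMMON_PUNCTUATION` set are hypothetical and exist only for this sketch; the latter simply lists the four marks called out above as common in normal text.

```python
# Sketch only: mirrors the reserved-start / private-start ABNF rules above.
RESERVED_START = set('!@#%*<>/?~')
PRIVATE_START = set('^&')

# The marks the text above singles out as common in normal text (illustrative).
COMMON_PUNCTUATION = set('&%!?')


def classify_sigil(ch: str) -> str:
    """Return a short label describing how `ch` fares as a candidate sigil."""
    if len(ch) != 1 or not ch.isascii():
        return "not a single ASCII character"
    if ch in RESERVED_START:
        kind = "reserved"
    elif ch in PRIVATE_START:
        kind = "private-use"
    else:
        return "not in the reserved or private-use sets"
    if ch in COMMON_PUNCTUATION:
        return f"{kind}, but common in normal text"
    return kind


for candidate in "^#%":
    print(candidate, "->", classify_sigil(candidate))
```

A check like this only captures the two stated constraints; it says nothing about conflicts with host formats, such as `#` acting as a comment introducer.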

## Proposed Design

_Describe the proposed solution. Consider syntax, formatting, errors, registry, tooling, interchange._

We need to choose one of the options below (or another option not yet considered).
The presentation at the Unicode Technical Workshop (UTW) did not produce any opinions.

Based on the pros and cons below, I tentatively suggest Option D as the best option.

## Alternatives Considered

_What other solutions are available?_
_How do they compare against the requirements?_
_What other properties do they have?_

The following designs are being considered:

### Option A. Use Pattern Quotes for Messages (current design)

Review comment: We could also consider slight variations of Option A which aren't exactly the current

Reply: Good callout, although it's still a lot of closing syntax

Complex messages are quoted with double curly brackets.
The closing curly brackets might be optional.

Sample pattern:
```
{{
input {$var}
match {$var}
when * {{Pattern}}
}}
```
Sample quoted pattern with no declarations or match:
```
{{{{Pattern}}}}
```

Pros:
- Uses a sigil `{` already present in the syntax
- No additional escapes
- Consistent with other parts of the syntax?

Cons:
- Somewhat verbose
- Closing portion of the syntax adds no value;
  could be a source of unintentional syntax errors
- Messages commonly end with four closing brackets (`}}}}`)

> [!NOTE] Other enclosing sequences are also an option, notably `{%...%}` (or similar).
> This does reduce the number of curly brackets in a row.

### Option B. Use a Sigil

Complex messages start with a special sigil character.

```
^input {$var}
match {$var}
when * {{Pattern}}
```
Sample quoted pattern with no declarations or match:
```
^{{Pattern}}
```

Review comment: In all options not using

Reply: We could, but then we'd be into having multiple equivalent representations?

Reply: Well, we already have two anyways: I think the question to ask first is: given that It would be another way of ensuring leading and trailing whitespace is preserved, using only the MF2 syntax, regardless of what the host format offers. Consider: `<string xml:space="preserve">" Hello "</string>` `<string>{{{{ Hello }}}}</string>`

Reply: Depending on how you look at this, it would either mean supporting for a pattern within a pattern, or adding a third syntax under the trenchcoat. I'd really rather not. Either we leave the external spaces for the surrounding syntax, or we don't. Let's not do both if we don't need to.

Pros:
- Requires minimum additional typing

Cons:
- Requires an additional sigil
- Requires an additional escape for simple pattern start
- Has no other purpose in the syntax
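
To make the escaping cost concrete, here is a minimal sketch of how a single leading sigil could decide code mode, and of what happens to a simple pattern that happens to start with that sigil. The backslash escape and both function names are assumptions made for illustration; this proposal does not define the escape syntax.

```python
SIGIL = "^"  # placeholder sigil used in the Option B samples


def is_complex(message: str) -> bool:
    """Under Option B, treat a message as complex iff it starts with the sigil."""
    return message.startswith(SIGIL)


def escape_simple_pattern(text: str) -> str:
    """Escape a leading sigil so a simple pattern is not read as code.

    The backslash escape is hypothetical, not part of the proposal.
    """
    if text.startswith(SIGIL):
        return "\\" + text
    return text


print(is_complex("^{{Pattern}}"))            # True
print(is_complex("Hello, world"))            # False
print(escape_simple_pattern("^_^ Welcome"))  # \^_^ Welcome
```

Only the first character needs inspecting, which is why this option needs so little typing, but every simple pattern that begins with the sigil pays for it with an escape.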

### Option C. Use a Double Sigil

Like Option B, except the sigil is doubled.

```
^^input {$var}
match {$var}
when * {{Pattern}}
```
Sample quoted pattern with no declarations or match:
```
^^{{Pattern}}
```

Pros:
- Less likely to conflict with a simple pattern

Cons:
- Requires an additional sigil
- Requires an additional escape for simple pattern start
- Has no other purpose in the syntax

### Option D. Sigilized Keywords

Instead of quoting the message, this option adds a sigil to the keywords that
start statements, that is, `.input`, `.local`, and `.match`.
The keyword `when` might be considered separately.

The sigil used was changed to `.` as a result of the 2023-11-13 teleconference
discussion of sigils. Others considered were `~`, `@`, `&`, and `%`.
Originally this was `#` for similarity to `#define` (etc.) in other environments.

```
.input {$var}
.local $foo = {$bar}
.match {$var}
when * {{Pattern}}
```
Sample quoted pattern with no declarations or match:
```
{{Pattern}}
```

Pros:
- Sigil is part of the keyword, not something separate; note that the
  need for escaping is reduced by attaching the sigil to the keyword,
  since `.input` or `.local` or `.match` are unlikely to be message starters
- Requires minimum additional typing
- Adds no characters to messages that consist of only a quoted pattern;
  that is, quoting the pattern consists only of adding the `{{`/`}}` quotes
- Possibly makes single-line messages easier to parse visually

Review comment (on "Requires minimum additional typing"): I wouldn't say "minimum" here. Option B requires the absolute minimum additional typing, since it's just adding one character per message.

Reply: Actually, Option D adds zero characters. That's actually the minimum ;-). I could say "Requires no additional typing". Also, note that this option avoids some of the iteration hazard of other options.

Cons:
- Requires an additional sigil
- Requires an additional escape for simple pattern start
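
The following sketch shows how Option D could decide between simple and complex messages by looking only at a short, bounded prefix, with no arbitrary lookahead. The function name `is_complex` is hypothetical, and the rule that a keyword must be followed by whitespace or `{` is an assumption of this sketch, not something the proposal specifies.

```python
KEYWORDS = (".input", ".local", ".match")  # sigilized keywords from Option D
QUOTE_OPEN = "{{"


def is_complex(message: str) -> bool:
    """Decide code mode by inspecting a bounded prefix of the message."""
    if message.startswith(QUOTE_OPEN):
        return True
    for keyword in KEYWORDS:
        if message.startswith(keyword):
            # Assumption: a keyword counts only when followed by
            # whitespace or an opening curly bracket.
            following = message[len(keyword):len(keyword) + 1]
            return following.isspace() or following == "{"
    return False


print(is_complex(".input {$var}\n.match {$var}\nwhen * {{Pattern}}"))  # True
print(is_complex("{{Pattern}}"))                                        # True
print(is_complex(".inputs are required"))                               # False
```

Under these assumptions, text such as `.inputs are required` stays a simple message without any escape, which is the reduced escaping pressure noted in the pros above.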

### Option E. Special Sequence

Like Option A except the sequence is closed locally (not at the end of the message).
The suggested sequence is `{^}` (originally `{#}`), but it might also be `{}` or `{{}}`.

```
{^}input {$var}
match {$var}
when * {{Pattern}}
```
Sample quoted pattern with no declarations or match:
```
{^}{{Pattern}}
```

Pros:
- Less likely to conflict with a simple pattern
- Requires no additional sigil
- Requires no additional escape

Cons:
- Has no other purpose in the syntax
- Looks like something should happen inside it
- Most additional typing

Review comment (on "Looks like something should happen inside it"): Not sure what that means?

### Option F. Preamble

In this option, all declarations are placed in a dedicated block at the beginning of the message.
The preamble is the "front-matter" of the message, containing the message's logic.
`when` clauses are not part of the preamble.

The preamble can be delimited with `{% ... %}`:

    {%input {$var} match {$var}%} when * {{Pattern}}

Alternatively, it can be delimited with a new kind of delimiter, to make it visually distinct from placeholders and patterns:

    [[input {$var} match {$var}]] when * {{Pattern}}

We could also consider dropping the `when` keywords:

    [[input {$var} match {$var}]] * {{Pattern}}

Pros:
- Provides a clear conceptual distinction between declarations and variants.
- Visually, all code is grouped together.
- Unnests variant patterns.

Cons:
- If `[[ ... ]]` is used to delimit the preamble, it will require `[[` to be escaped at the beginning of simple patterns.
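
As a sketch of how the preamble might be separated from the variants, the function below splits a message on the `{% ... %}` delimiters shown above. The name `split_preamble` is hypothetical, and the sketch assumes that `%}` never appears inside the preamble itself.

```python
from typing import Optional, Tuple

OPEN, CLOSE = "{%", "%}"


def split_preamble(message: str) -> Tuple[Optional[str], str]:
    """Split a message into (preamble, remainder).

    Returns (None, message) when the message has no preamble.
    Assumption: the first CLOSE after OPEN ends the preamble.
    """
    if not message.startswith(OPEN):
        return None, message
    end = message.find(CLOSE, len(OPEN))
    if end == -1:
        raise ValueError("unterminated preamble")
    preamble = message[len(OPEN):end]
    remainder = message[end + len(CLOSE):]
    return preamble, remainder.lstrip()


print(split_preamble("{%input {$var} match {$var}%} when * {{Pattern}}"))
# ('input {$var} match {$var}', 'when * {{Pattern}}')
```

A message without the opening delimiter is returned unchanged as the remainder, so a simple message needs no special handling in this sketch.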

Review comment: "Reliable" is vague. Can this be framed as a property of the grammar? (For example: "The grammar for messages must be LL(k), so that it can be parsed without arbitrary lookahead.")

Reply: I will clarify. This has to do with determining whether a message is simple or complex (and only that). Any given message must produce a consistent result for this.