From 700d38b61c29a790216afe703a2d0e8789e0a39c Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Fri, 10 Nov 2023 12:24:59 -0800
Subject: [PATCH 1/8] (Design) Code Mode Introducer choice

This design doc attempts to capture the options for beautifying the code mode introducer.
---
 exploration/code-mode-introducer.md | 194 ++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)
 create mode 100644 exploration/code-mode-introducer.md

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
new file mode 100644
index 0000000000..b29952d536
--- /dev/null
+++ b/exploration/code-mode-introducer.md
@@ -0,0 +1,194 @@
+# Design Proposal: Choosing a Code Mode Introducer
+
+Status: **Proposed**
+
+Metadata
+
+- Contributors: @aphillips
+- First proposed: 2023-11-10
+- Pull Requests: #000
+
+## Objective
+
+_What is this proposal trying to achieve?_
+
+It must be possible to reliably parse messages.
+
+Our current syntax features unquoted patterns for simple messages
+and unquoted code tokens with quoted patterns for complex messages.
+Determining whether a message will have code tokens requires some
+special character sequence, either part of the code itself or
+prepended to the message.
+This proposal examines the options for determining code mode.
+
+## Background
+
+_What context is helpful to understand this proposal?_
+
+## Use-Cases
+
+_What use-cases do we see? Ideally, quote concrete examples._
+
+As a developer, I want to create messages with the minimal amount of special syntax.
+I don't want to have to type additional characters that add no value.
+I want the syntax to be logical and as consistent as possible.
+
+As a translator, I don't want to have to learn special syntax to support features such as declarations.
+
+As a user, I want my messages to be robust.
+
+As a user, I want to be able to see which messages are complex at a glance
+and to parse messages into their component parts visually as easily as possible.
+
+## Requirements
+
+_What properties does the solution have to manifest to enable the use-cases above?_
+
+## Constraints
+
+_What prior decisions and existing conditions limit the possible design?_
+
+## Proposed Design
+
+_Describe the proposed solution. Consider syntax, formatting, errors, registry, tooling, interchange._
+
+We need to choose one of these (or another option not yet considered).
+Presentation at UTW did not produce any opinions.
+
+Based on the pro/cons below, I would suggest Option D is possibly the best option?
+
+## Alternatives Considered
+
+_What other solutions are available?_
+_How do they compare against the requirements?_
+_What other properties they have?_
+
+There are the following designs being considered:
+
+### Option A. Use Pattern Quotes for Messages (current design)
+
+Complex messages are quoted with double curly brackets.
+The closing curly brackets might be optional.
+
+Sample pattern:
+```
+{{
+input {$var}
+match {$var}
+when * {{Pattern}}
+}}
+```
+Sample quoted pattern with no declarations or match:
+```
+{{{{Pattern}}}}
+```
+
+Pros:
+- Uses a sigil `{` already present in the syntax
+- No additional escapes
+- Consistent with other parts of the syntax?
+Cons:
+- Somewhat verbose
+- Closing portion of the syntax adds no value;
+  could be a source of unintentional syntax errors
+- Messages commonly end with four `}}}}`
+
+### Option B. Use a Sigil
+
+Complex messages start with a special sigil character.
+
+```
+#input {$var}
+match {$var}
+when * {{Pattern}}
+```
+Sample quoted pattern with no declarations or match:
+```
+#{{Pattern}}
+```
+
+Pros:
+- Requires minimum additional typing
+Cons:
+- Requires an additional sigil
+- Requires an additional escape for simple pattern start
+- Has no other purpose in the syntax
+
+### Option C. Use a Double Sigil
+
+Like Option B, except the sigil is doubled.
+
+```
+##input {$var}
+match {$var}
+when * {{Pattern}}
+```
+Sample quoted pattern with no declarations or match:
+```
+##{{Pattern}}
+```
+
+Pros:
+- Less likely to conflict with a simple pattern
+Cons:
+- Requires an additional sigil
+- Requires an additional escape for simple pattern start
+- Has no other purpose in the syntax
+
+### Option D. Sigilized Keywords
+
+Instead of quoting the message, adds a sigil to keywords that
+start statements, that is, `#input`, `#local` and `#match`.
+The keyword `when` might be considered separately.
+
+```
+#input {$var}
+#local $foo = {$bar}
+#match {$var}
+when * {{Pattern}}
+```
+Sample quoted pattern with no declarations or match:
+```
+{{Pattern}}
+```
+
+Pros:
+- Sigil is part of the keyword, not something separate
+- Requires minimum additional typing
+- Adds no characters to messages that consist of only a quoted pattern;
+  that is, quoting the pattern consists only of adding the `{{`/`}}` quotes
+- Maybe makes single-line messages easier to parse visually???
+Cons:
+- Requires an additional sigil
+- Requires an additional escape for simple pattern start
+
+### Option E. Special Sequence
+
+Like Option A except the sequence is closed locally (not at the end of the message).
+The suggested sequence is `{#}` but might be `{}` or `{{}}` also.
+
+```
+{#}input {$var}
+match {$var}
+when * {{Pattern}}
+```
+Sample quoted pattern with no declarations or match:
+```
+{#}{{Pattern}}
+```
+
+Pros:
+- Less likely to conflict with a simple pattern
+- Requires no additional sigil
+- Requires no additional escape
+Cons:
+- Has no other purpose in the syntax
+- Looks like something should happen inside it
+- Most additional typing
+

From 26536c4641daaad889d8db18f3ee56e89ca0b300 Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Fri, 10 Nov 2023 12:51:25 -0800
Subject: [PATCH 2/8] Update exploration/code-mode-introducer.md

Co-authored-by: Richard Gibson
---
 exploration/code-mode-introducer.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index b29952d536..85ce356d90 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -93,6 +93,7 @@ Pros:
 - Uses a sigil `{` already present in the syntax
 - No additional escapes
 - Consistent with other parts of the syntax?
+
 Cons:
 - Somewhat verbose
 - Closing portion of the syntax adds no value;

From 18d9ae83330b6b49b5d6deef4761ac3b5e558818 Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Fri, 10 Nov 2023 12:52:24 -0800
Subject: [PATCH 3/8] Fix "Cons" blank lines

---
 exploration/code-mode-introducer.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index 85ce356d90..7ac13a2af7 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -116,6 +116,7 @@ Sample quoted pattern with no declarations or match:
 
 Pros:
 - Requires minimum additional typing
+
 Cons:
 - Requires an additional sigil
 - Requires an additional escape for simple pattern start
@@ -137,6 +138,7 @@ Sample quoted pattern with no declarations or match:
 
 Pros:
 - Less likely to conflict with a simple pattern
+
 Cons:
 - Requires an additional sigil
 - Requires an additional escape for simple pattern start
@@ -165,6 +167,7 @@ Pros:
 - Adds no characters to messages that consist of only a quoted pattern;
   that is, quoting the pattern consists only of adding the `{{`/`}}` quotes
 - Maybe makes single-line messages easier to parse visually???
+
 Cons:
 - Requires an additional sigil
 - Requires an additional escape for simple pattern start
@@ -188,6 +191,7 @@ Pros:
 - Less likely to conflict with a simple pattern
 - Requires no additional sigil
 - Requires no additional escape
+
 Cons:
 - Has no other purpose in the syntax
 - Looks like something should happen inside it

From c4121386d2ea54e9e41b4be317c101f49a88faee Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Fri, 10 Nov 2023 15:59:08 -0800
Subject: [PATCH 4/8] Add notes about the sigil choice and change the exemplar sigil

---
 exploration/code-mode-introducer.md | 37 ++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index 7ac13a2af7..150bde1533 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -42,6 +42,7 @@ I want the syntax to be logical and as consistent as possible.
 As a translator, I don't want to have to learn special syntax to support features such as declarations.
 
 As a user, I want my messages to be robust.
+Minor edits and changes should not result in syntax errors.
 
 As a user, I want to be able to see which messages are complex at a glance
 and to parse messages into their component parts visually as easily as possible.
@@ -54,6 +55,24 @@ _What properties does the solution have to manifest to enable the use-cases abov
 
 _What prior decisions and existing conditions limit the possible design?_
 
+Some of the options use a new sigil as part of the introducer.
+For various reasons, `#` has been used recently as a placeholder for this sigil.
+There are concerns that this character is not suitable, since it is used as a comment
+introducer in a number of formats.
+See for example #520.
+The actual sigil used needs to be an ASCII character in the reserved or private use
+set (with syntax adjustments if we use up a private-use one).
+Most of the options below have been changed to use `^`, using
+Apple's experimental syntax as a model for sigil choice.
+
+It should be noted that an introducer sigil should be as rare as possible in normal text.
+This tends to run against common punctuation marks `&`, `%`, `!`, and `?`.
+
+```abnf
+reserved-start = "!" / "@" / "#" / "%" / "*" / "<" / ">" / "/" / "?" / "~"
+private-start = "^" / "&"
+```
+
 ## Proposed Design
 
 _Describe the proposed solution. Consider syntax, formatting, errors, registry, tooling, interchange._
@@ -105,13 +124,13 @@ Cons:
 Complex messages start with a special sigil character.
 
 ```
-#input {$var}
+^input {$var}
 match {$var}
 when * {{Pattern}}
 ```
 Sample quoted pattern with no declarations or match:
 ```
-#{{Pattern}}
+^{{Pattern}}
 ```
 
 Pros:
@@ -127,13 +146,13 @@ Cons:
 Like Option B, except the sigil is doubled.
 
 ```
-##input {$var}
+^^input {$var}
 match {$var}
 when * {{Pattern}}
 ```
 Sample quoted pattern with no declarations or match:
 ```
-##{{Pattern}}
+^^{{Pattern}}
 ```
 
 Pros:
@@ -172,19 +191,25 @@ Cons:
 - Requires an additional sigil
 - Requires an additional escape for simple pattern start
 
+> [!Note] Unlike the other options, Option D is presented with `#` as the sigil.
+> That doesn't mean that this should be the sigil.
+> However, `#` was originally chosen for similarity in declarations to `#define` and `#import`
+> in various programming languages.
+> It is kept "as-is" here.
+
 ### Option E. Special Sequence
 
 Like Option A except the sequence is closed locally (not at the end of the message).
 The suggested sequence is `{#}` but might be `{}` or `{{}}` also.
 
 ```
-{#}input {$var}
+{^}input {$var}
 match {$var}
 when * {{Pattern}}
 ```
 Sample quoted pattern with no declarations or match:
 ```
-{#}{{Pattern}}
+{^}{{Pattern}}
 ```
 
 Pros:

From 564516f197a5aa0dd0186ad1166135d465f8e69d Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Sat, 11 Nov 2023 07:32:35 -0800
Subject: [PATCH 5/8] Update exploration/code-mode-introducer.md

Co-authored-by: Tim Chevalier
---
 exploration/code-mode-introducer.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index 150bde1533..f51e131e5d 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -59,7 +59,7 @@ Some of the options use a new sigil as part of the introducer.
 For various reasons, `#` has been used recently as a placeholder for this sigil.
 There are concerns that this character is not suitable, since it is used as a comment
 introducer in a number of formats.
-See for example #520.
+See for example [#520](https://github.com/unicode-org/message-format-wg/issues/520).
 The actual sigil used needs to be an ASCII character in the reserved or private use
 set (with syntax adjustments if we use up a private-use one).
 Most of the options below have been changed to use `^`, using

From 118b4d6a947053d53349334d7ceb01ea8eca4298 Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Sat, 11 Nov 2023 10:45:43 -0800
Subject: [PATCH 6/8] Update exploration/code-mode-introducer.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Stanisław Małolepszy
---
 exploration/code-mode-introducer.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index f51e131e5d..4aff317d93 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -191,7 +191,8 @@ Cons:
 - Requires an additional sigil
 - Requires an additional escape for simple pattern start
 
-> [!Note] Unlike the other options, Option D is presented with `#` as the sigil.
+> [!NOTE]
+> Unlike the other options, Option D is presented with `#` as the sigil.
 > That doesn't mean that this should be the sigil.
 > However, `#` was originally chosen for similarity in declarations to `#define` and `#import`
 > in various programming languages.

From d2e87d8b3c82074e74ce8688dc703d2f2a267694 Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Sat, 11 Nov 2023 10:58:49 -0800
Subject: [PATCH 7/8] Update exploration/code-mode-introducer.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Stanisław Małolepszy
---
 exploration/code-mode-introducer.md | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index 4aff317d93..c41f52b86f 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -222,4 +222,30 @@ Cons:
 - Has no other purpose in the syntax
 - Looks like something should happen inside it
 - Most additional typing
-
+
+### Option F. Preamble
+
+In this option, all declarations are placed in a dedicated block at the beginning of the message.
+The preamble is the "front-matter" of the message, containing the message's logic.
+`when` clauses are not part of the preamble.
+
+The preamble can be delimited with `{% ... %}`:
+
+    {%input {$var} match {$var}%} when * {{Pattern}}
+
+Alternatively, it can be delimited with a new kind of delimiter, to make it visually distinct from placeholders and patterns:
+
+    [[input {$var} match {$var}]] when * {{Pattern}}
+
+We could also consider dropping the `when` keywords:
+
+    [[input {$var} match {$var}]] * {{Pattern}}
+
+Pros:
+- Provides a clear conceptual distinction between declarations and variants.
+- Visually, all code is grouped together.
+- Unnests variant patterns.
+
+Cons:
+- If `[[ ... ]]` is used to delimit the preamble, it will require `[[` to be escaped at the beginning of simple patterns.
+

From 4420fa1d2f5e5f51bfaafc15844bb175efaf05be Mon Sep 17 00:00:00 2001
From: Addison Phillips
Date: Mon, 13 Nov 2023 11:27:49 -0800
Subject: [PATCH 8/8] Update code-mode-introducer.md

---
 exploration/code-mode-introducer.md | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/exploration/code-mode-introducer.md b/exploration/code-mode-introducer.md
index c41f52b86f..2e09d307e5 100644
--- a/exploration/code-mode-introducer.md
+++ b/exploration/code-mode-introducer.md
@@ -119,6 +119,9 @@ Cons:
   could be a source of unintentional syntax errors
 - Messages commonly end with four `}}}}`
 
+> [!NOTE] Other enclosing sequences are also an option, notably `{%...%}` (or similar).
+> This does reduce the number of curly brackets in a row.
+
 ### Option B. Use a Sigil
 
 Complex messages start with a special sigil character.
@@ -166,13 +169,17 @@ Cons:
 ### Option D. Sigilized Keywords
 
 Instead of quoting the message, adds a sigil to keywords that
-start statements, that is, `#input`, `#local` and `#match`.
+start statements, that is, `.input`, `.local` and `.match`.
 The keyword `when` might be considered separately.
 
+The sigil used was changed to `.` as a result of the 2023-11-13 teleconference
+discussion of sigils. Others considered were `~`, `@`, `&`, and `%`.
+Originally this was `#` for similarity to `#define` (etc.) in other environments.
+
 ```
-#input {$var}
-#local $foo = {$bar}
-#match {$var}
+.input {$var}
+.local $foo = {$bar}
+.match {$var}
 when * {{Pattern}}
 ```
 Sample quoted pattern with no declarations or match:
@@ -181,7 +188,9 @@ Sample quoted pattern with no declarations or match:
 ```
 
 Pros:
-- Sigil is part of the keyword, not something separate
+- Sigil is part of the keyword, not something separate; note that the
+  need for escaping is reduced by attaching the sigil to the keyword,
+  since `.input` or `.local` or `.match` are unlikely to be message starters
 - Requires minimum additional typing
 - Adds no characters to messages that consist of only a quoted pattern;
   that is, quoting the pattern consists only of adding the `{{`/`}}` quotes
@@ -191,13 +200,6 @@ Cons:
 - Requires an additional sigil
 - Requires an additional escape for simple pattern start
 
-> [!NOTE]
-> Unlike the other options, Option D is presented with `#` as the sigil.
-> That doesn't mean that this should be the sigil.
-> However, `#` was originally chosen for similarity in declarations to `#define` and `#import`
-> in various programming languages.
-> It is kept "as-is" here.
-
 ### Option E. Special Sequence
 
 Like Option A except the sequence is closed locally (not at the end of the message).
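
Review note: as a rough illustration of what Option D (in its patch 8/8 form, with `.` as the sigil) might mean for a parser, the sketch below decides whether a message starts in code mode. It is not part of the patch series. The function name `is_complex_message`, the tolerance for leading whitespace, and the rule that a leading `{{` always opens a quoted pattern are assumptions made for illustration only; the design doc does not specify them.

```python
import re

# Assumption: statement keywords carry the `.` sigil, as in patch 8/8.
# The trailing \b keeps ordinary text such as ".inputs" from matching.
_SIGILIZED_KEYWORD = re.compile(r"\.(input|local|match)\b")


def is_complex_message(source: str) -> bool:
    """Hypothetical helper: True if `source` would start in code mode under Option D."""
    # Assumption: leading whitespace before the first token is ignored.
    text = source.lstrip()
    # Assumption: a leading `{{` opens a quoted pattern, which is a complex message.
    if text.startswith("{{"):
        return True
    # Otherwise only a sigilized keyword switches the parser into code mode.
    return _SIGILIZED_KEYWORD.match(text) is not None
```

Under these assumptions a simple pattern needs an escape only when it happens to begin with `{{` or with one of the three sigilized keywords, which is the reduced-escaping argument made in the Option D "Pros" list.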