Update the rest of the spec to match the ABNF after adding .keywords #548

Merged: 4 commits, Dec 4, 2023
42 changes: 19 additions & 23 deletions README.md
@@ -25,36 +25,32 @@ Messages can interpolate arguments formatted using _formatting functions_:

Messages can define variants which correspond to the grammatical (or other) requirements of the language:

{{
match {$count :number}
when 1 {{You have one notification.}}
when * {{You have {$count} notifications.}}
}}
.match {$count :number}
1 {{You have one notification.}}
* {{You have {$count} notifications.}}

The message syntax is also capable of expressing more complex translation, for example:

{{
local $hostName = {$host :person firstName=long}
local $guestName = {$guest :person firstName=long}
local $guestsOther = {$guestCount :number offset=1}
.local $hostName = {$host :person firstName=long}
.local $guestName = {$guest :person firstName=long}
.local $guestsOther = {$guestCount :number offset=1}

match {$host :gender} {$guestOther :number}
.match {$host :gender} {$guestsOther :number}

when female 0 {{{$hostName} does not give a party.}}
when female 1 {{{$hostName} invites {$guestName} to her party.}}
when female 2 {{{$hostName} invites {$guestName} and one other person to her party.}}
when female * {{{$hostName} invites {$guestName} and {$guestsOther} other people to her party.}}
female 0 {{{$hostName} does not give a party.}}
female 1 {{{$hostName} invites {$guestName} to her party.}}
female 2 {{{$hostName} invites {$guestName} and one other person to her party.}}
female * {{{$hostName} invites {$guestName} and {$guestsOther} other people to her party.}}

when male 0 {{{$hostName} does not give a party.}}
when male 1 {{{$hostName} invites {$guestName} to his party.}}
when male 2 {{{$hostName} invites {$guestName} and one other person to his party.}}
when male * {{{$hostName} invites {$guestName} and {$guestsOther} other people to his party.}}
male 0 {{{$hostName} does not give a party.}}
male 1 {{{$hostName} invites {$guestName} to his party.}}
male 2 {{{$hostName} invites {$guestName} and one other person to his party.}}
male * {{{$hostName} invites {$guestName} and {$guestsOther} other people to his party.}}

when * 0 {{{$hostName} does not give a party.}}
when * 1 {{{$hostName} invites {$guestName} to their party.}}
when * 2 {{{$hostName} invites {$guestName} and one other person to their party.}}
when * * {{{$hostName} invites {$guestName} and {$guestsOther} other people to their party.}}
}}
* 0 {{{$hostName} does not give a party.}}
* 1 {{{$hostName} invites {$guestName} to their party.}}
* 2 {{{$hostName} invites {$guestName} and one other person to their party.}}
* * {{{$hostName} invites {$guestName} and {$guestsOther} other people to their party.}}

See more examples and the formal definition of the grammar in [spec/syntax.md](./spec/syntax.md).

83 changes: 63 additions & 20 deletions spec/data-model/README.md
@@ -25,6 +25,10 @@ Two equivalent definitions of the data model are also provided:
A `SelectMessage` corresponds to a syntax message that includes _selectors_.
A message without _selectors_ and with a single _pattern_ is represented by a `PatternMessage`.

In the syntax,
a `PatternMessage` may be represented either as a _simple message_ or as a _complex message_,
depending on whether it has declarations and whether its `pattern` is allowed in a _simple message_.

```ts
type Message = PatternMessage | SelectMessage;

@@ -43,15 +47,47 @@ interface SelectMessage {
```

Each message _declaration_ is represented by a `Declaration`,
which connects the `name` of the _variable_
which connects the `name` of a _variable_
with its _expression_ `value`.
The `name` does not include the initial `$` of the _variable_.

The `name` of an `InputDeclaration` MUST be the same
as the `name` in the `VariableRef` of its `VariableExpression` `value`.

An `UnsupportedStatement` represents a statement not supported by the implementation.
Its `keyword` is a non-empty string name (i.e. not including the initial `.`).
If not empty, the `body` is the "raw" value (i.e. escape sequences are not processed)
starting after the keyword and up to the first _expression_,
not including leading or trailing whitespace.
The non-empty `expressions` correspond to the trailing _expressions_ of the _reserved statement_.

> **Note**
> Be aware that future versions of this specification
> might assign meaning to _reserved statement_ values.
> This would result in new interfaces being added to
> this data model.

```ts
interface Declaration {
type Declaration = InputDeclaration | LocalDeclaration | UnsupportedStatement;

interface InputDeclaration {
type: "input";
name: string;
value: VariableExpression;
}

interface LocalDeclaration {
type: "local";
name: string;
value: Expression;
}

interface UnsupportedStatement {
type: "unsupported-statement";
keyword: string;
body?: string;
expressions: Expression[];
}
```
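
The requirement that an `InputDeclaration` share its `name` with the `VariableRef` of its `value` can be checked mechanically. The following is a minimal sketch, not part of this PR: the `VariableRef` and `FunctionRef` shapes are simplified assumptions (the diff does not show them in full, and `func` is narrowed to `FunctionRef` only), and `isValidInputDeclaration` is a hypothetical helper name.

```ts
// Simplified stand-ins for interfaces not fully shown in this diff (assumptions).
interface VariableRef {
  type: "variable";
  name: string; // without the leading `$`
}

interface FunctionRef {
  type: "function";
  kind: "open" | "close" | "value";
  name: string;
}

// Narrower than the PR's definition: `UnsupportedExpression` is left out here.
interface VariableExpression {
  arg: VariableRef;
  func?: FunctionRef;
}

interface InputDeclaration {
  type: "input";
  name: string;
  value: VariableExpression;
}

// Hypothetical helper: true when the declaration name equals the
// name of the variable referenced by its expression.
function isValidInputDeclaration(decl: InputDeclaration): boolean {
  return decl.name === decl.value.arg.name;
}

// Intended to model a declaration such as `.input {$count :number}`.
const countInput: InputDeclaration = {
  type: "input",
  name: "count",
  value: {
    arg: { type: "variable", name: "count" },
    func: { type: "function", kind: "value", name: "number" },
  },
};

// Deliberately malformed: the declared name and the referenced variable differ.
const mismatched: InputDeclaration = {
  type: "input",
  name: "total",
  value: { arg: { type: "variable", name: "count" } },
};
```

A data-model consumer could run such a check on every declaration before formatting, since the type system alone cannot express the constraint.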

In a `SelectMessage`,
@@ -74,28 +110,35 @@ interface CatchallKey {
## Patterns

Each `Pattern` represents a linear sequence, without selectors.
Each element of the sequence MUST have either a `Text` or an `Expression` shape.
`Text` represents literal _text_,
Each element of the `body` array MUST either be a non-empty string or an `Expression` object.
String values represent literal _text_,
while `Expression` wraps each of the potential _expression_ shapes.
The `value` of `Text` is the "cooked" value (i.e. escape sequences are processed).
The `body` strings are the "cooked" _text_ values, i.e. escape sequences are processed.

Implementations MUST NOT rely on the set of `Expression` `body` values being exhaustive,
Implementations MUST NOT rely on the set of `Expression` interfaces being exhaustive,
as future versions of this specification MAY define additional expressions.
A `body` with an unrecognized value SHOULD be treated as an `Unsupported` value.
An `Expression` `func` with an unrecognized value SHOULD be treated as an `UnsupportedExpression` value.

```ts
interface Pattern {
body: Array<Text | Expression>;
body: Array<string | Expression>;
}

interface Text {
type: "text";
value: string;
type Expression = LiteralExpression | VariableExpression | FunctionExpression;

interface LiteralExpression {
arg: Literal;
func?: FunctionRef | UnsupportedExpression;
}
Comment on lines +127 to +132
Collaborator

For consistency with the data model implied by the grammar, func fields should probably be renamed (and possibly FunctionExpression as well, in both the grammar and here).

Suggested change
type Expression = LiteralExpression | VariableExpression | FunctionExpression;
interface LiteralExpression {
arg: Literal;
func?: FunctionRef | UnsupportedExpression;
}
type Expression = LiteralExpression | VariableExpression | AnnotationExpression;
interface LiteralExpression {
arg: Literal;
annotation?: FunctionRef | UnsupportedAnnotation;
}

etc.

Collaborator Author

One challenge with "annotation" is that when it's in an expression without an operand, what is it "annotating"?

Member

The expression.

Collaborator Author

To me this sounds rather tautological. I'd be happy for us to discuss this naming briefly during tomorrow's call, and to go with whatever the majority prefers (as in, spend max 5 min and either reach a conclusion or continue from there async).

Member

If you glance at the agenda you'll notice that I have a bunch of these short highly timeboxed issues for tomorrow. I've added this one.


I'll note that the current ABNF uses the term function-expression and the data model should basically match the ABNF word-for-word to the extent possible. Keeping FunctionExpression seems fine?

I do think that this might not be right, though:

  annotation?: FunctionRef | UnsupportedAnnotation;

UnsupportedAnnotation applies to reserved-annotation but not necessarily to private-use-annotation. Private use is only supported if the implementation says it is. So we need a third bucket for private use to go into so that implementations that support the PU can consume it, e.g.:

  annotation?: FunctionRef | PrivateUseAnnotation | UnsupportedAnnotation

(it is possible for a private-use to be unsupported)

Collaborator Author

For the cases where the implementation does support a specific private-use annotation, we have this:

When parsing the syntax of a _message_ that includes a _private-use_ _annotation_
supported by the implementation,
the implementation SHOULD represent it in the data model
using an interface appropriate for the semantics and meaning
that the implementation attaches to that _annotation_.

And then in the Extensions section, we specify that an implementation is allowed to extend the data model for that purpose. Or at least it's intended to do so.

So we can at this point lump all of the unsupported reserved & private-use into this one basket of the data model, rather than needing to split it as in the syntax.

Member

If the data-model is meant to be interchanged, does that mean that the "private-use-ness" might be a relevant detail to the consumer? I mean, maybe it's FunctionRef but if it weren't, it wouldn't be UnsupportedAnnotation necessarily either? Or am I barking up the wrong tree? 🌳

Collaborator Author

If an implementation does define and support a private-use annotation, I think there are two options for its data model representation:

  1. It's mapped to some representation within the base data model, e.g. as some func value or maybe as an operand of some sort. But then why would the new syntax be needed in the first place?
  2. The annotation is doing something truly different, and it doesn't fit within the base data model. In this case, the implementation extends the base, e.g. adding a new field in the Expression interfaces.

If going the first route, then interchange is of course easier, but that syntax will also have some representation without private use. If going the second route, interchange will require both ends to agree about the private-use.


interface VariableExpression {
arg: VariableRef;
func?: FunctionRef | UnsupportedExpression;
}

interface Expression {
type: "expression";
body: Literal | VariableRef | FunctionRef | Unsupported;
interface FunctionExpression {
arg?: never;
func: FunctionRef | UnsupportedExpression;
}
```
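
To illustrate how a consumer might walk the new `Pattern` shape, here is a minimal sketch under stated assumptions. It uses the `arg`/`func` field names from this revision of the PR (the naming is still under discussion in the review thread above), the `Literal`, `VariableRef`, `FunctionRef` and `UnsupportedExpression` shapes are simplified stand-ins, and `describePart` is a hypothetical debugging helper, not a formatter defined by the specification.

```ts
// Simplified stand-ins (assumptions): the diff elides the full definitions.
interface Literal { type: "literal"; quoted: boolean; value: string; }
interface VariableRef { type: "variable"; name: string; }
interface FunctionRef { type: "function"; kind: "open" | "close" | "value"; name: string; }
interface UnsupportedExpression { type: "unsupported-expression"; sigil: string; source: string; }

type Annotation = FunctionRef | UnsupportedExpression;

interface LiteralExpression { arg: Literal; func?: Annotation; }
interface VariableExpression { arg: VariableRef; func?: Annotation; }
interface FunctionExpression { arg?: never; func: Annotation; }
type Expression = LiteralExpression | VariableExpression | FunctionExpression;

interface Pattern { body: Array<string | Expression>; }

// Hypothetical helper: renders one pattern element as a debug string.
// It ignores `kind` and options, so it is not a faithful serializer.
function describePart(part: string | Expression): string {
  if (typeof part === "string") return part; // "cooked" literal text
  const func = part.func;
  const annotation =
    func === undefined ? ""
    : func.type === "function" ? `:${func.name}`
    : `${func.sigil}${func.source}`;
  const arg = part.arg;
  if (arg === undefined) return `{${annotation}}`;
  const operand = arg.type === "variable" ? `$${arg.name}` : `|${arg.value}|`;
  return annotation === "" ? `{${operand}}` : `{${operand} ${annotation}}`;
}

// Models the pattern `You have {$count} notifications.`
const notifications: Pattern = {
  body: [
    "You have ",
    { arg: { type: "variable", name: "count" } },
    " notifications.",
  ],
};

// A literal operand with a function annotation (illustrative values only).
const annotated: Expression = {
  arg: { type: "literal", quoted: true, value: "2023-12-04" },
  func: { type: "function", kind: "value", name: "datetime" },
};

const rendered = notifications.body.map(describePart).join("");
```

Because `body` elements are plain strings rather than `Text` objects, a `typeof` check is enough to separate text from expressions, and the presence of `arg` separates operand-bearing expressions from `FunctionExpression`.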

@@ -148,31 +191,31 @@ interface Option {
}
```

An `Unsupported` represents an _expression_ with a
_reserved_ _annotation_ or a _private-use_ _annotation_ not supported
An `UnsupportedExpression` represents an _expression_ with a
_reserved annotation_ or a _private-use annotation_ not supported
by the implementation.
The `sigil` corresponds to the starting sigil of the _annotation_.
The `source` is the "raw" value (i.e. escape sequences are not processed)
and does not include the starting `sigil`.

> **Note**
> Be aware that future versions of this specification
> might assign meaning to _reserved_ `sigil` values.
> might assign meaning to _reserved annotation_ `sigil` values.
> This would result in new interfaces being added to
> this data model.

If the _expression_ includes a _literal_ or _variable_ before the _annotation_,
it is included as the `operand`.

When parsing the syntax of a _message_ that includes a _private-use_ _annotation_
When parsing the syntax of a _message_ that includes a _private-use annotation_
supported by the implementation,
the implementation SHOULD represent it in the data model
using an interface appropriate for the semantics and meaning
that the implementation attaches to that _annotation_.

```ts
interface Unsupported {
type: "unsupported";
interface UnsupportedExpression {
type: "unsupported-expression";
sigil: "!" | "@" | "#" | "%" | "^" | "&" | "*" | "<" | ">" | "/" | "?" | "~";
source: string;
operand?: Literal | VariableRef;
31 changes: 23 additions & 8 deletions spec/data-model/message.dtd
@@ -1,31 +1,46 @@
<!ELEMENT message (declaration*,(pattern|(selectors,variant+)))>
<!ELEMENT message (
(declaration | unsupportedStatement)*,
(pattern | (selectors,variant+))
)>

<!-- In a <declaration type="input">, the <expression> MUST contain a <variable> -->
<!ELEMENT declaration (expression)>
<!ATTLIST declaration name NMTOKEN #REQUIRED>
<!ATTLIST declaration
type (input | local) #REQUIRED
name NMTOKEN #REQUIRED
>

<!ELEMENT unsupportedStatement (expression)+>
<!ATTLIST unsupportedStatement
keyword CDATA #REQUIRED
body CDATA #IMPLIED
>

<!ELEMENT selectors (expression)+>
<!ELEMENT variant (key+,pattern)>
<!ELEMENT key (#PCDATA)>
<!ATTLIST key default (true | false) "false">

<!ELEMENT pattern (#PCDATA | expression)*>
<!ELEMENT expression (literal | variable | function | unsupported)>

<!ELEMENT expression (
((literal | variable), (function | unsupportedExpression)?) |
function | unsupportedExpression
)>

<!ELEMENT literal (#PCDATA)>
<!ATTLIST literal quoted (true | false) #REQUIRED>

<!ELEMENT variable EMPTY>
<!ATTLIST variable name NMTOKEN #REQUIRED>

<!ELEMENT function (operand?,option*)>
<!ELEMENT function (option)*>
<!ATTLIST function
kind (open | close | value) #REQUIRED
name NMTOKEN #REQUIRED
>
<!ELEMENT operand (literal | variable)>
<!ELEMENT option (literal | variable)>
<!ATTLIST option name NMTOKEN #REQUIRED>

<!ELEMENT unsupported (operand?,source)>
<!ATTLIST unsupported sigil CDATA #REQUIRED>
<!ELEMENT source (#PCDATA)>
<!ELEMENT unsupportedExpression (#PCDATA)>
<!ATTLIST unsupportedExpression sigil CDATA #REQUIRED>
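
As an illustration of how the TypeScript interfaces and the updated DTD line up, the sketch below serializes an expression into the `<expression>` element form, with the operand first and the `<function>` after it. It is a rough sketch under assumptions, not a reference serializer: the interface shapes are simplified, options and `unsupportedExpression` are omitted, the helper names are hypothetical, and no validation is performed (an expression with neither `arg` nor `func` would produce an invalid empty element).

```ts
// Simplified stand-ins for the data model interfaces (assumptions).
interface Literal { type: "literal"; quoted: boolean; value: string; }
interface VariableRef { type: "variable"; name: string; }
interface FunctionRef { type: "function"; kind: "open" | "close" | "value"; name: string; }
interface Expression { arg?: Literal | VariableRef; func?: FunctionRef; }

// Escapes the characters that are unsafe in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: <variable name="..."/> or <literal quoted="...">...</literal>.
function operandToXml(arg: Literal | VariableRef): string {
  return arg.type === "variable"
    ? `<variable name="${escapeXml(arg.name)}"/>`
    : `<literal quoted="${arg.quoted}">${escapeXml(arg.value)}</literal>`;
}

// Hypothetical helper: emits the operand (if any) followed by the function (if any),
// matching the order required by the new <expression> content model.
function expressionToXml(expr: Expression): string {
  const operand = expr.arg ? operandToXml(expr.arg) : "";
  const func = expr.func
    ? `<function kind="${expr.func.kind}" name="${escapeXml(expr.func.name)}"/>`
    : "";
  return `<expression>${operand}${func}</expression>`;
}

// Intended to model `{$count :number}`.
const xml = expressionToXml({
  arg: { type: "variable", name: "count" },
  func: { type: "function", kind: "value", name: "number" },
});
```

With the removal of the `<operand>` wrapper in this PR, the operand becomes a sibling of `<function>` rather than its child, which is what the ordering in `expressionToXml` reflects.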