Use " or ' instead of | as quote delimiter #414

Closed
wants to merge 1 commit into from
24 changes: 12 additions & 12 deletions spec/formatting.md
@@ -79,7 +79,7 @@ In a _pattern_, the resolved value of an _expression_ is used in its _formatting
The shapes of resolved values are implementation-dependent,
and different implementations MAY choose to perform different levels of resolution.

> For example, the resolved value of the _expression_ `{|0.40| :number style=percent}`
> For example, the resolved value of the _expression_ `{0.40 :number style=percent}`
> could be an object such as
>
> ```
@@ -114,7 +114,7 @@ the formatting function MUST treat its resolved value the same
whether its value was originally _quoted_ or _unquoted_.

> For example,
> the _option_ `foo=42` and the _option_ `foo=|42|` are treated as identical.
> the _option_ `foo=42` and the _option_ `foo='42'` are treated as identical.

The resolution of a _text_ or _literal_ token MUST always succeed.

@@ -184,12 +184,12 @@ In each such case, an error MUST be emitted
and a **_fallback value_** used for the _expression_.
This value depends on the shape of the _expression_:

- _expression_ with _literal_ _operand_: U+007C VERTICAL LINE `|`
- _expression_ with _literal_ _operand_: U+0022 QUOTATION MARK `"`
followed by the value of the Literal,
and then by U+007C VERTICAL LINE `|`.
and then by U+0022 QUOTATION MARK `"`.
The same representation is used for both _quoted_ and _unquoted_ values.

> Examples: `|your horse|`, `|42|`
> Examples: `"your horse"`, `"42"`

- _expression_ with _variable_ _operand_: U+0024 DOLLAR SIGN `$`
followed by the _variable_ _name_ of the _operand_
@@ -216,17 +216,17 @@ rather than the _expression_ in the _selector_ or _pattern_.
> attempting to format either of the following messages:
>
> ```
> let $var = {|horse| :func}
> let $var = {"horse" :func}
> {The value is {$var}.}
> ```
>
> ```
> let $var = {|horse|}
> let $var = {'horse'}
> {The value is {$var :func}.}
> ```
>
> would in both cases result in the _pattern_ _expression_
> resolving to a _fallback value_ of `|horse|`.
> resolving to a _fallback value_ of `"horse"`.
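
The fallback rule proposed in this hunk can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LiteralOperand` and `VariableOperand` classes and the `fallback_value` function are hypothetical names that do not appear in the spec, and only the two operand shapes visible in this diff are covered.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class LiteralOperand:
    """Hypothetical holder for a literal operand (quoted or unquoted)."""
    value: str  # the literal's value, with delimiters and escapes already removed


@dataclass
class VariableOperand:
    """Hypothetical holder for a variable operand."""
    name: str  # the variable name without the leading "$"


def fallback_value(operand: Union[LiteralOperand, VariableOperand]) -> str:
    """Build the fallback string for an expression that failed to resolve."""
    if isinstance(operand, LiteralOperand):
        # U+0022 QUOTATION MARK, the literal's value, then U+0022 again.
        # The same form is used whether the source literal was quoted or not.
        return '"' + operand.value + '"'
    # U+0024 DOLLAR SIGN followed by the variable name.
    return "$" + operand.name


print(fallback_value(LiteralOperand("horse")))
print(fallback_value(VariableOperand("var")))
```

Because the delimiters are dropped before the fallback is built, `{"horse" :func}` and `{'horse' :func}` would produce the same fallback string in this sketch.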

_Pattern selection_ is not supported for _fallback values_.

@@ -629,7 +629,7 @@ These are divided into the following categories:
> ```
>
> ```
> let $var = {|no message body|}
> let $var = {"no message body"}
> ```

- **Data Model errors** occur when a message is invalid due to
@@ -716,7 +716,7 @@ These are divided into the following categories:
> ```
>
> ```
> match {|horse| :func}
> match {'horse' :func}
> when 1 {The value is one.}
> when * {The value is not one.}
> ```
@@ -730,13 +730,13 @@ These are divided into the following categories:
> uses a `:plural` selector function which requires its input to be numeric:
>
> ```
> match {|horse| :plural}
> match {'horse' :plural}
> when 1 {The value is one.}
> when * {The value is not one.}
> ```
>
> ```
> let $sel = {|horse| :plural}
> let $sel = {'horse' :plural}
> match {$sel}
> when 1 {The value is one.}
> when * {The value is not one.}
23 changes: 14 additions & 9 deletions spec/message.abnf
@@ -30,10 +30,12 @@ text-char = %x0-5B ; omit \
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF

quoted = "|" *(quoted-char / quoted-escape) "|"
quoted-char = %x0-5B ; omit \
/ %x5D-7B ; omit |
/ %x7D-D7FF ; omit surrogates
quoted = "'" *(quoted-char / DQUOTE / quoted-escape) "'"
/ DQUOTE *(quoted-char / "'" / quoted-escape) DQUOTE
quoted-char = %x0-21 ; omit "
/ %x23-26 ; omit '
/ %x28-5B ; omit \
/ %x5D-D7FF ; omit surrogates
/ %xE000-10FFFF

; based on https://www.w3.org/TR/xml/#NT-Nmtoken,
@@ -47,12 +49,15 @@ unquoted-start = name-start / DIGIT / "."
reserved = ( reserved-start / private-start ) reserved-body
reserved-start = "!" / "@" / "#" / "%" / "*" / "<" / ">" / "/" / "?" / "~"
private-start = "^" / "&"
reserved-body = *( [s] 1*(reserved-char / reserved-escape / literal))
reserved-body = *( [s] 1*(reserved-char / reserved-escape / quoted))
Collaborator Author

This is an unrelated fix that I noticed while working on this PR.

Member

Please make a separate PR for this.

The intention was to allow reserved to define whatever structure it wants. For example, allowing "comments" might look something like:

{% This is a "comment". Do not translate it}
{$foo % This is a "comment" about $foo: do not translate it}

The contents of reserved must appear within a placeholder, so the characters { and } must be escaped, but everything else (including quotes, whichever ones we use) remains undefined.

Hence reserved-char should match text-char. The production reserved-char currently omits whitespace, but the optional whitespace part of reserved-body just adds the spaces back.

Collaborator Author

Done: #415.

reserved-char = %x00-08 ; omit HTAB and LF
/ %x0B-0C ; omit CR
/ %x0E-19 ; omit SP
/ %x21-5B ; omit \
/ %x5D-7A ; omit { | }
/ %x21 ; omit "
/ %x23-26 ; omit '
/ %x28-5B ; omit \
/ %x5D-7A ; omit {
/ %x7C ; omit }
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF

@@ -68,8 +73,8 @@ name-char = name-start / DIGIT / "-" / "." / ":"
/ %xB7 / %x300-36F / %x203F-2040

text-escape = backslash ( backslash / "{" / "}" )
quoted-escape = backslash ( backslash / "|" )
reserved-escape = backslash ( backslash / "{" / "|" / "}" )
quoted-escape = backslash ( backslash / "'" / DQUOTE )
reserved-escape = backslash ( backslash / "{" / "'" / DQUOTE / "}" )
backslash = %x5C ; U+005C REVERSE SOLIDUS "\"

s = 1*( SP / HTAB / CR / LF )
41 changes: 27 additions & 14 deletions spec/syntax.md
@@ -373,19 +373,19 @@ option = name [s] "=" [s] (literal / variable)
> ```
>
> ```
> {|-1.23|}
> {'-1.23'}
> ```
>
> ```
> {1.23 :number maxFractionDigits=1}
> ```
>
> ```
> {|Thu Jan 01 1970 14:37:00 GMT+0100 (CET)| :datetime weekday=long}
> {"Thu Jan 01 1970 14:37:00 GMT+0100 (CET)" :datetime weekday=long}
> ```
>
> ```
> {|My Brand Name| :linkify href=|foobar.com|}
> {'My Brand Name' :linkify href='foobar.com'}
> ```
>
> ```
@@ -436,12 +436,15 @@ unrecognized reserved sequences have no meaning and MAY result in errors during
reserved = ( reserved-start / private-start ) reserved-body
reserved-start = "!" / "@" / "#" / "%" / "*" / "<" / ">" / "/" / "?" / "~"
private-start = "^" / "&"
reserved-body = *( [s] 1*(reserved-char / reserved-escape / literal))
reserved-body = *( [s] 1*(reserved-char / reserved-escape / quoted))
reserved-char = %x00-08 ; omit HTAB and LF
/ %x0B-0C ; omit CR
/ %x0E-19 ; omit SP
/ %x21-5B ; omit \
/ %x5D-7A ; omit { | }
/ %x21 ; omit "
/ %x23-26 ; omit '
/ %x28-5B ; omit \
/ %x5D-7A ; omit {
/ %x7C ; omit }
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF
```
@@ -485,7 +488,15 @@ text-char = %x0-5B ; omit \

**_Quoted_** literals may include content with any Unicode code point,
except for surrogate code points U+D800 through U+DFFF.
The characters `\` and `|` MUST be escaped as `\\` and `\|`.
They are delimited by a matched pair of
either U+0022 QUOTATION MARK `"` or U+0027 APOSTROPHE `'` characters.
This choice is intended to make _message_ values easier to embed in other formats,
and to avoid needing to escape delimiter characters in many cases.

Within a _quoted_ value, the character `\` MUST be escaped as `\\`,
and the characters `'` and `"` MUST be escaped as `\'` and `\"`
if they match the delimiters of the _quoted_ value.
The characters `'` and `"` MAY be escaped if they do not match the delimiters.

**_Unquoted_** literals have a much more restricted range that
is intentionally close to the XML's [Nmtoken](https://www.w3.org/TR/xml/#NT-Nmtoken),
@@ -497,11 +508,13 @@ All code points are preserved.
```abnf
literal = quoted / unquoted

quoted = "|" *(quoted-char / quoted-escape) "|"
quoted-char = %x0-5B ; omit \
/ %x5D-7B ; omit |
/ %x7D-D7FF ; omit surrogates
/ %xE000-10FFFF
quoted = "'" *(quoted-char / DQUOTE / quoted-escape) "'"
/ DQUOTE *(quoted-char / "'" / quoted-escape) DQUOTE
quoted-char = %x0-21 ; omit "
/ %x23-26 ; omit '
/ %x28-5B ; omit \
/ %x5D-D7FF ; omit surrogates
/ %xE000-10FFFF

unquoted = unquoted-start *name-char
unquoted-start = name-start / DIGIT / "."
@@ -540,8 +553,8 @@ in the body of `text`, `quoted`, or `reserved` sequences respectively:

```abnf
text-escape = backslash ( backslash / "{" / "}" )
quoted-escape = backslash ( backslash / "|" )
reserve-escape = backslash ( backslash / "{" / "|" / "}" )
quoted-escape = backslash ( backslash / "'" / DQUOTE )
reserved-escape = backslash ( backslash / "{" / "'" / DQUOTE / "}" )
backslash = %x5C ; U+005C REVERSE SOLIDUS "\"
```
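
To make the proposed quoting and escaping rules concrete, here is a minimal Python sketch of how a _quoted_ literal could be unescaped under this change. It is not taken from any implementation: the function name `parse_quoted` and the use of `ValueError` for syntax errors are assumptions made for illustration, and the input is assumed to be exactly one quoted literal, delimiters included.

```python
def parse_quoted(src: str) -> str:
    """Return the unescaped value of a quoted literal such as 'a' or "a".

    Hypothetical helper: follows the `quoted`, `quoted-char` and
    `quoted-escape` productions proposed in this PR.
    """
    if len(src) < 2 or src[0] not in "'\"" or src[-1] != src[0]:
        raise ValueError("quoted literal needs a matched pair of ' or \" delimiters")
    delim = src[0]
    body = src[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # quoted-escape = backslash ( backslash / "'" / DQUOTE )
            if i + 1 >= len(body) or body[i + 1] not in "\\'\"":
                raise ValueError("invalid escape sequence in quoted literal")
            out.append(body[i + 1])
            i += 2
            continue
        if ch == delim:
            # The delimiter in use MUST be escaped; the other quote
            # character may appear as-is.
            raise ValueError("unescaped delimiter inside quoted literal")
        if 0xD800 <= ord(ch) <= 0xDFFF:
            raise ValueError("surrogate code points are not allowed")
        out.append(ch)
        i += 1
    return "".join(out)


print(parse_quoted("'My Brand Name'"))
print(parse_quoted('"your horse"'))
```

In this sketch the delimiter choice does not affect the resulting value, so `'42'` and `"42"` both yield `42`, the same string an unquoted `42` would carry.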
