Allow colons in nmtokens #365

Merged 1 commit on May 8, 2023
8 changes: 4 additions & 4 deletions spec/message.abnf
@@ -42,15 +42,15 @@ function = ":" name
markup-start = "+" name
markup-end = "-" name

-name = name-start *name-char ; matches XML https://www.w3.org/TR/xml/#NT-Name
-nmtoken = 1*name-char ; matches XML https://www.w3.org/TR/xml/#NT-Nmtokens
+name = name-start *name-char ; based on https://www.w3.org/TR/xml/#NT-Name, but cannot start with U+003A COLON ":"
+nmtoken = 1*name-char ; equal to https://www.w3.org/TR/xml/#NT-Nmtoken
Comment on lines +45 to +46
gibson042 (Collaborator), May 9, 2023:

Is the colon in {:functionName} syntax sufficiently valuable to justify deviation from exact match of the XML productions? If so, I question whether alignment with XML is actually well-motivated, and if not then perhaps it would make more sense to use characters excluded from Nmtoken:

!"#$%&'()*+,/;<=>?@[\]^`{|}~

[later commentary removed because it related to Nmtoken rather than Name]
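
The character list in the comment above can be checked mechanically. The following Python sketch is not part of the PR. It assumes that only printable, non-space ASCII (U+0021 to U+007E) is of interest, and it derives the characters that fall outside the ASCII subset of name-char as defined by this change. The names `ASCII_NAME_CHARS` and `excluded` are illustrative only.

```python
import string

# ASCII subset of name-char after this change (assumption: the non-ASCII
# ranges are irrelevant here): ALPHA / "_" / DIGIT / "-" / "." / ":"
ASCII_NAME_CHARS = set(string.ascii_letters + string.digits + "_-.:")

# Collect every printable, non-space ASCII character that is not a name-char.
excluded = "".join(
    chr(cp) for cp in range(0x21, 0x7F) if chr(cp) not in ASCII_NAME_CHARS
)

print(excluded)
```

Run as a script, this prints the same 28 characters quoted in the comment.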

name-start = ALPHA / "_"
/ %xC0-D6 / %xD8-F6 / %xF8-2FF
/ %x370-37D / %x37F-1FFF / %x200C-200D
/ %x2070-218F / %x2C00-2FEF / %x3001-D7FF
/ %xF900-FDCF / %xFDF0-FFFD / %x10000-EFFFF
-name-char = name-start / DIGIT / "-" / "." / %xB7
-/ %x0300-036F / %x203F-2040
+name-char = name-start / DIGIT / "-" / "." / ":"
+/ %xB7 / %x0300-036F / %x203F-2040

text-escape = backslash ( backslash / "{" / "}" )
literal-escape = backslash ( backslash / "|" )
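
To illustrate the effect of the grammar change, here is a minimal Python sketch that transcribes the name-start and name-char productions from this diff into regular-expression character classes. It is not taken from the PR or from any MessageFormat implementation; the helper names `is_name` and `is_nmtoken` are hypothetical.

```python
import re

# name-start, transcribed from the ABNF in this diff. ":" is deliberately
# absent, so a name cannot begin with a colon.
NAME_START = (
    "A-Za-z_"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)

# name-char after this change: name-start plus DIGIT, ".", ":", U+00B7,
# two further ranges, and "-" (escaped so it is read as a literal).
NAME_CHAR = NAME_START + "0-9.:\u00B7\u0300-\u036F\u203F-\u2040\\-"

NAME_RE = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")
NMTOKEN_RE = re.compile(f"[{NAME_CHAR}]+")


def is_name(text: str) -> bool:
    """Hypothetical helper: does text match name = name-start *name-char?"""
    return NAME_RE.fullmatch(text) is not None


def is_nmtoken(text: str) -> bool:
    """Hypothetical helper: does text match nmtoken = 1*name-char?"""
    return NMTOKEN_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("number", "ns:number", ":number", "1st"):
        print(sample, is_name(sample), is_nmtoken(sample))
```

Under this sketch, `ns:number` is accepted as both a name and an nmtoken, while `:number` and `1st` are accepted only as nmtokens, because neither a colon nor a digit is in name-start.
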
13 changes: 6 additions & 7 deletions spec/syntax.md
@@ -464,10 +464,9 @@ Otherwise, the set of characters allowed in names is large.
The _nmtoken_ token doesn't have _name_'s restriction on the first character
and is used as variant keys and option values.

-_Note:_ The Name and Nmtoken symbols are intentionally defined to be
-the same as XML's [Name](https://www.w3.org/TR/xml/#NT-Name) and [Nmtoken](https://www.w3.org/TR/xml/#NT-Nmtokens)
+_Note:_ _nmtoken_ is intentionally defined to be the same as XML's [Nmtoken](https://www.w3.org/TR/xml/#NT-Nmtoken)
in order to increase the interoperability with data defined in XML.
-In particular, the grammatical feature data [specified in LDML](https://unicode.org/reports/tr35/tr35-general.html#Grammatical_Features)
+In particular, the grammatical data [specified in LDML](https://unicode.org/reports/tr35/tr35-general.html#Grammatical_Features)
and [defined in CLDR](https://unicode-org.github.io/cldr-staging/charts/latest/grammar/index.html)
uses Nmtokens.

@@ -479,15 +478,15 @@ markup-end = "-" name
```

```abnf
-name = name-start *name-char ; matches XML https://www.w3.org/TR/xml/#NT-Name
-nmtoken = 1*name-char ; matches XML https://www.w3.org/TR/xml/#NT-Nmtokens
+name = name-start *name-char ; based on https://www.w3.org/TR/xml/#NT-Name, but cannot start with U+003A COLON ":"
+nmtoken = 1*name-char ; equal to https://www.w3.org/TR/xml/#NT-Nmtoken
name-start = ALPHA / "_"
/ %xC0-D6 / %xD8-F6 / %xF8-2FF
/ %x370-37D / %x37F-1FFF / %x200C-200D
/ %x2070-218F / %x2C00-2FEF / %x3001-D7FF
/ %xF900-FDCF / %xFDF0-FFFD / %x10000-EFFFF
-name-char = name-start / DIGIT / "-" / "." / %xB7
-/ %x0300-036F / %x203F-2040
+name-char = name-start / DIGIT / "-" / "." / ":"
+/ %xB7 / %x0300-036F / %x203F-2040
```

### Escape Sequences