
Allow colons in nmtokens #365


Merged: 1 commit, May 8, 2023

Conversation

stasm
Collaborator

@stasm stasm commented Mar 12, 2023

The current definition of the nmtoken production doesn't match XML's. The same is true of name, but names are less of an issue. nmtokens are frequently used in LDML to define attribute values, which are likely to be used as variant keys and option values.

This was originally spotted by @mihnita in #364 (comment). I suggested fixing it in one of two ways:

  • by adding : only to name-char (and thus, to nmtoken), and accepting that name is not the same as XML's Name but at least nmtoken is aligned, or
  • by adding : to name-start, aligning both name and nmtoken with their counterparts in XML. This, however, could result in weirdness around variable ($:foo) and function (::foo) names...

This PR implements the first solution.
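To make the first option concrete, here is a minimal sketch of the two productions as Python regular expressions. The character ranges are copied from the XML 1.0 NameStartChar and NameChar productions; the helper names `is_name` and `is_nmtoken` are made up for illustration and are not part of the spec or of any implementation.

```python
import re

# XML 1.0 NameStartChar ranges, minus ":" (a name may not start with a colon).
_NAME_START = (
    r"A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# XML 1.0 NameChar: NameStartChar (including ":") plus "-", ".", digits, etc.
_NAME_CHAR = _NAME_START + r":\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

NAME = re.compile(f"[{_NAME_START}][{_NAME_CHAR}]*")  # name = name-start *name-char
NMTOKEN = re.compile(f"[{_NAME_CHAR}]+")              # nmtoken = 1*name-char


def is_name(text: str) -> bool:
    """Hypothetical helper: does text match the name production?"""
    return NAME.fullmatch(text) is not None


def is_nmtoken(text: str) -> bool:
    """Hypothetical helper: does text match the nmtoken production?"""
    return NMTOKEN.fullmatch(text) is not None
```

Under this sketch, `foo:bar` matches both productions, while `:foo` and `1st` match only nmtoken, which is the asymmetry the first option accepts.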

Collaborator Author

stasm commented Mar 12, 2023

It's worth noting that even without this change, it's still possible to use literals for values which don't match our nmtoken production. Also, AFAICT, LDML doesn't actually define any exotic attribute values right now. However, aligning with XML exactly means that any valid tokenized XML attribute value can be represented without resorting to the literal syntax.

@stasm stasm requested review from aphillips and mihnita March 14, 2023 15:09
Member

@aphillips aphillips left a comment


This looks good. We can revisit later if needed.

@aphillips aphillips merged commit c0873f7 into unicode-org:main May 8, 2023
Comment on lines +45 to +46
name = name-start *name-char ; based on https://www.w3.org/TR/xml/#NT-Name, but cannot start with U+003A COLON ":"
nmtoken = 1*name-char ; equal to https://www.w3.org/TR/xml/#NT-Nmtoken
Collaborator

@gibson042 gibson042 May 9, 2023


Is the colon in {:functionName} syntax sufficiently valuable to justify deviation from exact match of the XML productions? If so, I question whether alignment with XML is actually well-motivated, and if not then perhaps it would make more sense to use characters excluded from Nmtoken:

!"#$%&'()*+,/;<=>?@[\]^`{|}~

[later commentary removed because it related to Nmtoken rather than Name]
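The list of excluded characters quoted above can be reproduced with a short sketch. It assumes only the ASCII part of XML's NameChar production (letters, digits, and `:`, `_`, `-`, `.`); the variable names are illustrative.

```python
import string

# ASCII characters allowed by XML's NameChar (and therefore by Nmtoken).
nmtoken_ascii = set(string.ascii_letters + string.digits + ":_-.")

# Printable, non-space ASCII (U+0021 to U+007E) that Nmtoken excludes.
excluded = "".join(
    chr(code) for code in range(0x21, 0x7F) if chr(code) not in nmtoken_ascii
)
print(excluded)
```

This prints the same 28 characters as the list in the comment, so any of them could serve as a function sigil without overlapping Nmtoken.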

eemeli added a commit to messageformat/messageformat that referenced this pull request Jun 6, 2023