Add Literal Resolution section to formatting.md #382

Merged
merged 2 commits into from
Jun 5, 2023

10 changes: 10 additions & 0 deletions spec/formatting.md
@@ -8,6 +8,16 @@ when formatting a message for display in a user interface, or for some later pro
The document is part of the MessageFormat 2.0 specification,
the successor to ICU MessageFormat, henceforth called ICU MessageFormat 1.0.

## Literal Resolution

The resolved value of _text_, _literal_ and _nmtoken_ tokens
Collaborator Author

This will need to include unquoted if/once #364 is accepted.

Collaborator

@catamorphism May 22, 2023

The word "value" seems to be getting used in multiple ways in this paragraph.

The first sentence refers to "the resolved value of text, literal and nmtoken tokens". If resolution is the relation that maps character strings in the language defined by the ABNF onto values, then I read "value" as a semantic notion here. The spec doesn't (yet?) define criteria for membership in this set of semantic values, at least not as precisely as the ABNF defines membership in the set of syntactically valid messages.

However, the next sentence refers to an "option value", which I take to be a syntactic concept: the token that appears on the right-hand side of the '=' in the option nonterminal of the ABNF.

Defining "value" and "resolution" before these terms are used, and replacing "or option value" with "on the right-hand side of an option", might help clarify things. This could be done in the glossary and cross-referenced here, possibly in a future PR; the glossary currently uses "value" many times without defining it (possibly not always with the same meaning) and doesn't define "resolution".

Collaborator Author

The intended meaning of "resolved value" here is the value that will ultimately get formatted. So for an unquoted literal 42, it would be the string '42', while for a quoted literal |foo\|bar|, it would be the string 'foo|bar'. For a variable reference $foo, it would be the value of the variable, which could really be anything.

My intent would be to explain this term as a part of the bigger formatting PR I'm now working on.

Renaming "option value" does sound like a good idea.
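
As an illustration of the resolution described in this comment, here is a minimal Python sketch. It is not taken from the PR or from any implementation. The function name `resolve_literal` is hypothetical, and the syntax it handles is an assumption based on the examples above: a quoted literal is wrapped in `|`, and a backslash escapes the character that follows it.

```python
def resolve_literal(source: str) -> str:
    """Return the resolved string value of a literal token.

    Assumption: quoted literals are delimited by '|' and a backslash
    escapes the next character; unquoted literals resolve to their own text.
    """
    if len(source) >= 2 and source[0] == "|" and source[-1] == "|":
        body = source[1:-1]
        parts = []
        i = 0
        while i < len(body):
            if body[i] == "\\":
                if i + 1 >= len(body):
                    # The closing '|' was itself escaped, so the literal is unterminated.
                    raise ValueError("dangling escape in literal")
                # An escape sequence resolves to its escaped character.
                parts.append(body[i + 1])
                i += 2
            else:
                parts.append(body[i])
                i += 1
        # The resolved value is the concatenation of the parts.
        return "".join(parts)
    return source


print(resolve_literal("42"))
print(resolve_literal(r"|foo\|bar|"))
```

Variable references such as `$foo` are deliberately out of scope here, since their resolved value can be anything.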

Collaborator

As long as "resolved"/"resolution" are defined in the bigger PR, I'm fine with leaving those terms undefined in this one.

is always a string concatenation of its parts,
Collaborator

What are the "parts" here? I would think these items (text, literal and nmtoken) are part-less?

Collaborator Author

This is meant to refer to the *-char and *-escape parts of text and literal, and name-char for nmtoken, as hinted by the rest of this sentence.

with escape sequences resolving to their escaped characters.
When a _literal_ or _nmtoken_ is used as an _expression_ argument
or on the right-hand side of an _option_,
the formatting function MUST treat their resolved values the same independently of their presentation,
Collaborator

I think it's worth rewording this to clarify that the contract between the caller of the formatting function and the callee (the formatting function) makes it impossible to do otherwise. (See #382 (comment).)

Collaborator Author

It's not necessarily impossible in all implementations. A formatting function needs to be passed some amount of contextual information, such as the current locale, and it's possible to consider an implementation that also includes in that context something like an AST of the current expression. This might make sense for instance in order to enable errors in specific options to be positioned exactly in terms of source offsets.

This statement is specifying that even in such a hypothetical situation, a valid formatting function is not allowed to vary its behaviour based on the quoting style of the literal value.

Collaborator

Do errors count as "behavior"? It sounds like you're saying the error might be different based on the AST of the current expression, which suggests not treating resolved values the same independently of their presentation (to me, a different error is different behavior).

Collaborator Author

I meant that an implementation may exist which, for good reasons, provides a pathway by which a formatting function could determine whether an option value was originally quoted or not.

For errors, I think the current spec shape of specifying the type of error is appropriate.

such that e.g. the options `foo=42` and `foo=|42|` have the same effect.
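
The requirement that `foo=42` and `foo=|42|` have the same effect can be sketched as follows. This is an illustrative Python example under stated assumptions, not the specification's algorithm: `resolve_literal`, `call_with_options` and `describe` are hypothetical names, and the quoting and escape syntax (`|...|` with backslash escapes) is assumed from the discussion above.

```python
import re
from typing import Callable, Dict


def resolve_literal(source: str) -> str:
    # Assumed syntax: quoted literals are wrapped in '|' and use backslash escapes.
    if len(source) >= 2 and source.startswith("|") and source.endswith("|"):
        # Each escape sequence resolves to its escaped character.
        return re.sub(r"\\(.)", r"\1", source[1:-1])
    return source


def call_with_options(
    fn: Callable[[Dict[str, str]], str], option_sources: Dict[str, str]
) -> str:
    # Only resolved values are handed to the formatting function,
    # so the quoting style of the source is not visible to it.
    resolved = {name: resolve_literal(src) for name, src in option_sources.items()}
    return fn(resolved)


def describe(options: Dict[str, str]) -> str:
    # Hypothetical formatting function that just reports the option it received.
    return f"foo is {options['foo']!r}"


unquoted = call_with_options(describe, {"foo": "42"})
quoted = call_with_options(describe, {"foo": "|42|"})
print(unquoted == quoted)
```

In this sketch the caller enforces the rule by construction. An implementation that also exposes source information (for example an AST used for error positions) would need the formatting function itself to ignore the quoting style.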

## Variable Resolution

To resolve the value of a Variable,