Clarify that Reserved may also represent private-use in data model #444
spec/data-model.md
Outdated
A `Reserved` represents an _expression_ with a _reserved_ or _private-use_ _annotation_.
The `sigil` corresponds to the starting sigil of the _annotation_.
I think this is incorrect.
Reserved represents a portion of the syntax that is wholly invalid--not permitted to be used--but which might become part of a future incarnation. If a part of reserved were used in a future incarnation, existing (pre the unreserving) implementations would continue with whatever behavior reserved provides. Probably this means, as you have it here, passing reserved gunk through the data model without processing.
Private use is different. Private use is a portion of the syntax that is valid, but which may not be functional in a given implementation.
What I would do is define private use and reserved separately and use the same language where appropriate.
Also: note that private-use is a feature of our specification, which is why I put it before reserved in the text.
Could you give an example of a data model use case where it would be useful for the private-use and reserved syntax rules to be represented by different data structures, rather than this single `Reserved`?
Given that supported private-use syntax would end up using a different interface altogether, I'm failing to come up with a realistic scenario where I'd like unsupported private-use and reserved to be handled differently. I suppose a linter error for a message including one or the other could use a slightly different text, but that can be detected from the `sigil`.
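The point about detecting the category from the sigil could look roughly like the sketch below. It is an illustration only, not the specification's definition: the `Reserved` shape uses just the `sigil`, `source`, and `operand` fields discussed in this thread (with `operand` simplified to a string), and `PRIVATE_USE_SIGILS` and `lintText` are hypothetical names. Which sigils count as private-use is an assumption made for the example.

```typescript
// Hypothetical sketch: a single structure for both reserved and unsupported
// private-use annotations, told apart only by the starting sigil.
interface Reserved {
  type: "reserved";
  sigil: string;    // starting sigil of the annotation
  source: string;   // raw annotation body
  operand?: string; // simplified: a literal or variable name, if present
}

// Assumption for illustration: these sigils introduce private-use annotations.
const PRIVATE_USE_SIGILS: readonly string[] = ["^", "&"];

// Pick a linter message based on the sigil alone.
function lintText(expr: Reserved): string {
  return PRIVATE_USE_SIGILS.indexOf(expr.sigil) !== -1
    ? `Unsupported private-use annotation starting with "${expr.sigil}"`
    : `Reserved annotation starting with "${expr.sigil}" is not valid in this version`;
}
```

Because the thread also says implementations must not rely on the set of sigils staying constant, a lookup table like this would need to be revisited with each version of the specification.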
spec/data-model.md
Outdated
Implementations MUST NOT rely on the set of `sigil` values remaining constant,
as future versions of this specification MAY assign other meanings to such sigils.

If the _expression_ includes a _literal_ or _variable_ before the _annotation_,
it is included as the `operand`.

When parsing the syntax of a _message_ that includes a _private-use_ _annotation_
supported by the implementation,
the implementation MAY represent it in the data model using a different interface
I think we might need some stronger guidance here. I agree with the "MAY" you have, but would suggest something along the lines of:

> Private-use annotations are specific to a particular "private agreement"
> between the various implementations that support a given form of private-use.
>
> When parsing the syntax of a message that includes a private-use annotation
> unrecognized by the implementation, the annotation MUST be processed
> identically to a reserved annotation.
>
> An implementation that supports a given private-use annotation MUST
> define the specific interface to support the semantics, structure, and meaning
> that it provides. Use of existing data model interfaces is RECOMMENDED, although
> an implementation MAY use any interface appropriate for its needs.
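As a rough sketch of what the suggested wording implies: a supported private-use annotation goes to an implementation-defined interface, and anything else falls through to `Reserved`. The names `PrivateAnnotation`, `PrivateHandler`, and `toDataModel`, and the simplified string `operand`, are hypothetical and not taken from the specification.

```typescript
interface Reserved {
  type: "reserved";
  sigil: string;
  source: string;
  operand?: string;
}

// Hypothetical implementation-specific node for a supported private-use annotation.
interface PrivateAnnotation {
  type: "private";
  sigil: string;
  payload: unknown;
  operand?: string;
}

type PrivateHandler = (source: string, operand?: string) => PrivateAnnotation;

// Build a data model node for an annotation whose sigil, body and operand
// have already been split apart by a parser (not shown here).
function toDataModel(
  sigil: string,
  source: string,
  operand: string | undefined,
  handlers: { [sigil: string]: PrivateHandler | undefined }
): Reserved | PrivateAnnotation {
  // Only consult handlers registered directly on the object.
  const handler = Object.prototype.hasOwnProperty.call(handlers, sigil)
    ? handlers[sigil]
    : undefined;
  if (handler) return handler(source, operand);

  // Unrecognized private-use is treated exactly like reserved.
  const node: Reserved = { type: "reserved", sigil, source };
  if (operand !== undefined) node.operand = operand;
  return node;
}
```

With a handler registered only for `^`, an annotation starting with `&` would come back as a plain `Reserved` node, the same as one starting with a reserved sigil.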
I don't think the "parsing the syntax" sentence belongs in the data model spec, but in the syntax. It's defining a MUST about what the syntax means, and if included here that would get watered down by the qualifier at the top of this doc:
Implementations are not required to use this data model for their internal representation of messages.
On review, I now also note that the Extensions section will need to be updated to account for private-use. Something like your third paragraph is probably needed there.
spec/data-model.md
Outdated
 The `source` is the "raw" value (i.e. escape sequences are not processed)
-and includes the starting `sigil`.
+and does not include the starting `sigil`.
Also, by removing the starting sigil, you remove the ability for an implementation to tell how the reserved (or private-use) sequence was introduced. If the data model were serialized and sent to an implementation that supported one or another sigil, the receiver would have no way of knowing what the sigil was. Is there a reason you removed the sigil?
It's because it's already in the separate `sigil` field.
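A small sketch of why the separate field is enough: with `source` excluding the sigil, a receiver of a serialized data model can still recover the full annotation text by concatenation. The `Reserved` shape is simplified, and `annotationText` and the sample values are hypothetical.

```typescript
interface Reserved {
  type: "reserved";
  sigil: string;  // kept in its own field
  source: string; // raw body, without the starting sigil
  operand?: string;
}

// Rebuild the annotation text from the two fields.
function annotationText(expr: Reserved): string {
  return expr.sigil + expr.source;
}

// Simulate sending the data model to another implementation as JSON.
const sent: Reserved = { type: "reserved", sigil: "^", source: "custom data" };
const received = JSON.parse(JSON.stringify(sent)) as Reserved;
```

After the round trip, `received.sigil` still identifies how the annotation was introduced, and `annotationText(received)` rebuilds the original text without the sigil appearing twice in the data model.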
> Implementations MUST NOT rely on the set of `sigil` values remaining constant,
> as future versions of this specification MAY assign other meanings to such sigils.
This is not a desirable situation, since conformant implementations can become
non-conformant. What other standards (including Unicode) have done to avoid
this problem is to put symbols/ids into two buckets:
1. private-use, and
2. reserved for future versions of the spec.
Clients that use PU then don't have to worry about becoming non-conformant to
a future version of the specification.
…On Sun, Jul 30, 2023 at 9:18 AM Addison Phillips ***@***.***> wrote:
In spec/data-model.md:
> The `source` is the "raw" value (i.e. escape sequences are not processed)
-and includes the starting `sigil`.
+and does not include the starting `sigil`.
My comment here isn't about sigils. It is that preserving the "raw" value
without unescaping produces a risk of double-escaping (the receiver cannot
tell if the escape sequence is intentional or not).
What is the value of preserving the "raw" (escaped) representation of the
annotation in the data model, given that the purpose of the escapes is
to avoid ambiguity either in parsing MF2 or in the runtime environment?
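
The double-escaping risk described here can be shown with a small sketch. The escape set is an assumption for illustration (a backslash before `\`, `{`, `|`, or `}`), and `escapeBody` and `unescapeBody` are hypothetical helpers, not part of any specification or library.

```typescript
// Assumed escape rule: a backslash escapes one of \ { | }
function unescapeBody(raw: string): string {
  return raw.replace(/\\([\\{|}])/g, "$1");
}

function escapeBody(value: string): string {
  return value.replace(/[\\{|}]/g, "\\$&");
}

// Raw (still escaped) source as it might be stored in the data model: a\{b
const rawSource = "a\\{b";

// A receiver that knows the value is raw can unescape it once: a{b
const intended = unescapeBody(rawSource);

// A receiver that mistakes the raw value for unescaped text and escapes it
// again when serializing produces a\\\{b, which no longer unescapes to a{b.
const doubleEscaped = escapeBody(rawSource);
```

Nothing in the string `a\{b` itself tells a receiver which of the two readings is correct, which is the ambiguity raised above.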
Mostly editorial suggestions. Have a look.
Co-authored-by: Addison Phillips <addisonI18N@gmail.com>
The data model `Reserved` interface may also be used for private-use annotations that are not supported by the implementation. This PR adds text clarifying that to be the case, and explicitly mentions that private-use syntax that is supported by the implementation may use a different interface.

The `source` definition is also corrected to not include the starting sigil. This was the only point in the data model where a part of the syntax was showing up twice, without a really good reason why.