Clarify that Reserved may also represent private-use in data model #444

eemeli · 2023-07-29T10:02:17Z

The data model Reserved interface may also be used for private-use annotations that are not supported by the implementation. This PR adds text clarifying that to be the case, and explicitly mentions that private-use syntax that is supported by the implementation may use a different interface.

The source definition is also corrected to not include the starting sigil. This was the only point in the data model where a part of the syntax was showing up twice, without a really good reason why.

spec/data-model.md

aphillips · 2023-07-29T15:17:12Z

spec/data-model.md

+A `Reserved` represents an _expression_ with a _reserved_ or _private-use_ _annotation_.
+The `sigil` corresponds to the starting sigil of the _annotation_.


I think this is incorrect.

Reserved represents a portion of the syntax that is wholly invalid--not permitted to be used--but which might become part of a future incarnation. If a part of reserved were used in a future incarnation, existing (pre the unreserving) implementations would continue with whatever behavior reserved provides. Probably this means, as you have it here, passing reserved gunk through the data model without processing.

Private use is different. Private use is a portion of the syntax that is valid, but which may not be functional in a given implementation.

What I would do is define private use and reserved separately and use the same language where appropriate.

Also: note that private-use is a feature of our specification, which is why I put it before reserved in the text.

eemeli · 2023-07-30

Could you give an example of a data model use case where it would be useful for the private-use and reserved syntax rules to be represented by different data structures, rather than this single Reserved?

Given that supported private-use syntax would end up using a different interface altogether, I'm failing to come up with a realistic scenario where I'd like unsupported private-use and reserved to be handled differently. I suppose a linter error for a message including one or the other could use a slightly different text, but that can be detected from the sigil.
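To make the point about detecting the difference from the sigil concrete, here is a minimal TypeScript sketch. The field names `sigil`, `source`, and `operand` come from the PR text; the `type` discriminator, the simplified operand types, the `PRIVATE_USE_SIGILS` set, and the `describeUnsupported` helper are assumptions made for illustration. In particular, treating `^` and `&` as the private-use sigils should be checked against the syntax spec.

```typescript
// Hypothetical, simplified operand types; the real data model defines its own.
type Literal = { type: "literal"; value: string };
type VariableRef = { type: "variable"; name: string };

// Field names follow the PR text; the `type` discriminator is an assumption.
interface Reserved {
  type: "reserved";
  sigil: string; // starting sigil of the annotation
  source: string; // raw annotation body, escape sequences not processed
  operand?: Literal | VariableRef;
}

// Assumption: these are the private-use sigils defined by the syntax spec.
const PRIVATE_USE_SIGILS: ReadonlySet<string> = new Set(["^", "&"]);

// Hypothetical linter helper: one data structure, two different messages.
function describeUnsupported(expr: Reserved): string {
  return PRIVATE_USE_SIGILS.has(expr.sigil)
    ? `Unsupported private-use annotation starting with "${expr.sigil}"`
    : `Reserved annotation starting with "${expr.sigil}" is not valid in this version`;
}
```

Under these assumptions a linter needs no second interface; it only branches on `sigil`.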

aphillips · 2023-07-29T15:26:32Z

spec/data-model.md


 Implementations MUST NOT rely on the set of `sigil` values remaining constant,
 as future versions of this specification MAY assign other meanings to such sigils.

 If the _expression_ includes a _literal_ or _variable_ before the _annotation_,
 it is included as the `operand`.

+When parsing the syntax of a _message_ that includes a _private-use_ _annotation_
+supported by the implementation,
+the implementation MAY represent it in the data model using a different interface


I think we might need some stronger guidance here. I agree with the "MAY" you have, but would suggest something along the lines of:

Private-use annotations are specific to a particular "private agreement"
between the various implementations that support a given form of private-use.

When parsing the syntax of a message that includes a private-use annotation
unrecognized by the implementation, the annotation MUST be processed
identically to a reserved annotation.

An implementation that supports a given private-use annotation MUST
define the specific interface to support the semantics, structure, and meaning
that it provides. Use of existing data model interfaces is RECOMMENDED, although
an implementation MAY use any interface appropriate for its needs.
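The proposed wording can be sketched as a small dispatch step. This is only an illustration under stated assumptions: `CustomAnnotation`, `PrivateUseHandler`, and `representAnnotation` are hypothetical names, and keying handlers by sigil is a simplification chosen for this example, not something the PR prescribes.

```typescript
// Pass-through shape for reserved and unrecognized private-use annotations.
interface Reserved {
  type: "reserved";
  sigil: string;
  source: string;
}

// Hypothetical interface for a private-use annotation this implementation supports.
interface CustomAnnotation {
  type: "custom";
  name: string;
}

// Hypothetical handler type: turns a raw annotation body into the custom shape.
type PrivateUseHandler = (source: string) => CustomAnnotation;

function representAnnotation(
  sigil: string,
  source: string,
  handlers: ReadonlyMap<string, PrivateUseHandler>
): Reserved | CustomAnnotation {
  const handler = handlers.get(sigil);
  // Supported private use: the implementation chooses its own interface.
  if (handler) return handler(source);
  // Reserved, or private use that is not recognized: identical treatment.
  return { type: "reserved", sigil, source };
}
```

With a handler registered only for `^`, an annotation starting with `&` falls through to the same `Reserved` shape that a reserved annotation would get.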

eemeli · 2023-07-30

I don't think the "parsing the syntax" sentence belongs in the data model spec, but in the syntax. It's defining a MUST about what the syntax means, and if included here that would get watered down by the qualifier at the top of this doc:

Implementations are not required to use this data model for their internal representation of messages.

On review, I now also note that the Extensions section will need to be updated to account for private-use. Something like your third paragraph is probably needed there.

aphillips · 2023-07-29T15:30:25Z

spec/data-model.md

 The `source` is the "raw" value (i.e. escape sequences are not processed)
-and includes the starting `sigil`.
+and does not include the starting `sigil`.


Also, by removing the starting sigil, you remove the ability for an implementation to tell how a reserved (or private-use) sequence was introduced. If the data model were serialized and sent to an implementation that supported one or another sigil, the receiver would have no way of knowing what the sigil was. Is there a reason you removed the sigil?

eemeli · 2023-07-30

It's because it's already in the separate sigil field.
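A short sketch of why nothing is lost when `source` omits the sigil: the two fields together still reproduce the annotation text. The `annotationText` and `splitAnnotation` helpers are hypothetical, and the sketch assumes the sigil is always a single character.

```typescript
interface Reserved {
  type: "reserved";
  sigil: string; // assumed to be exactly one character
  source: string; // raw body, without the starting sigil
}

// Hypothetical serializer for the annotation part only (no braces, no operand).
function annotationText(expr: Reserved): string {
  return expr.sigil + expr.source;
}

// Hypothetical inverse: split raw annotation text back into the two fields.
function splitAnnotation(text: string): Reserved {
  return { type: "reserved", sigil: text.slice(0, 1), source: text.slice(1) };
}
```

A receiver of the serialized data model reads the sigil from its own field, so keeping it in `source` as well would only duplicate it.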

spec/data-model.md

macchiati · 2023-07-30T18:28:18Z

Implementations MUST NOT rely on the set of `sigil` values remaining constant,
as future versions of this specification MAY assign other meanings to such sigils.

Not a desirable situation, since conformant implementations become non-conformant. What other standards (incl. Unicode) have done to avoid this problem is to put symbols/ids into two buckets: 1. private-use, and 2. reserved for future versions of the spec. Clients that use PU then don't have to worry about being non-conformant to a future version of the specification.

On Sun, Jul 30, 2023 at 9:18 AM Addison Phillips commented on this pull request, in spec/data-model.md:

 The `source` is the "raw" value (i.e. escape sequences are not processed)
-and includes the starting `sigil`.
+and does not include the starting `sigil`.

My comment here isn't about sigils. It is that preserving the "raw" value without unescaping produces a risk of double-escaping (the receiver cannot tell if the escape sequence is intentional or not). What is the value of preserving the "raw" (escaped) representation of the annotation in the data model, given that the purpose of the escapes is either ambiguity in parsing MF2 or in the runtime environment?
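The double-escaping concern can be illustrated with a small sketch. It assumes, for illustration only, that annotation bodies escape `\`, `{`, `|`, and `}` with a leading backslash; `escapeBody` and `unescapeBody` are hypothetical helpers, not part of any specification text quoted here.

```typescript
// Assumption: "\", "{", "|" and "}" are escaped with a leading backslash.
function escapeBody(unescaped: string): string {
  return unescaped.replace(/[\\{|}]/g, (ch) => "\\" + ch);
}

function unescapeBody(raw: string): string {
  return raw.replace(/\\([\\{|}])/g, "$1");
}

// A raw (still escaped) body as it might appear in `source`: a\{b
const raw = "a\\{b";

// Correct round trip: unescape once, escape once.
const roundTrip = escapeBody(unescapeBody(raw)); // a\{b

// A receiver that wrongly treats `raw` as unescaped escapes it a second time.
const doubled = escapeBody(raw); // a\\\{b
```

Unescaping `doubled` gives back `a\{b` rather than `a{b`, so the intended content is changed unless the receiver knows which form it was handed.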

aphillips

Mostly editorial suggestions. Have a look.

spec/data-model/README.md

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

Clarify that Reserved may also represent private-use in data model

feae990

eemeli added the data model Issues related to the Interchange Data Model label Jul 29, 2023

aphillips requested changes Jul 29, 2023

Update spec/data-model.md

6a7317a

aphillips mentioned this pull request Jul 31, 2023

Reconsider using text for private-use and reserved #446

Closed

eemeli mentioned this pull request Aug 8, 2023

Add JSON Schema & XML DTD definitions of message data model #439

Merged

eemeli added 2 commits August 7, 2023 21:55

Merge branch 'main' into refresh-reserved

06c98cb

Rename Reserved as Unsupported in data model

d5dcc23

eemeli requested a review from aphillips August 8, 2023 13:31

aphillips requested changes Aug 8, 2023

spec/data-model/README.md (four review comments, outdated and resolved)

eemeli and others added 2 commits August 8, 2023 18:16

Apply suggestions from code review

472271d

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

Apply suggestions from code review

04302be

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>

eemeli requested a review from aphillips August 8, 2023 15:20

aphillips approved these changes Aug 8, 2023

aphillips merged commit 08685d1 into unicode-org:main Aug 8, 2023

eemeli deleted the refresh-reserved branch August 8, 2023 15:48
