Skip to content

Implementing namespacing #524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Nov 29, 2023
Merged

Implementing namespacing #524

merged 39 commits into from
Nov 29, 2023

Conversation

aphillips
Copy link
Member

@aphillips aphillips commented Nov 13, 2023

Initial implementation of namespacing design into the spec.

This version makes the following pair of options parallel to one another (i.e. they are not duplicates):

{$foo :function option=value ns:option=value2}

Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first pass, with some comments. Also, the list numbers are all messed up.

aphillips and others added 5 commits November 14, 2023 10:44
Co-authored-by: Eemeli Aro <eemeli@mozilla.com>
From a suggestion by @eemeli but accounting for the use of `name-part` in `variable`
@aphillips aphillips requested a review from eemeli November 14, 2023 19:50
spec/syntax.md Outdated
Otherwise, the set of characters allowed in names is large.

_Functions_ and _options_ MAY be preceded by a _namespace_ identifier
which is separated from the body of the _name_ by a U+003A COLON `:`.
Built-in _functions_ and _options_ do not have a _namespace_ identifier.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if an implementation extends a built-in function with additional options -- could those implementation-specific options be namespaced?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. In fact, we show this in the examples.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be clearer for this sentence to say something like "Built-in functions and standard options to built-in functions..."

I'm not sure if "standard" is the right word there, but I mean the options specified in the standard registry (if it's correct to call that the standard registry).

aphillips and others added 2 commits November 14, 2023 16:22
Co-authored-by: Tim Chevalier <tjc@igalia.com>
Co-authored-by: Tim Chevalier <tjc@igalia.com>
Comment on lines 255 to 257
- If _name_ contains a colon, let _namespace_ be the
substring of _name_ up to (but not including)
the colon and _name-part_ be the substring of _name_ following the colon;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if name contains more than one colon?

Suggested change
- If _name_ contains a colon, let _namespace_ be the
substring of _name_ up to (but not including)
the colon and _name-part_ be the substring of _name_ following the colon;
- If _name_ contains a colon, let _namespace_ be the
substring of _name_ up to (but not including)
the first colon and _name-part_ be the substring of _name_ following the first colon;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the ability of colon to appear in name. Note that this makes our names match NCName (which I cite in this revision). This helps reduce the "sorta like an" XML name thing.

Comment on lines 74 to 78
name = [namespace ":"] name-body
namespace = name-part
name-part = name-start *name-char
name-start = ALPHA / "_"
/ %xC0-D6 / %xD8-F6 / %xF8-2FF
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no definition for name-body, but this grammar is ambiguous if that is intended to be name-part—which ends in *name-char and therefore can itself include colons. For example, would parsing "foo:bar:baz" as a name be { namespace: "foo", name-part: "bar:baz" } or { namespace: "foo:bar", name-part: "baz" } or { name-part: "foo:bar:baz" }?

Copy link
Collaborator

@stasm stasm Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we need to fix the ambiguity here. I think #519 is the right place to do it, so that we have a chance to holistically look at the syntax and consider other productions. We don't need to block this PR.

My preference, however, would be to make this PR complete, i.e. fix the ambiguity somehow before merging. Two options for temporary solutions:

  • Use / as a temporary namespace separator.
  • Make namespace consist only of name-start characters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the changes to syntax.md mention NCName. @aphillips do you intend to factor NCName into message.abnf in this PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of sanity, the namespace separator character should not be a valid name-part character.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed colon (which I had intended in the first place 🤦🤷‍♂️🙈). As noted above, this makes us match XML names.

aphillips and others added 2 commits November 14, 2023 16:36
Co-authored-by: Tim Chevalier <tjc@igalia.com>
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>

Function access to the _formatting context_ MUST be minimal and read-only,
and execution time SHOULD be limited.
Implementation-defined _functions_ SHOULD use an implementation-defined _namespace_.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does "implementation-defined" include the implementations of the default registry's functions? Or do you mean other functions built into the implementation, but not present in the default registry?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of going back to the design document and adding some use cases to illustrate these. In this case, I mean "non-default registry functions". I will add text to clarify.


Suppose you're Mihai and you're creating the ICU4J implementation. The :datetime function in the default registry might be backed by com.ibm.icu.text.DateFormat as an implementing class. There's no need to prefix :datetime for that. That's just implementation.

Mihai might want to expose skeleton though. That's not (currently) a default registry option. So he'd prefix it as icu:skeleton in his implementation.

Suppose he then goes on to implement com.ibm.icu.text.DateIntervalFormat as a function. That's not in the default registry, so it gets :icu:dateinterval and its options might or might not be namespaced.

Suppose Mihai goes on to implement an SPI for custom functions. Suppose I write my own date formatter. It doesn't replace the one in :dateformat, so it gets a prefix like :app:dateformat. The standard options might not be prefixed, but my custom options would be. ICU's options would probably be unrecognized to me. So you could see expressions like:

{$now :datetime icu:skeleton=yMMMd}
{$now :icu:dateinterval end=$then skeleton=yMMM}
{$now :app:datetime dateStyle=short :app:wonderfulness=high}

Suppose you're a tool implementer working with messages written for Mihai's implementation with my add-on. When you see a namespace prefix, that tells you that the function or the option is non-standard. The prefix might tell you where to look for the add-on registry or at least be something that you can key locally. Tool consumption is a key reason for namespacing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding these use-cases in form of such user journeys sounds great. I'll be happy to contribute. Open a new PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be best to add a subsection to "Function Resolution" that collects these directions about what's possible and recommended for implementations and users to do with functions?

Suggested change
Implementation-defined _functions_ SHOULD use an implementation-defined _namespace_.

@aphillips
Copy link
Member Author

@stasm noted:

This is why functions should be able to see the namespaces of all their options.

I think implementation should make it possible for functions to see the namespaces. More specifically, there is probably glue code between the actual function and MessageFormat. That code can find out what its namespace is and select options based on that.

Is the argument here that I should change the algorithm to pack up the namespace on the options? We'd lose the duplicate checking between one:foo and two:foo (and foo) and make it the problem of the implementation to solve?

This makes `name` equivalent to xml-names's NCName (just like the spec text says)
@stasm
Copy link
Collaborator

stasm commented Nov 15, 2023

Is the argument here that I should change the algorithm to pack up the namespace on the options? We'd lose the duplicate checking between one:foo and two:foo (and foo) and make it the problem of the implementation to solve?

I think so, yes. foo and one:foo should be different at the moment of the function's invocation, just in case one:foo comes first time-wise, and then foo is added to the upstream implementation of the function. I think generalizing this to all options means that one:foo and two:foo should also be different.

aphillips and others added 2 commits November 27, 2023 09:18
Co-authored-by: Eemeli Aro <eemeli@mozilla.com>
@aphillips
Copy link
Member Author

Per teleconference, we're ready to merge this---except...

We would like unquoted to be able to start with - (as in -40). Currently this can't happen because the ABNF is still using - as a sigil for spannables in function. What I'd like to do is:

function = (":" / "#" / "/") identifier
;...
unquoted-start = name-start / DIGIT / "-" / "."

But we need to agree to changing the function sigils in order to do that (or at least removing the - sigil)

Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's all groovy now.

@aphillips
Copy link
Member Author

@gibson042 ... over to you 😉

Copy link
Collaborator

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I object to the optionality of an Unknown Function error for a function with unrecognized namespace or a function that does not exist in the claimed namespace. Everything else looks good, modulo a few editorial suggestions.

aphillips and others added 4 commits November 28, 2023 12:25
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Co-authored-by: Eemeli Aro <eemeli@mozilla.com>
Copy link
Collaborator

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking as "Request changes" just to make sure this doesn't get merged without #524 (comment) being resolved; otherwise this is still fine.

Co-authored-by: Eemeli Aro <eemeli@mozilla.com>
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

Copy link
Collaborator

@stasm stasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Do we want namespaces to be a separate concept in the data model? Or should the data model just store the stringified identifier, i.e. ns:name?

@eemeli
Copy link
Collaborator

eemeli commented Nov 29, 2023

Do we want namespaces to be a separate concept in the data model? Or should the data model just store the stringified identifier, i.e. ns:name?

I would prefer not adding namespaces to the data model, and continuing there with a single-string name.

@aphillips
Copy link
Member Author

Per previous agreement, merging this now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants