Implement code mode introducer #529

Merged: 12 commits, Dec 1, 2023
66 changes: 38 additions & 28 deletions spec/message.abnf
@@ -1,43 +1,46 @@
message = pattern / complex-message
message = simple-message / complex-message

complex-message = "{{" [s] *(declaration [s]) body [s] "}}"
simple-message = [simple-start pattern]
Member

the [/] is super-subtle and may be worth a comment. These are there to make the empty string a valid message.

Collaborator Author

I agree, let's make sure this is called out in syntax.md.
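
To illustrate why the optional group in `simple-message = [simple-start pattern]` admits the empty string, here is a minimal Python sketch. It is an illustration, not the working group's implementation: the function names are hypothetical, the code-point ranges are copied from the `simple-start-char` and `text-char` rules in this diff, and `text-escape` and `expression` are deliberately not handled, so any input containing `\`, `{`, or `}` is rejected.

```python
def is_simple_start_char(ch: str) -> bool:
    """simple-start-char: any text character except "."."""
    cp = ord(ch)
    return (
        0x00 <= cp <= 0x2D            # omit "."
        or 0x2F <= cp <= 0x5B         # omit "\"
        or 0x5D <= cp <= 0x7A         # omit "{"
        or cp == 0x7C                 # "|"; omit "}"
        or 0x7E <= cp <= 0xD7FF       # omit surrogates
        or 0xE000 <= cp <= 0x10FFFF
    )


def is_text_char(ch: str) -> bool:
    """text-char = simple-start-char / "." """
    return ch == "." or is_simple_start_char(ch)


def is_plain_simple_message(src: str) -> bool:
    """Text-only approximation of simple-message = [simple-start pattern]."""
    if src == "":
        return True  # the optional group matches nothing at all
    # The first character must not be ".", which would start a keyword.
    return is_simple_start_char(src[0]) and all(is_text_char(c) for c in src[1:])
```

With this sketch, `is_plain_simple_message("")` and `is_plain_simple_message("Hello, world.")` are both true, while `is_plain_simple_message(".local")` is false because a leading `.` is reserved for complex messages.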

simple-start = simple-start-char / text-escape / expression
pattern = *(text-char / text-escape / expression)

declaration = input-declaration / local-declaration
complex-message = *(declaration [s]) complex-body
declaration = input-declaration / local-declaration / reserved-statement
input-declaration = input [s] variable-expression
local-declaration = local s variable [s] "=" [s] expression

body = quoted-pattern
/ (selectors 1*([s] variant))

complex-body = quoted-pattern
/ ((selectors / reserved-statement) 1*([s] variant))
Comment on lines +12 to +13
Collaborator

It may prove valuable to establish a greater separation between selectors and reserved-statement here.

Suggested change
complex-body = quoted-pattern
/ ((selectors / reserved-statement) 1*([s] variant))
complex-body = quoted-pattern
/ (selectors 1*([s] variant))
/ (reserved-statement 1*([s] variant))

Collaborator Author

Could you say why? The two expressions are wholly synonymous, and I don't really see why we should prefer one over the other.

Collaborator

Because I expect the latter to diverge—I don't think we should treat .match {…}… and .<reserved> … as interchangeable prefixes before a list of variants, but rather .match {…}… <variant>… as one thing complete in itself and .<reserved> … as something independent.

Collaborator

What's the motivation for putting reserved-statement inside complex-body? Do we expect other multivariant constructs than match? I think this may be building too much flexibility into the spec. Could we instead agree that any future keywords would go with other declarations?

Collaborator

Ah, I now see that @gibson042 mentions this as one of the issues to work on as follow-ups in his review. How about we land this PR without reserved-statement in complex-body and then consider adding it, rather than landing it with doubts?

Collaborator Author

Tbh, I would find it easier to include it, show what impact it has across the stack, and then evaluate whether to drop it.

We have already agreed on having two follow-up PRs to this that will both re-tread this ground. Could we agree that this will be considered more in both of those, and accept having it in the ABNF for now?

Collaborator

> Tbh, I would find it easier to include it, show what impact it has across the stack, and then evaluate whether to drop it.

We should do things exactly the other way.

I also don't appreciate pushing hard for merging. We should merge when things look good to merge.

quoted-pattern = "{{" pattern "}}"
pattern = *(text / expression)

selectors = match 1*([s] expression)
variant = when 1*(s key) [s] quoted-pattern
variant = key *(s key) [s] quoted-pattern
key = literal / "*"

expression = literal-expression / variable-expression / function-expression
literal-expression = "{" [s] literal [s annotation] [s] "}"
variable-expression = "{" [s] variable [s annotation] [s] "}"
function-expression = "{" [s] annotation [s] "}"
annotation = (function *(s option)) / reserved / private-use
annotation = (function *(s option))
/ reserved-annotation
/ private-use-annotation

literal = quoted / unquoted
variable = "$" name
function = (":" / "+" / "-") identifier
option = identifier [s] "=" [s] (literal / variable)

; reserved keywords are always lowercase
input = %s"input"
local = %s"local"
match = %s"match"
when = %s"when"
input = %s".input"
local = %s".local"
match = %s".match"

text = 1*(text-char / text-escape)
text-char = %x0-5B ; omit \
/ %x5D-7A ; omit {
/ %x7C ; omit }
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF
simple-start-char = %x0-2D ; omit .
Collaborator

Nit: it looks like we don't append -char to other -start productions in our grammar.

Suggested change
simple-start-char = %x0-2D ; omit .
simple-start = %x0-2D ; omit .

Collaborator Author

Note that there already is another rule that's called simple-start. If necessary, could we take care of any such editorial rule renames in a separate follow-up PR?

Collaborator

Ah, I see. I'll open a new PR for this.

/ %x2F-5B ; omit \
/ %x5D-7A ; omit {
/ %x7C ; omit }
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF
text-char = simple-start-char / "."

quoted = "|" *(quoted-char / quoted-escape) "|"
quoted-char = %x0-5B ; omit \
@@ -49,16 +52,19 @@ unquoted = unquoted-start *(name-char / ":")
unquoted-start = name-start / DIGIT / "."
/ %xB7 / %x300-36F / %x203F-2040

; reserve sigils for private-use by implementations
private-use = private-start reserved-body
private-start = "^" / "&"
; Reserve additional .keywords for use by future versions of this specification.
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression)
Collaborator Author

Replying to @gibson042 from #529 (comment):

> > Please correct me if I misremember, but my understanding from today is that we're fine with reserving only statements that end in an expression?
>
> That is not my recollection, and would preclude possibilities like .strict true. However, I'm fine with addressing it in a followup.

Could we merge this PR with this note in the ABNF, in expectation of getting a PR from @gibson042 that could update all reserved content holistically?

Suggested change
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression)
; NOTE: Rules for reserved statements and annotations
; should be considered provisional, pending a near-future PR updating them.
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression)

ps. I had not considered statements like .strict true. That does sound like something we ought to reserve space for.

Member

The reason statements end with expression is that the closing } of the expression is how we know the statement ended. Otherwise we're in reserved-body... forever. It is a hidden detail of our syntax that statements end with a close bracket.

Collaborator Author

While that's true with the current state of this PR, I believe @gibson042 would like us to drop . as a valid character in reserved-body, which would allow determining its end by encountering it at the start of the next keyword.

My request here is for us to find a way to not block this PR on that discussion, and to have it instead separately so that the syntax of both reserved statements and annotations could be considered in one package.

Member

I agree with finalizing this PR. I don't think the reserved-statement discussion (were we to change something) needs to be in the way of that. It should be a separate conversation. If we want to pursue a change, it should be at least an issue if not a design document.

That said, I would not include the proposed note in this comment thread. Any changes we consider or make will be done in the near future. If we include text about it, put it in syntax.md.
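
The point made in this thread, that a reserved statement is known to have ended once the closing `}` of its final expression is reached, can be sketched as a small scanner. This is a rough illustration under stated assumptions, not the reference parser: whitespace is taken to be SP, HTAB, CR and LF; a backslash is treated as escaping whatever character follows it; and all function names are hypothetical.

```python
WHITESPACE = " \t\r\n"  # assumed definition of the `s` rule


def skip_quoted(src: str, i: int) -> int:
    """Return the index just past the |quoted| literal opening at src[i]."""
    i += 1
    while i < len(src):
        if src[i] == "\\":      # escape: skip the escaped character too
            i += 2
        elif src[i] == "|":
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated quoted literal")


def skip_expression(src: str, i: int) -> int:
    """Return the index just past the {expression} opening at src[i]."""
    i += 1
    while i < len(src):
        if src[i] == "\\":
            i += 2
        elif src[i] == "|":     # a "}" inside a quoted literal does not close
            i = skip_quoted(src, i)
        elif src[i] == "}":
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated expression")


def reserved_statement_end(src: str, start: int = 0) -> int:
    """Return the index just past the reserved statement beginning at `start`."""
    if not src.startswith(".", start):
        raise ValueError("a reserved statement starts with '.'")
    i = start + 1
    # reserved-keyword and reserved-body: run until the first expression opens.
    while i < len(src) and src[i] != "{":
        if src[i] == "\\":
            i += 2
        elif src[i] == "|":
            i = skip_quoted(src, i)
        else:
            i += 1
    if i >= len(src):
        raise ValueError("no terminating expression found")
    # 1*([s] expression): consume consecutive expressions.
    while True:
        end = skip_expression(src, i)
        j = end
        while j < len(src) and src[j] in WHITESPACE:
            j += 1
        # "{{" opens a quoted-pattern, not another expression.
        if j < len(src) and src[j] == "{" and not src.startswith("{{", j):
            i = j
        else:
            return end


src = ".new {$a} .hotness {$b}"
print(src[:reserved_statement_end(src)])  # prints ".new {$a}"
```

Without the trailing expression there is nothing for the scanner to stop on, which is why a form such as `.strict true` cannot be delimited under the rule as written in this PR.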

; Note that the following expression is a simplification,
; as this rule MUST NOT be considered to match existing keywords
; (`.input`, `.local`, and `.match`).
reserved-keyword = "." name

; reserve additional sigils for use by
; future versions of this specification
reserved = reserved-start reserved-body
reserved-start = "!" / "@" / "#" / "%" / "*" / "<" / ">" / "/" / "?" / "~"
reserved-body = *( [s] 1*(reserved-char / reserved-escape / quoted))
; Reserve additional sigils for use by future versions of this specification.
reserved-annotation = reserved-annotation-start reserved-body
reserved-annotation-start = "!" / "@" / "#" / "%" / "*"
/ "<" / ">" / "/" / "?" / "~"

reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted))
Collaborator

There should be separate productions for reserved statement bodies and reserved annotation bodies, so the body of reserved/private-use annotations better aligns with that of function annotations.

Suggested change
reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted))
; Each space-separated part is an arbitrary mixture of text and expressions,
; optionally followed by a quoted literal (such that no quoted literal may be
; directly followed by text or an expression, prohibiting e.g. `|a|b` and `|a||b|`).
reserved-annotation-body = *(s reserved-part)
reserved-part = 1*(reserved-char / reserved-escape / expression) [quoted]
/ quoted

reserved-char = %x00-08 ; omit HTAB and LF
/ %x0B-0C ; omit CR
/ %x0E-19 ; omit SP
Comment on lines 68 to 70
Collaborator

@gibson042, Nov 28, 2023

reserved-char should omit . so e.g. .new {$a} .hotness {$b} is recognized as two distinct reserved statements and .prefix {$foo} .local $var = {|val|} is recognized as a reserved statement followed by a local declaration, while still allowing reserved statements to include non-consecutive expressions (like local-declaration = local s variable [s] "=" [s] expression already does).

Suggested change
reserved-char = %x00-08 ; omit HTAB and LF
/ %x0B-0C ; omit CR
/ %x0E-19 ; omit SP
reserved-char = %x00-08 ; omit HTAB and LF
/ %x0B-0C ; omit CR
/ %x0E-19 ; omit SP
/ %x21-2D ; omit .
/ %x2F-5B ; omit \

And accordingly, reserved-escape should be generalized to e.g. reserved-statement-escape = backslash ( %x00-D7FF / %xE000-10FFFF ).

Collaborator Author

Note that this precludes syntax like that used now in local-declaration, where we have a variable name at the top level, and name can include a .. Is that intentional?

Collaborator

No, it will need further tweaking to match . only after name-start or name-char, as inside name.
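
As a rough illustration of the suggestion in this thread (not of the grammar as merged), the following Python sketch splits a run of declarations wherever a top-level `.` begins a new keyword. Everything here is an assumption made for the example: the function names are hypothetical, whitespace is SP/HTAB/CR/LF, a backslash escapes the next character, and the "further tweaking" mentioned above is approximated by treating `.` as a keyword start only at the beginning of input, after whitespace, or after `}`, so a `.` inside a name such as `$a.b` is left alone. The input is assumed to contain declarations only, starting with a keyword and with no `{{…}}` pattern body.

```python
WHITESPACE = " \t\r\n"  # assumed definition of the `s` rule


def skip_delimited(src: str, i: int, close: str) -> int:
    """Return the index just past the construct opened at src[i] and closed by `close`."""
    i += 1
    while i < len(src):
        if src[i] == "\\":                      # escape: skip the escaped character
            i += 2
        elif close == "}" and src[i] == "|":    # quoted literal inside an expression
            i = skip_delimited(src, i, "|")
        elif src[i] == close:
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated construct")


def split_statements(src: str) -> list:
    """Split declarations at each top-level "." that starts a keyword."""
    starts = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch == "\\":
            i += 2
        elif ch == "|":
            i = skip_delimited(src, i, "|")
        elif ch == "{":
            i = skip_delimited(src, i, "}")
        else:
            # Assumed refinement: "." inside a name does not start a keyword.
            if ch == "." and (i == 0 or src[i - 1] in WHITESPACE or src[i - 1] == "}"):
                starts.append(i)
            i += 1
    bounds = starts + [len(src)]
    return [src[a:b].strip() for a, b in zip(bounds, bounds[1:])]


print(split_statements(".prefix {$foo} .local $var = {|val|}"))
# prints ['.prefix {$foo}', '.local $var = {|val|}']
```

Because the split point is the next keyword rather than the last closing brace, this approach would also leave room for statements that do not end in an expression.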

@@ -67,6 +73,10 @@ reserved-char = %x00-08 ; omit HTAB and LF
/ %x7E-D7FF ; omit surrogates
/ %xE000-10FFFF

; Reserve sigils for private-use by implementations.
private-use-annotation = private-start reserved-body
private-start = "^" / "&"

; identifier matches https://www.w3.org/TR/REC-xml-names/#NT-QName
; name matches https://www.w3.org/TR/REC-xml-names/#NT-NCName
identifier = [namespace ":"] name