-
-
Notifications
You must be signed in to change notification settings - Fork 36
Implement code more introducer #529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I was hoping to land the namespace changes before embarking on this, which is why I didn't do it yesterday. But worth resolving stuff before we change the ABNF for reals. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the ABNF hard to understand.
Future keywords will be lowercase ASCII, so require an unquoted pattern starting with .
to be escaped or quoted.
Based on feedback from @aphillips and @gibson042, I changed it so a leading |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once the grammar is stable, don't forget to update syntax.md as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making progress. Some of my suggestions take a different route to the same destination for reasons I call out. Pay attention to the whitespace issue.
|
||
complex-message = "{{" [s] *(declaration [s]) body [s] "}}" | ||
simple-message = [simple-start pattern] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the [
/]
is super-subtle and may be worth a comment. These are there to make the empty string a valid message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, let's make sure this is called out in syntax.md
.
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Updated again. I dropped I'd also strongly prefer not to add comments to the |
spec/message.abnf
Outdated
; Note that the following expression is a simplification, | ||
; as this rule MUST NOT be considered to match existing keywords: | ||
; (input / local / match) (s / "{") | ||
reserved-statement = "." reserved-body 1*expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think requiring reserved-statement
to end with at least one expression
matches too little.
reserved-statement = "." reserved-body 1*expression | |
; A reserved statement continues until the next statement or quoted-pattern, | |
; and may include any number of expressions and/or quoted literals | |
reserved-statement = "." *([s] reserved-statement-part) | |
reserved-statement-part = 1*(reserved-statement-char / reserved-statement-escape) | |
/ expression | |
/ quoted | |
reserved-statement-char = %x00-08 ; omit `s`: HTAB and LF | |
/ %x0B-0C ; omit `s`: CR | |
/ %x0E-19 ; omit `s`: SP | |
/ %x21-2D ; omit . | |
/ %x2F-5B ; omit \ | |
/ %x5D-7A ; omit { | } | |
/ %x7E-D7FF ; omit surrogates | |
/ %xE000-10FFFF | |
reserved-statement-escape = backslash ( %x00-D7FF / %xE000-10FFFF ) |
Note that this locks in the expression
syntax for effectively forever because a left brace could start a {{…}}
quoted expression which would terminate the reserved statement—though possible to work around that too; it's rather impractical. And likewise for quoted
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The .
exclusion also disallows using name
within a future statement, so e.g. introducing local-declaration
would not be possible if we didn't include it already.
On the other hand, if each declaration
had to end in a required whitespace s
, we would only need to bar the sequence (s ".")
from what's reserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gibson042 Please correct me if I misremember, but my understanding from today is that we're fine with reserving only statements that end in an expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gibson042 Please correct me if I misremember, but my understanding from today is that we're fine with reserving only statements that end in an expression?
That is not my recollection, and would preclude possibilities like .strict true
. However, I'm fine with addressing it in a followup.
@aphillips I'd like to discuss the open threads here on today's call, if possible:
|
I've now resolved the discussions as agreed, and dropped the During the call, we planned to merge this provided that it receives some approvals, and for a separate PR to apply the further spec changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
૮꒰ ˶• ༝ •˶꒱ა
Sheep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR still has a few issues that IMO should be fixed before it lands.
complex-body = quoted-pattern | ||
/ ((selectors / reserved-statement) 1*([s] variant)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may prove valuable to establish a greater separation between selectors
and reserved-statement
here.
complex-body = quoted-pattern | |
/ ((selectors / reserved-statement) 1*([s] variant)) | |
complex-body = quoted-pattern | |
/ (selectors 1*([s] variant)) | |
/ (reserved-statement 1*([s] variant)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you say why? The two expressions are wholly synonymous, and I don't really see why we should prefer one over the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because I expect the latter to diverge—I don't think we should treat .match {…}…
and .<reserved> …
as interchangeable prefixes before a list of variants, but rather .match {…}… <variant>…
as one thing complete in itself and .<reserved> …
as something independent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motiviation for putting reserved-statement
inside complex-body
? Do we expect other multivariant constructs than match
? I think this may be building too much flexibility into the spec. Could we instead agree that any future keywords would go with other declarations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I now see that @gibson042 mentions this as one of the issues to work on as follow-ups in his review. How about we land this PR without reserved-statement
in complex-body
and then consider adding it, rather than landing it with doubts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh, I would find it easier to include it, show what impact it has across the stack, and then evaluate whether to drop it.
We have already agreed on having two follow up PRs to this that will both re-tread this ground. Could we agree that this will be considered more in both of those, and accept having it in the ABNF for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh, I would find it easier to include it, show what impact it has across the stack, and then evaluate whether to drop it.
We should do things exactly the other way.
I also don't appreciate pushing hard for merging. We should merge when things look good to merge.
spec/message.abnf
Outdated
; Note that the following expression is a simplification, | ||
; as this rule MUST NOT be considered to match existing keywords: | ||
; (input / local / match) (s / "{") | ||
reserved-statement = "." reserved-body 1*expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gibson042 Please correct me if I misremember, but my understanding from today is that we're fine with reserving only statements that end in an expression?
That is not my recollection, and would preclude possibilities like .strict true
. However, I'm fine with addressing it in a followup.
spec/message.abnf
Outdated
private-use = private-start reserved-body | ||
private-start = "^" / "&" | ||
; Reserve additional .keywords for use by future versions of this specification. | ||
reserved-statement = reserved-keyword [s [reserved-body [s]]] 1*expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails to allow whitespace between the terminal expressions, and more importantly suffers from the reserved-body
problem mentioned below.
reserved-statement = reserved-keyword [s [reserved-body [s]]] 1*expression | |
; A reserved statement is a reserved keyword followed by an | |
; optional sequence of space-separated parts and | |
; a final sequence of one or more expressions. | |
reserved-statement = reserved-keyword [*(s reserved-part) s reserved-statement-last] 1*([s] expression) | |
reserved-part = 1*(reserved-char / reserved-escape / expression) [quoted] | |
/ quoted | |
reserved-statement-last = *(reserved-char / reserved-escape / expression) (reserved-char / reserved-escape) | |
/ 1*(reserved-char / reserved-escape / expression) quoted | |
/ quoted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular benefit to parsing out the spaces within reserved content? Originally in #374 (comment) we took them out of reserved-char
to effectively trim the reserved expression contents within the ABNF.
Should we perhaps revisit that decision, and simplify some of this by not splitting the reserved text into parts?
In the meantime, good point about the missing [s]
between expressions. If we leave the reserved cleanup to later, this should work for the current line:
reserved-statement = reserved-keyword [s [reserved-body [s]]] 1*expression | |
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular benefit to parsing out the spaces within reserved content? Originally in #374 (comment) we took them out of
reserved-char
to effectively trim the reserved expression contents within the ABNF.
I think there is; it allows us to more accurately capture adjacency, especially when it comes to things like consecutive declarations (ok but should remain separate) and directly adjacent quoted literals (should be rejected IMO) and—as you mentioned in another comment—dots, which should be accepted as part of a name but not when they appear to start a new statement.
|
||
reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be separate productions for reserved statement bodies and reserved annotation bodies, so the body of reserved/private-use annotations better aligns with that of function annotations.
reserved-body = *([s] 1*(reserved-char / reserved-escape / quoted)) | |
; Each space-separated part is an arbitrary mixture of text and expressions, | |
; optionally followed by a quoted literal (such that no quoted literal may be | |
; directly followed by text or an expression, prohibiting e.g. `|a|b` and `|a||b|`). | |
reserved-annotation-body = *(s reserved-part) | |
reserved-part = 1*(reserved-char / reserved-escape / expression) [quoted] | |
/ quoted |
reserved-char = %x00-08 ; omit HTAB and LF | ||
/ %x0B-0C ; omit CR | ||
/ %x0E-19 ; omit SP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reserved-char
should omit .
so e.g. .new {$a} .hotness {$b}
is recognized as two distinct reserved statements and .prefix {$foo} .local $var = {|val|}
is recognized as a reserved statement followed by a local declaration, while still allowing reserved statements to include non-consecutive expressions (like .local-declaration = local s variable [s] "=" [s] expression
already does)
reserved-char = %x00-08 ; omit HTAB and LF | |
/ %x0B-0C ; omit CR | |
/ %x0E-19 ; omit SP | |
reserved-char = %x00-08 ; omit HTAB and LF | |
/ %x0B-0C ; omit CR | |
/ %x0E-19 ; omit SP | |
/ %x21-2D ; omit . | |
/ %x2F-5B ; omit \ |
And accordingly, reserved-escape
should be generalized to e.g. reserved-statement-escape = backslash ( %x00-D7FF / %xE000-10FFFF )
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this precludes syntax like that used now in local-declaration
, where we have a variable
name
at the top level, and name
can include a .
. Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it will need further tweaking to match .
only after name-start
or name-char
, as inside name
.
Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
private-use = private-start reserved-body | ||
private-start = "^" / "&" | ||
; Reserve additional .keywords for use by future versions of this specification. | ||
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replying to @gibson042 from #529 (comment):
Please correct me if I misremember, but my understanding from today is that we're fine with reserving only statements that end in an expression?
That is not my recollection, and would preclude possibilities like
.strict true
. However, I'm fine with addressing it in a followup.
Could we merge this PR with this note in the ABNF, in expectation of getting a PR from @gibson042 that could update all reserved content holistically?
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression) | |
; NOTE: Rules for reserved statements and annotations | |
; should be considered provisional, pending a near-future PR updating them. | |
reserved-statement = reserved-keyword [s reserved-body] 1*([s] expression) |
ps. I had not considered statements like .strict true
. That does sound like something we ought to reserve space for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason statements end with expression
is that the closing }
of the expression is how we know the statement ended. Otherwise we're in reserved-body
... forever. It is a hidden detail of our syntax that statements end with a close bracket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While that's true with the current state of this PR, I believe @gibson042 would like us to drop .
as a valid character in reserved-body
, which would allow determining its end by encountering it at the start of the next keyword.
My request here is for us to find a way to not block this PR on that discussion, and to have it instead separately so that the syntax of both reserved statements and annotations could be considered in one package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with finalizing this PR. I don't think the reserved-statement discussion (were we to change something) needs to be in the way of that. It should be a separate conversation. If we want to pursue a change, it should be at least an issue if not a design document.
That said, I would not include the proposed note in this comment thread. Any changes we consider or make will be done in the near future. If we include text about it, put it in syntax.md.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have issues with syntactic reservations in this PR:
- use of
reserved-statement
as interchangeable with a.match
head rather than a full substitute for.match
with its variants - requiring
reserved-statement
to end with a list of expressions, precluding e.g..strict true
- ...and greedy consumption fails to capture the intuition of multiple statements in e.g.
.strict true .local $var = {|val|}
, instead parsing that as a single reserved statement subsuming the.local
- ...and greedy consumption fails to capture the intuition of multiple statements in e.g.
- reusing the
reserved-annotation
reserved-body
forreserved-statement
, precluding e.g..something opt1={$var} opt2={$var}
because of the intent to prohibit e.g. nested expressions in annotations like{@reserved {|…|}}
But that said, it is coherent and can be used as a foundation for further iteration.
complex-body = quoted-pattern | ||
/ ((selectors / reserved-statement) 1*([s] variant)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motiviation for putting reserved-statement
inside complex-body
? Do we expect other multivariant constructs than match
? I think this may be building too much flexibility into the spec. Could we instead agree that any future keywords would go with other declarations?
/ %x7C ; omit } | ||
/ %x7E-D7FF ; omit surrogates | ||
/ %xE000-10FFFF | ||
simple-start-char = %x0-2D ; omit . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: it looks like we don't append -char
to other -start
productions in our grammar.
simple-start-char = %x0-2D ; omit . | |
simple-start = %x0-2D ; omit . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there already is another rule that's called simple-start
. If necessary, could we take care of any such editorial rule renames in a separate follow-up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. I'll open a new PR for this.
@stasm asked:
I think it was me who asked for Perhaps this should be: complex-body = quoted-pattern
/ (selectors 1*([s] variant))
/ reserved-statement ; which matches some-newly-unreserved-statement 1*([s] variant)
; for some future statement |
I discussed with @eemeli in Slack and we agreed to merge this as-is, and iterate further starting from @gibson042's feedback in his reivew above: #529 (review). |
Merging this to allow further progress. Outstanding discussions on reserved-statements and other discussions will be tracked in separate issues and/or PRs. |
#548) * Update the rest of the spec to match the ABNF after #529 * Apply suggestions from code review Co-authored-by: Richard Gibson <richard.gibson@gmail.com> * Remove reserved-statement from matcher * Update spec/syntax.md --------- Co-authored-by: Richard Gibson <richard.gibson@gmail.com> Co-authored-by: Addison Phillips <addison@unicode.org>
This starts implementing the changes for the code mode introducer (#521, #526), including dropping
when
.Before I get any further, it might be best for us to align on the exact ABNF changes. See line comments for a few specific questions.