Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add error handling #320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add error handling #320
Changes from all commits
1a27ccd
4389ae5
55890b9
99ce4a1
da1ecaf
4d8c753
2e1f943
38e33ad
8e12260
22866fd
d46856f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick / editorial:
How do you feel about having a small list with titles only, almost like a TOC
Followed by the details, with examples.
It would help with the "big picture", vs something spread over many paragraphs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how this started out, before the examples got added...
Let's revisit this when we've more of the spec gathered up? The solution for this section is likely to look very similar to other sections as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you clarify this one? Is it illegal to assign the empty string or null to a variable? I think it might be a feature in some situations...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the
let
line is fine; it's there just to indicate that the source isn't completely empty. The failure here is from there being nothing after thelet
, i.e. no Pattern or Selector+Variants.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The empty pattern is not an error, though, right? It's not a very useful message, but it's not an error, is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's an error because it's missing the
{}
wrappers. Something like this would be a valid empty message:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're saying that the empty string is not a valid pattern and results in an error? That is, the minimum a pattern string can ever be is
{}
? That makes me uncomfortable....There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what the current EBNF states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick:
VariantKey
Part of the editorial cleanup I'm proposing, that would add back-ticks around keywords, and would make sure this is consistent with the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer leaving this as "Variant Key" for now, as it's rather likely for this to leak out as a user-visible string in the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having too few or too many keys (compared to the match expression) is clearly a syntax error, and should be in that category. It's like calling a function in C or Java with too few or too many parameters; not a semantic or data model error, but rather one that can be detected purely because of malformed syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent in differentiating this as a "data model" rather than "syntax" error is that it allows us to apply the same error check independently of the syntax used by the message. Here's what I said about this earlier in #320 (comment):
An MF2 syntax parser would be expected to emit the data model errors during its work, as this is included in the spec (line 176):
This separation also matches the differentiation between "well-formed" and "valid" messages in the syntax spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add text to say:
(and on line 82 add "Missing
when * *
case:")There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the necessity for these specifiers. Are these specific examples more difficult to respond about than the others, or should we specify explicitly what's wrong in each of them?
Regarding the examples in general, I at least prefer requiring the reader to at least momentarily pause on some of them to figure out for themselves what's going on, rather than making sure that no thought is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unclear what this means.
Can we say "function resolution"?
There are very few things that can go wrong:
We already have unresolved as a class of errors a bit below.
So, what else can go wrong in the (already parsed) parts above?
I think only function names?
So what about we call this section "Function resolution"?
Instead of "resolution errors", where "resolution" is not explained in the spec, and we don't even agree it is needed. We might agree that it is an implementation detail. So there is no need to mention it in the spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference between the "resolution" and "formatting" errors that's proposed here is perhaps best seen by considering how a Placeholder containing an Expression is handled. Let's say that we have something like
or
where
:message
and/or:global
is a custom Function that uses its argument to look up a value from elsewhere, and that this value is then formatted as a part of the final message.During the formatting of this, the custom code could then emit two different sorts of errors:
foo
message is available or if theuser_age
global is not set.foo
message includes a variable reference that can't be resolved, or theuser_age
value turns out to have some unexpected shape.Would you agree that these are different error categories, and that this categorical split could lead to different error handling in user code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If
user_age
isn't set, isn't that an unresolved variable error?If no
foo
message is available, that would be an internal error of the format functionmessage
, right? Does that mean "resolution error" is really "function internal error"?In any case, I think I would move this item below some of the others here (perhaps to the bottom of the list), since I find myself thinking that this could also mean unresolved or formatting error when in fact this error would probably only occur later (only when the pattern is syntactically correct and all of the variables and functions have been resolved but there is still a problem).
Or am I still not understanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both
foo
anduser_age
are literal values here, so the errors are coming from the:message
and:global
custom functions; from the PoV of the core implementation, the are no variables here to resolve.The intent here is to allow for a custom formatting function to emit two different kinds of errors: Resolution and Formatting. This is meant to enable something like
:message
or:global
to work from an end-user PoV as much as possible like core features such as$var
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I don't think what I am saying here affects the spec, but the answer above.
I think it is contradicting:
It is unclear what is the difference between the two: "get the value" and "resolve variable reference"
The "publicly visible" operation is probably "I have a variable named
foo
, I want to get the value"There is no "reference" except deep in the implementation.
But that implementation detail should not "leak" in the kind of error I am getting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please note that the syntax example for the above discussion is this placeholder:
The
foo
here is not a variable reference, it's a literal value that a custom:message
function is interpreting as a message identifier.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one bullet (Selection errors) with only one sub-bullet (Selector errors)
Is there a difference? Are there any other kind of "Selection errors" other then "Selector errors"
Otherwise feels redundant, like having:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave the selection + selector error categories as is for now, but adjust them later after #322 is resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal: add a heading above it?
Probably ### level.
Because we go from the list of error types (what we produce), with examples for each, to a section explaining how / when to produce said errors.
It probably made sense when the list errors was a real list, visible at a glance.
But now it gets lost after the more detailed form.
If you think it helps also add a ### level heading before error types.
Proposal:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above: This phrasing doesn't explicitly mandate that errors should chain (or nest, depending how you look at it). I.e a resolution error can lead to a formatting error. Implementations should emit both to enable higher-level abstractions to react differently to these different failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this lead to unresolved variable errors later that would otherwise be avoided? See my comment about allowing the empty variable above. Wouldn't it be better to have just the one error for:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's tricky, because determining whether accessing
{$var}
here causes a second error would require us to define what the value of{(horse) :unknown_function}
is. But maybe we do need to define that a little bit, because we should make it clear whether this message ought to get formatted asor as
I'll add a statement clarifying that it should be the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... because the first error (
:unknown_function
) assigned the fallback string to$var
and the second line just replaces{$var}
with that, right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of, but not quite. It's probably easiest to think of the value of
$var
here as an implementation-defined blob that is identifiable as a fallback value and which stringifies as(horse)
. Then, when it's formatted in the placeholder{$var}
, it comes out as the string{(horse)}
.This indirection allows for us to consider the fallback value and the target of the formatting separately from each other, and more easily support non-string formatting targets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better for each error to define the fallback representation rather than trying to collect them here? For example "The fallback string representation of an unknown function error is ..." in the section on that type of error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer defining the fallback representation according to the Placeholder for which it applies, rather than the error type. This makes them more predictable and intentionally restricts implementations from trying to be too smart about the behaviour.
Consider for instance something like
{$user :get key=name opt=$var}
, which is intended to get the user name with someopt
qualifier. If the$var
may be controlled by an attacker, it may lead to the:get
formatting to fail, which means that a fallback representation must be used. If an implementation could define some custom error for it, a "smart" solution could be to try and stringify$user
, which could lead to some sensitive information leaking out in an unexpected manner. Instead, with the explicit definition here it must use the string{$user}
, which is sufficiently opaque to avoid this failure mode.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I meant something different? I meant that instead of defining fallbacks in a separate section of the spec which then refers back to syntax and data model error types, it would potentially better to define error behavior inline with each error type.
I am not suggesting that we encourage/permit different implementations to make up their own fallback behavior or to define new error types with (suspect) fallback behavior.
One reason for my thinking is the example in the comment above:
If I'm writing an implementation, I probably have an assignment operator bit of code for the
let (x) = {expr}
. One likely way to implement that would be to make the argument list a Map keyed with strings, into which the keyvar
is put. When I do this, I'm not evaluating the rest of the pattern--to know how or even if the value ofvar
is used. But I do need to assignvar
a value in thelet
expression. Defining the output of the "unknown function" error gives me clear guidance. Does that make sense? (This also helps answer @stasm's question about multiple errors)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we might have a bidi consideration here? If the string being formatted is in Arabic, we might emit an FSI before the
{
and a PDI after the}
. Format patterns use lots of neutrals (such as$
and:
) and look like gibberish in a bidi context.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be covered by the text in PR #315? The intent there is to define a prefix and a suffix to a part's formatted string representation, which here would be something like
{$foo}
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be an error? Ever?
It is a string literal, to use "as is"
I agree that it can make sense if a function is specified:
{(42) :number}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The examples are showing examples of the fallback string representations, so .e.g
{(42)}
would be the one used if{(42) :number}
failed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal: change this to "the string between
{
and}
, as is.I would assume we get here is none of the cases before was encountered.
So probably a syntax error (bad sigil, for example, or bad format for parameters):
"Encountered {?count} penguins"
is more useful than"Encountered {�} penguins"
Or
"Encountered {$count :number invalid options ?? format} penguins"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is catching anything not covered by the above. I think that's currently an empty set, as syntax errors use
{�}
or{fallback string}
for the whole message. So this should never happen, but let's still include it to make sure that we don't end up with undefined behaviour.I would strongly prefer not including any explicit reference to syntax source here, as that would require keeping a reference to it in all implementations, and might not apply at all if the source is not an MF2 syntax message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I don't think the second example should be valid
Because the whole point of local variables would be reuse, mostly for consistency.
If change the function in the message body, the whole consistency goes out the windows.
Yes, not difficult to implement (just a bit worse performance).
But kind of no good use case, with potential of misuse.
So my suggestion would be to remove the second example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is meant as a minimal example showing that the identifiers of local variables should never end up being included in the formatted output. It's not really meant to be a useful message, but it's technically valid.