Skip to content

Add design doc for error handling #804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Aug 5, 2024

Conversation

echeran
Copy link
Collaborator

@echeran echeran commented Jun 3, 2024

Add design doc to contextualize the PR #795 based on the issue & discussion begun in #782 .

@aphillips aphillips added design Design document or issues related to design errors Issues related to the errors section of the spec labels Jun 3, 2024
Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. Probably too close to the meeting today to merge, but let's see.

Comment on lines 52 to 55
> In all cases, when encountering an error during formatting,
> a message formatter MUST provide some representation of the message,
> or MUST provide an informative error or errors.
> An implementation MAY provide both.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, the fact of an error is more important than the fallback representation. Showing the fallback representation to end users is less good than having good error recovery (or no error at all). This wording makes the fallback representation first and thus, seemingly, the more important thing.

I also find the "MUST or MUST" formulation to be weird. I think this is trying to say "MUST provide some representation of the message or an informative error or errors or both"

I think it would be weird if an implementation never failed, that is, if it returned the fallback representation silently with no signal of failure. When would this be a good thing? How would the caller find out about the error.

Note that we do not specify how errors are signaled. Just because (for example) Java often throws Throwable does not mean that the MF2 implementation has to use that as the error mechanism. See, for example, the use of ParsePosition in NumberFormat.

Thus I'd suggest an additional alternative:

Suggested change
> In all cases, when encountering an error during formatting,
> a message formatter MUST provide some representation of the message,
> or MUST provide an informative error or errors.
> An implementation MAY provide both.
> In all cases, when encountering an error,
> a message formatter MUST provide an informative error or errors.
> It MAY also provide the appropriate fallback representation of the _message_ defined
> in this specification.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied most of the suggested change in order to address the simplification of the "MUST or MUST" construction.

Now, the reason for that verbose wording was to be as least restrictive as possible. It's basically saying, "You do something when you get an error. You must provide a best effort message or an error, maybe both. You can't do nothing."

The point about requiring signaling that an error occurred is an extra constraint beyond the existing text. I think it is a reasonable constraint, so I will incorporate that. But I'm not comfortable strengthening that constraint anything further because as we discussed in Monday's meeting, there is a big difference in "error" (ex: instance of java.lang.Exception) versus "signal that an error occurred" (ex: have just a boolean, or have a "strict version alternate API"). In order to avoid ambiguity about that, I much prefer "signal an error" rather than "provide an error" (this potential ambiguity is why I prodded us in Monday's meeting to be specific about what we mean by "provide/return an error")

I also think it would help avoid ambiguity to say "be able to signal" rather than "signal" so that we more clearly support implementations that choose to meet the requirement with an alternative that uses 2 APIs (ex: an infallible best-effort and a strict fallible one that signals an error). I think that would also address Tim's concern in his review comment since we have all been comfortable allowing choice by implementations since the time it was brought up 2 months ago.

This solution requires implementations to return _something_,
but it leaves the decision to the implementation whether to:

* return an error (or errors)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably say:

Suggested change
* return an error (or errors)
* return or emit an error or errors

One reason is that throwing an exception (in languages that do that) is not the same as returning a return value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "signal" instead of "emit"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the word "signal" instead of "emit" because that better accomodates all of the alternatives in the solution space.

Note: I had to rewrite this section to reflect the extra constraint that I am taking on from your suggestion in the other conversation thread, since this section existed to clarify the spec text concretely and in plain words.

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on yesterday's call, I think we need to require two different things:

  1. An implementation must provide a way for a message with a runtime error to be formatted in a way that provides the user with that error.
  2. An implementation must provide a way for a message with a runtime error to be formatted using a fallback representation.

These separate requirements could be fulfilled by a single formatting method, or by more than one such method.

Copy link
Collaborator

@catamorphism catamorphism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good, if Addison's suggestion about eliminating the "MUST or MUST" phrasing is incorporated.

Per Eemeli's comment, maybe the idea of having the implementation provide two different methods should be listed as one of the alternatives? The current language mentions "the message formatter" and it's unclear from that whether the formatter has one formatting method or several.

@echeran
Copy link
Collaborator Author

echeran commented Jun 7, 2024

As discussed on yesterday's call, I think we need to require two different things:

  1. An implementation must provide a way for a message with a runtime error to be formatted in a way that provides the user with that error.
  2. An implementation must provide a way for a message with a runtime error to be formatted using a fallback representation.

These separate requirements could be fulfilled by a single formatting method, or by more than one such method.

@eemeli PTAL at the changes. See the discussion in this conversation within this PR where I address this. Also take a look at the wording and rationale for it based on our discussion in Monday's meeting, in particular my intent to be precise on what we're requiring & not, and whether that means the appropriate language is "returning"/"emitting"/"signalling" an error.

@echeran echeran requested review from eemeli and aphillips June 7, 2024 00:07
Copy link
Collaborator

@eemeli eemeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document continues to not present and consider the alternative discussed on the call and in my preceding comment, of separating the two MUST requirements more clearly from each other.


This alternative establishes constraints that would contravene the constraints that exist in projects that have implemented MF 2.0 (or likely will soon), based on:
* programming language idioms/constraints
* execution environment constraints
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What execution environment constraints does this alternative contravene? The only such mentioned in this document is that the cost of returning more than one error may be prohibitive in some cases, and the current text explicitly says "error or errors" to allow for an implementation signaling a single error to be valid.

Suggested change
* execution environment constraints

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text says "an informative error". As written, that implies to me an error object, like a java.lang.Throwable instance. Despite discussion in last week's meeting to interpret that phrase as equivalent to "whether or not an error occurred", it comes across distinctly as something more and thus should be rewritten if the intent is not so. There are applications like ICU and potentially browsers that might only want to provide a best effort message and signal an error, but not pay the cost of creating "informative error" objects each time.

As I mentioned in the 2024-04-09 meeting, another paradigm from which to look at this, besides "whether is returning an error possible", is "how actionable is returning the error object". ICU & browsers need to be performant and might not want to pay the cost of the creating a full error object.

@echeran
Copy link
Collaborator Author

echeran commented Jun 10, 2024

This document continues to not present and consider the alternative discussed on the call and in my preceding comment, of separating the two MUST requirements more clearly from each other.

You did say that, although even after consulting the notes from the meeting, I couldn't understand what you meant by separating the 2 MUST requirements more clearly. If it's a matter of saying that you can satisfy the intended requirements in different ways, including having 2 APIs side-by-side in which one is stricter and signals an error while the other is permissive and always returns a best-effort value, the proposed alternative does that.

The only thing that seemed close to matching your description of what you'd like is the following excerpt from the meeting's notes doc, which sounds much like the current text, so I didn't really see its distinguishing features enough to make it a new alternative:

EAO: My position is that we should impose some requirements on implementations - what they must be able to provide. The way I would put the requirements - there must be a way to format the message that you get a fallback and that you must be able to format the message and get failure reported to you somehow. This would require behaviors from the implementer.

If you can write out the alternative you're thinking with pros & cons listed, I would be okay adding it.

@eemeli
Copy link
Collaborator

eemeli commented Jun 10, 2024

This document continues to not present and consider the alternative discussed on the call and in my preceding comment, of separating the two MUST requirements more clearly from each other.

You did say that, although even after consulting the notes from the meeting, I couldn't understand what you meant by separating the 2 MUST requirements more clearly. If it's a matter of saying that you can satisfy the intended requirements in different ways, including having 2 APIs side-by-side in which one is stricter and signals an error while the other is permissive and always returns a best-effort value, the proposed alternative does that.

While the proposed alternative does allow for an API as you describe, it does not require that a fallback representation is made available. The spec text that's currently proposed

In all cases, when encountering an error,
a message formatter MUST be able to signal an error or errors.
It MAY also provide the appropriate fallback representation of the message defined
in this specification.

allows for a formatter that either returns the formatted result or null on error, as the latter can act as a signal that an error has occurred. This is not sufficient.

The requirement of a way to get a formatted representation of messages with runtime errors is important, because it needs to be accounted for in the formatter design: A formatter cannot just bail on the first error, but must have the capability of continuing to format a message past the first error.

The inclusion of "In all cases" also does not make it clear that an implementation providing two formatting methods (a strict thrower and a lenient fallback formatter) would be valid, given that formatting with the latter would not necessarily signal any error. This is an aspect of the current spec language that I believe for us to have consensus on fixing.

If you can write out the alternative you're thinking with pros & cons listed, I would be okay adding it.

I'll see if I have time for that later this week, but it may take until next week.

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>
@echeran echeran requested review from aphillips and eemeli June 12, 2024 18:15
Comment on lines 96 to 100
* experience-based programming guidelines

For example, in ICU,
[the suggested practice](https://docs.google.com/document/d/11yJUWedBIpmq-YNSqqDfgUxcREmlvV0NskYganXkQHA/edit#bookmark=id.lx4ls9eelh99)
is to avoid additionally returning optional error codes when providing best-effort formatted results.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies to all solutions that require a runtime error to be signalled in some way, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point here is that if you're already returning a best-effort result from a formatter, then in ICU, it hasn't been useful to deal with the ErrorCodes altogether because the calling code is not looking for it. So for ICU, trying to use the ErrorCode in such cases is optional at best, but confusing and hard to reason about at worst. There's no benefit to dealing with the ErrorCode in these cases. And that's the general approach they arrived at over time dealing with formatter code. That's my interpretation & recollection of the linked discussion.

Those ICU guidelines seems reasonable for ICU, since ICU can't fail so it has to return something. Assuming "solution" = MF2 implementation, then yes, it seems reasonable and generally applicable advice for other implementations, although the guidelines that make sense to other implementations depend on each of their designs & use cases.

The fact that the current spec text takes a strong universal stance on what implementations must do, when that doesn't actually fit all use cases / guidelines / PLs, is still the crux of the matter.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "solution" I meant the choice of spec language that we are debating here.

As I understand it (and please do correct me if I'm wrong), the ICU experience-based programming guideline in this case is roughly speaking, "You should not report formatting errors if you can fallback to some formatted result."

My point here is that this guideline applies to all spec text choices which require an error to be reported; for example, the current proposed design states:

In all cases, when encountering an error,
a message formatter MUST be able to signal an error or errors.

Therefore, that choice would also go against the above-mentioned experience-based programming guidelines. Is this not the case?

Just to be clear, I disagree about following this guideline for MF2, but here I'm trying to make sure that it's logically and consistently applied as it's being used as an argument in choosing between alternative designs.

Comment on lines +29 to +32
In practice,
runtime errors happen when formatting messages.
It is useful to provide information about any errors back to the callsite.
It is useful to the end user to provide a best effort fallback representation of the message.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had some good discussions in telecon about these cases. We should use the design doc to capture some of this for posterity. I'm going to suggest a use case section below.

Co-authored-by: Addison Phillips <addisonI18N@gmail.com>
Co-authored-by: Eemeli Aro <eemeli@gmail.com>
@aphillips
Copy link
Member

In the 2024-06-24 call we agreed to accept this design with some changes. Waiting on changes/approvals to merge. @eemeli will then have an action to update the spec.

@aphillips aphillips added the Action-Item Action item assigned by the WG label Jul 1, 2024
@eemeli eemeli mentioned this pull request Jul 8, 2024
Copy link
Collaborator

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All suggestions are purely editorial.

Co-authored-by: Richard Gibson <richard.gibson@gmail.com>
Copy link
Member

@aphillips aphillips left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should merge this (thank you @echeran!!!!) and I can do some cleanup when marking it approved. The main cleanup will be to point to the vote issue, notes, discussion, etc. so that someone reading this design doc will see how we got to spec text and to make the "proposed design" exactly match the text we're still working on in #816. Let's discuss in telecon (2024-07-29)

echeran and others added 2 commits August 5, 2024 09:19
@echeran echeran merged commit c116475 into unicode-org:main Aug 5, 2024
1 check passed
@echeran echeran deleted the error-handling-design-doc branch August 5, 2024 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Action-Item Action item assigned by the WG design Design document or issues related to design errors Issues related to the errors section of the spec
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants