Add design doc for error handling #804

Merged
merged 14 commits into from
Aug 5, 2024
177 changes: 177 additions & 0 deletions exploration/error-handling.md
@@ -0,0 +1,177 @@
# Error Handling

Status: **Accepted**

<details>
<summary>Metadata</summary>
<dl>
<dt>Contributors</dt>
<dd>@echeran</dd>
<dt>First proposed</dt>
<dd>2024-06-02</dd>
<dt>Issues</dt>
<dd><a href="https://github.com/unicode-org/message-format-wg/issues/782">#782</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/issues/830">#830</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/issues/831">#831</a></dd>
<dt>Pull Requests</dt>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/795">#795</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/pull/804">#804</a></dd>
<dt>Meeting Notes</dt>
<dd><a href="https://github.com/unicode-org/message-format-wg/blob/main/meetings/2024/notes-2024-05-06.md">2024-05-06</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/blob/main/meetings/2024/notes-2024-05-13.md">2024-05-13</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/blob/main/meetings/2024/notes-2024-05-20.md">2024-05-20</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/blob/main/meetings/2024/notes-2024-07-15.md">2024-07-15</a></dd>
<dd><a href="https://github.com/unicode-org/message-format-wg/blob/main/meetings/2024/notes-2024-07-22.md">2024-07-22</a></dd>
</dl>
</details>

## Objective

Decide what, if anything, implementations MUST, SHOULD, or MAY do after a runtime error, with regard to:

1. information about error(s)
   - including, if relevant, the minimum number of errors for which such information is expected
1. a fallback representation of the message

## Background

In practice,
runtime errors happen when formatting messages.
It is useful to provide information about any errors back to the callsite.
It is also useful to provide the end user with a best-effort fallback representation of the message.

Comment on lines +38 to +41 (Member):

We've had some good discussions in telecon about these cases. We should use the design doc to capture some of this for posterity. I'm going to suggest a use case section below.

Specifying the behavior in such cases promotes consistent results across conformant implementations.

However, implementations of MessageFormat 2.0 face different constraints, for several reasons:

* Programming language: the language of the implementation informs idiomatic patterns of error handling.
In Java, errors are thrown and subsequently caught in a `try...catch` block.
In Rust, fallible callsites (those which can return errors) should return a `Result<T, E>` monad.
In both languages, built-in error handling assumes a single error.
* Environment constraints: as mentioned in [feedback from ICU4X](https://github.com/unicode-org/message-format-wg/issues/782#issuecomment-2103177417),
ICU4X operates in low-resource environments, for which returning at most 1 error is desirable
because returning more than 1 error would require heap allocation.
* Programming conventions and idioms: in [feedback from ICU-TC](https://docs.google.com/document/d/11yJUWedBIpmq-YNSqqDfgUxcREmlvV0NskYganXkQHA/edit#bookmark=id.lx4ls9eelh99),
ICU-TC found, over the 25 years of maintaining the library, that providing error information in addition to a default best-effort return value brought more cost than benefit compared to returning only the default best-effort value.
ICU4C's C++ style, which returns an error code rather than throwing errors using the STL, further reduces the usefulness of that error information and the likelihood that developers use it correctly, especially during nested calls.
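
As a rough illustration of the "single error" point for Java: a `throw` propagates exactly one exception, so a formatter that hits several problems has to choose which one to throw. The sketch below uses hypothetical names (`SingleErrorSketch`, `formatStrict`) that are not part of any MessageFormat API. It attaches the remaining errors with the standard `Throwable.addSuppressed`, which is only one possible convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of any MessageFormat 2.0 implementation.
class SingleErrorSketch {
    // Pretends to format a message in which every listed variable is unresolved.
    // Only ONE exception can be thrown; later errors ride along as suppressed.
    static String formatStrict(List<String> unresolvedVariables) {
        List<IllegalStateException> errors = new ArrayList<>();
        for (String name : unresolvedVariables) {
            errors.add(new IllegalStateException("Unresolved variable: $" + name));
        }
        if (errors.isEmpty()) {
            return "formatted message";
        }
        IllegalStateException first = errors.get(0);
        for (IllegalStateException other : errors.subList(1, errors.size())) {
            first.addSuppressed(other);
        }
        throw first;
    }

    public static void main(String[] args) {
        try {
            formatStrict(Arrays.asList("user", "count"));
        } catch (IllegalStateException e) {
            // The catch block receives a single error; the others need an extra lookup.
            System.out.println(e.getMessage());
            System.out.println(e.getSuppressed().length + " more error(s) attached");
        }
    }
}
```

A caller that only inspects the caught exception never sees the additional errors, which is why a requirement to report more than one error does not map directly onto this idiom.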

> [!NOTE]
> The wording in this document uses the word "signal" with regard to providing
> information about an error rather than "return" or "emit" when referring to
> a requirement that an implementation must at least indicate that an error has
> occurred.
> The word "signal" better accommodates more alternatives in the solution space
> like those that only choose to indicate that an error occurred,
> while still including those that additionally prefer to return the error
> itself as an error object.
> (By contrast, "return an error" implies that an error object will be thrown or
> returned, and "emit an error" is ambiguous as to what is or isn't performed.)

## Use Cases

As a software developer, I want message formatting calls to signal runtime errors
in a manner consistent with my programming language/environment.
I would like error signals to include diagnostic information that allows me to debug errors.

As a software developer, I sometimes need to be able to emit a formatted message
even if a runtime error has occurred.

As a software developer, I sometimes want to avoid "fatal" error signals,
such as might occur due to unconstrained inputs,
errors in translation of the message,
or other reasons outside my control.
For example, in Java, throwing an Exception is a common means of signaling an error.
However, `java.text.NumberFormat` provides both throwing and non-throwing
`parse` methods to allow developers to avoid a "fatal" throw of `ParseException`
(if the exception were uncaught).
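
For reference, the two `parse` overloads can be exercised as in the sketch below. The wrapper class and method names (`NumberParseSketch`, `throwsOnBadInput`, `parseOrNull`) are hypothetical; the `NumberFormat`, `ParseException`, and `ParsePosition` calls are the standard JDK ones.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical wrapper around the two standard NumberFormat.parse overloads.
class NumberParseSketch {
    // Throwing overload: parse(String) raises ParseException when the
    // beginning of the input cannot be parsed.
    static boolean throwsOnBadInput(String input) {
        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        try {
            nf.parse(input);
            return false;
        } catch (ParseException e) {
            return true;
        }
    }

    // Non-throwing overload: parse(String, ParsePosition) returns null on
    // failure and records the failure position in the ParsePosition.
    static Number parseOrNull(String input) {
        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        ParsePosition pos = new ParsePosition(0);
        Number result = nf.parse(input, pos);
        if (result == null) {
            System.out.println("parse failed at index " + pos.getErrorIndex());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(throwsOnBadInput("abc"));
        System.out.println(parseOrNull("abc"));
        System.out.println(parseOrNull("42"));
    }
}
```

With the second overload, the caller decides whether a failed parse is fatal; nothing propagates up the stack unless the caller chooses to throw.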

As a MessageFormat implementer, I want to be able to signal errors in an idiomatic way
for my language and still be conformant with MF2 requirements.

## Accepted Design

The following design was selected in #830.

### MUST signal errors and MUST provide fallback

* Implementations MUST provide a mechanism for signaling errors. There is no specific requirement for what form signaling an error takes.
* Implementations MUST provide a mechanism for getting a fallback representation of a message that produces a formatting or selection error. Note that this can be entirely separate from the first requirement.
* An implementation is not conformant unless it provides access to both behaviors. It is conformant to do both in a single formatting attempt.
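
To make the last bullet concrete, here is a minimal sketch of doing both in a single formatting attempt. Everything in it is hypothetical: the names (`SingleAttemptSketch`, `Result`, `format`) are invented for illustration, the tiny `{$name}` substitution is a stand-in for real MessageFormat 2.0 processing, and leaving the placeholder's source text in the output is an assumed stand-in for the fallback representation that the specification defines.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not a MessageFormat 2.0 parser or formatter.
class SingleAttemptSketch {
    // One return value carries both the best-effort text and the error signal.
    static final class Result {
        final String text;
        final boolean hadError;

        Result(String text, boolean hadError) {
            this.text = text;
            this.hadError = hadError;
        }
    }

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\$(\\w+)\\}");

    static Result format(String pattern, Map<String, String> args) {
        Matcher m = PLACEHOLDER.matcher(pattern);
        StringBuffer out = new StringBuffer();
        boolean hadError = false;
        while (m.find()) {
            String value = args.get(m.group(1));
            if (value == null) {
                hadError = true;     // the signal: here just a flag, no error object
                value = m.group();   // assumed fallback: keep "{$name}" as written
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return new Result(out.toString(), hadError);
    }

    public static void main(String[] args) {
        Result r = format("Hello, {$user}!", Collections.<String, String>emptyMap());
        System.out.println(r.text + " (error signaled: " + r.hadError + ")");
    }
}
```

The boolean flag is deliberately the weakest form of signal; an implementation could equally attach a list of error objects or invoke a callback.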

> In all cases, when encountering an error,
> a message formatter MUST be able to signal an error or errors.
> It MUST also provide the appropriate fallback representation of the _message_ defined
> in this specification.

This alternative requires that an implementation provide both an error signal
and a means of accessing a "best-effort" fallback message.
This slightly relaxes the requirement of "returning" an error
(to allow a locally-appropriate signal of the error).

Under this alternative, implementations can be conformant by providing
two separate formatting methods or functions,
one of which returns the fallback string and one of which signals the error.
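
The two-method shape described here might look like the following sketch. The class, method, and exception names (`TwoMethodSketch`, `greetOrThrow`, `greetWithFallback`, `UnresolvedVariableException`) are hypothetical, the hard-coded greeting stands in for a real message, and `{$user}` is used as an assumed fallback for the unresolved variable.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch of conformance through two separate entry points.
class TwoMethodSketch {
    // Hypothetical unchecked exception used as the error signal.
    static final class UnresolvedVariableException extends RuntimeException {
        UnresolvedVariableException(String name) {
            super("Unresolved variable: $" + name);
        }
    }

    // Signals the error by throwing; never produces a fallback string.
    static String greetOrThrow(Map<String, String> args) {
        String user = args.get("user");
        if (user == null) {
            throw new UnresolvedVariableException("user");
        }
        return "Hello, " + user + "!";
    }

    // Never throws for a missing variable; substitutes the assumed fallback.
    static String greetWithFallback(Map<String, String> args) {
        String user = args.get("user");
        return "Hello, " + (user != null ? user : "{$user}") + "!";
    }

    public static void main(String[] args) {
        Map<String, String> noArgs = Collections.emptyMap();
        System.out.println(greetWithFallback(noArgs));
        try {
            greetOrThrow(noArgs);
        } catch (UnresolvedVariableException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A caller who wants both the fallback text and the error has to format twice under this shape, which is the trade-off against returning both from one call.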

Similar to the current spec text,
this alternative requires implementations to provide useful information:
both a signal that an error occurred and a best effort message.
A downside to this alternative is that these requirements together assume that
all implementations will want to pay the cost of constructing a representative message
after the occurrence of an error.

## Alternatives Considered

### Current spec: require information from error(s) and a representative best effort message

The current spec text says:

> In all cases, when encountering a runtime error,
> a message formatter MUST provide some representation of the message.
> An informative error or errors MUST also be separately provided.

This alternative places constraints on implementations to provide multiple avenues of useful information (to the callsite and user).

This alternative establishes constraints that would conflict with those of projects that have implemented MF 2.0 (or likely will soon), based on:
* programming language idioms/constraints
* execution environment constraints

Comment (Collaborator):

What execution environment constraints does this alternative contravene? The only such mentioned in this document is that the cost of returning more than one error may be prohibitive in some cases, and the current text explicitly says "error or errors" to allow for an implementation signaling a single error to be valid.

Suggested change (remove this line):
* execution environment constraints

Reply (Collaborator, Author):

The text says "an informative error". As written, that implies to me an error object, like a java.lang.Throwable instance. Despite discussion in last week's meeting to interpret that phrase as equivalent to "whether or not an error occurred", it comes across distinctly as something more and thus should be rewritten if the intent is not so. There are applications like ICU and potentially browsers that might only want to provide a best effort message and signal an error, but not pay the cost of creating "informative error" objects each time.

As I mentioned in the 2024-04-09 meeting, another paradigm from which to look at this, besides "whether returning an error is possible", is "how actionable is returning the error object". ICU & browsers need to be performant and might not want to pay the cost of creating a full error object.

* experience-based programming guidelines

For example, in ICU,
[the suggested practice](https://docs.google.com/document/d/11yJUWedBIpmq-YNSqqDfgUxcREmlvV0NskYganXkQHA/edit#bookmark=id.lx4ls9eelh99)
is to avoid additionally returning optional error codes when providing best-effort formatted results.

### MUST signal errors and SHOULD provide fallback

* Implementations MUST provide a mechanism for signaling errors. There is no specific requirement for what form signaling an error takes.
* Implementations SHOULD provide a mechanism for getting a fallback representation of a message that produces a formatting or selection error. Note that this can be entirely separate from the first requirement.
* Implementations are conformant if they only signal errors.

### SHOULD signal errors and MUST provide fallback

* Implementations SHOULD provide a mechanism for signaling errors. There is no specific requirement for what form signaling an error takes.
* Implementations MUST provide a mechanism for getting a fallback representation of a message that produces a formatting or selection error. Note that this can be entirely separate from the first requirement.
* Implementations are conformant if they only provide a fallback representation of a message.


### Error handling is not a normative requirement

* Implementations are not required by MF2 to signal errors or to provide access to a fallback representation.
  - The specification provides guidance on error conditions, on what error types exist, and on what the fallback representation is.

> When encountering an error during formatting,
> a message formatter MAY provide some representation of the message,
> or it MAY provide an informative error or errors.
> An implementation MAY provide both.

This alternative places no expectations on implementations,
which supports the constraints we know now,
as well as any possible constraints in the future
(e.g., new programming languages, new execution environments).

This alternative does not assume or assert that some type of useful information
(error info, representative message)
will be possible and should be returned.

### Alternate wording

> When an error is encountered during formatting,
> a message formatter can provide an informative error (or errors)
> or some representation of the message or both.