Separation of language and formatting locale #29
Comments
If we want to respect POSIX, we may want to separate different fallback locale chains for different formatters (LC_TIME vs LC_MESSAGES vs LC_NUMERIC vs LC_COLLATE etc.). I'd also like us to carry chains, never a single locale, to allow for better fallback. |
I've always thought that the POSIX model "slices" things the wrong way. In general I consider the POSIX model outdated, and I would not look to it as a model. |
People have weird preferences esp. around date/time. They want German translation but with en-US date/time or reverse.
Great question. My assumption is that you format using eastern arabic numerals and French date/time pattern.
I tend to agree. At the same time, I don't think it would be terrible for us to consider allowing people to specify some locale fallback chains that are intended for particular formatters. |
I've never seen that. Stuff that is really covered under the
And what about the "am/pm"? Would that be French (am/pm), or Arabic (ص / م)? :-) Mixing things can be pretty dangerous. What I've seen working relatively well is a "separation" that keeps the lang+script the same between messages and formatters. Or This is why Android splits things in two steps: language negotiation followed by fallback. Negotiation happens once, when the application starts, going through the full list of user languages (an intersection between the locales I say I understand and the ones that have localized resources). Language fallback happens for every single attempt to load resources. Maybe not ideal, but reduces (eliminates?) confusion. |
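The two-step model described above (negotiate once at startup, then fall back on every resource load) could be sketched roughly as follows. This is a minimal illustration, not Android's actual algorithm: the function names `negotiateLanguages` and `loadResource`, the resource table, and the simplification of matching by exact tag or language subtag only are all assumptions.

```javascript
// Hypothetical sketch: names and matching rules are assumptions, not Android's real algorithm.

// Step 1 (once, at startup): keep the user's locales that the app has resources for,
// preserving the user's preference order. Matching is by exact tag, then by language subtag.
function negotiateLanguages(requested, available) {
  const result = [];
  for (const tag of requested) {
    const lang = new Intl.Locale(tag).language;
    const match =
      available.find((a) => a === tag) ||
      available.find((a) => new Intl.Locale(a).language === lang);
    if (match && !result.includes(match)) result.push(match);
  }
  return result;
}

// Step 2 (on every lookup): walk the negotiated chain until a resource is found.
function loadResource(chain, resources, key) {
  for (const locale of chain) {
    const table = resources[locale];
    if (table && key in table) return { locale, value: table[key] };
  }
  return undefined;
}

// Illustrative resource table: German is only partially translated.
const resources = {
  de: { greeting: "Hallo" },
  "en-US": { greeting: "Hello", farewell: "Goodbye" },
};
const chain = negotiateLanguages(["de-AT", "fr", "en-US"], Object.keys(resources));
const greeting = loadResource(chain, resources, "greeting");
const farewell = loadResource(chain, resources, "farewell");
console.log(chain, greeting, farewell);
```

In this sketch the German string is used where it exists and the lookup falls through to `en-US` for the missing key, so the negotiated chain stays fixed while the fallback happens per resource.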
On 2/14/2020 5:15 PM, Mihai Nita wrote:
I've always thought that the POSIX model "slices" things the wrong way.
Why would I want time and numeric to use different locales?
The question to ask is: why would you make that impossible? I mean,
perhaps the answer is that this is rare enough that you can require the
definition of a one-off "locale" that combines the two formatting options.
That would have the advantage that you could explicitly resolve issues like:
Or if the numeric is Arabic (with "real" Arabic digits) and time is
French, what kind of digits would I use?
As to who would want to do something like that, I can't speculate, but
the kinds of scenarios that I could imagine would not be localization of
consumer products, but perhaps something that's an in-house app for some
large organization? Who knows. But the minute you rule something out
altogether, you have to prove a negative, I think.
Doesn't have to mean that your model should assume such mixtures as the
standard scenario.
|
Because it adds complexity to the spec & implementation with no good benefit. Basic API design: "Easy to use correctly, hard to use incorrectly." For extreme cases you can create your own formatter outside the MessageFormat with whatever locale you want. But
|
Another argument against the POSIX style: where do we stop? Why LC_TIME, but not LC_DATE? And what about LC_LIST_FORMAT, LC_DURATION, LC_INTERVAL, LC_MEASUREMENT, LC_RANGE, LC_PHONE_NUMBER? |
+100 I think it's very important to set boundaries as there are cultural conventions that simply do not apply in other languages. I think that for these cases it might be simpler to have language-specific strings without trying to solve this with the syntax itself. |
You may want to skim through, list of bugs like:
Generally speaking there are three groups of our users who are seeking some customization:
The last group was my reason to introduce dateStyle/timeStyle proposal because that allows me to override it from some customization from OS. Group (1) is the biggest outlier, and they're a pretty vocal minority who likes the inconsistency that I described in the documentation ( If there's a way to make our API accept some override for locale of date/time only, I think we'd allow for the flexibility needed for all those groups.
In my experience - date/time. I haven't seen any requests for anything else. |
I think that group 1 can be accommodated with something like creating a formatter and setting it for a certain placeholder:
The idea would be to make the most common cases very easy to support, without preventing one from supporting the "weird cases" if they want. |
On 2/17/2020 12:29 PM, Mihai Nita wrote:
The idea would be to make the most common cases very easy to support,
without preventing one from supporting the "weird cases" if they want.
That sounds like the approach I was advocating, although the devil is in
the details.
If some (vocal) minority of users wants some capability, is that going
to impact coding for every single message, or is this something that can
be managed more globally for an app, for example. I wasn't quite sure
from your example.
But I would not bend over to make it as friendly as the defaults.
You make all things equally easy then the developer has no good
guidance on what is "the right thing"
Having an obvious "right thing" is useful. But from some of the examples
given so far, it looked like there were competing ways of getting the
correct/same result. You may still need to publish good guidance.
|
I agree with @mihnita - good defaults should be rewarded with easy API use.
let bundle = new Bundle(["de", "en-US"]);
bundle.setFormatterLocaleChain("DATETIME", ["pl", "ru", "en-US"]);
bundle.formatPattern("Today is { $date }", {
  date: new Date()
});
|
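To make the per-formatter chain idea above more concrete, here is one way the resolution could work. This is only a sketch under assumptions: `FormatterLocales` is a hypothetical class, not the `Bundle` from the comment and not an existing library, and it delegates the actual formatting to the standard `Intl` constructors, which accept a prioritized list of locales.

```javascript
// Hypothetical sketch of the idea above; `FormatterLocales` is not an existing API.
class FormatterLocales {
  constructor(defaultChain) {
    this.defaultChain = defaultChain; // chain used for messages and as the default
    this.overrides = new Map(); // formatter type -> locale chain
  }

  // Register a chain for one formatter type, e.g. "DATETIME".
  setFormatterLocaleChain(type, chain) {
    this.overrides.set(type, chain);
  }

  // Use the override when present, otherwise the default (message) chain.
  chainFor(type) {
    return this.overrides.get(type) || this.defaultChain;
  }

  formatDate(date, options) {
    // Intl constructors take a prioritized locale list and negotiate against
    // the locales the runtime supports.
    return new Intl.DateTimeFormat(this.chainFor("DATETIME"), options).format(date);
  }

  formatNumber(value, options) {
    return new Intl.NumberFormat(this.chainFor("NUMBER"), options).format(value);
  }
}

const locales = new FormatterLocales(["de", "en-US"]);
locales.setFormatterLocaleChain("DATETIME", ["pl", "ru", "en-US"]);

const dateChain = locales.chainFor("DATETIME");
const numberChain = locales.chainFor("NUMBER");
const formattedDate = locales.formatDate(new Date(Date.UTC(2020, 1, 14)), {
  dateStyle: "long",
  timeZone: "UTC",
});
console.log(dateChain, numberChain, formattedDate);
```

The default path needs no extra calls, while the override is a single explicit step, which matches the "easy defaults, possible exceptions" position taken in this thread.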
Isn't that always the case :-)
True, probably guidance is a must no matter what. Taking the 3 ways of doing the same thing here: I think that case 1 is the easiest one to read and write. I think (but this is just my opinion) that Zibi's example above (with If you want to do something weird it should not be easy, but discouraged by guidelines. |
It is essential for an internationalized application to respect the local format that the user is accustomed to. This is why all OSs have a locale setting that is used in formatting a date/time/number value by default. Separation of language and formatting locale is required just to meet this basic requirement. Let’s consider an international shipment tracking by a German user, whose package was shipped from a US based company. The German user comes to the US company’s website to find out when his package was shipped. The message template could be:
If the website supported German, this message should be presented fully in the German convention using a German template and the German formatting locale:
If German was not supported, the German user must use an alternate UI language. Let's say it's English and the German user may prefer:
to:
Notice that the message string is English, while the date format matches the German locale preference. This behavior is consistent with the way all OSs behave by default for localizing date, time, number and any other locale sensitive conventions.
Some applications fail to respect the user’s formatting locale because they negotiate the UI locale once and then apply it to all locale sensitive operations. That is a problematic practice. Say, an application shows news articles from international sources in various languages. Then the articles should be filtered based on the user's acceptable languages. If the same article was available in different languages, then the best available language should be used, regardless of the application's UI language. If an application took the negotiate-once-use-it-everywhere approach, the user experience can be significantly degraded due to the failure to respect the user's locale preference.
It is also noteworthy that many applications don’t support as many languages as they wish, so the UI language could be chosen from the user’s second languages or simply the default, which is often English, and the English locale is based on (fairly unique) American conventions. This is another reason formatting locale needs to be identified separately from language.
The message template could include the number of packages, in which case it would be essential to handle pluralization based on language, while formatting the number or datetime in formatting locale.
In US locale, " Similarly, there are other reasons to separate language and formatting locale. Indian numbering system is preferred for Indian users, Thai calendar year is expected for Thai users, to name a few. In summary, local conventions should be honored, regardless of the application's UI language. (I don't mean to uphold the POSIX model. I advocate separating language and formatting locales and negotiating as many times as needed.) |
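The split described in this comment (pluralization driven by the language of the string, number formatting driven by the formatting locale) can be illustrated with the standard `Intl.PluralRules` and `Intl.NumberFormat` APIs. The message table, the `{count}` placeholder syntax, and the `formatPackages` helper are assumptions made for this sketch, not part of any proposal in the thread.

```javascript
// Hypothetical helper: message selection uses the UI language, number formatting uses
// a separate formatting locale. The message table and function name are illustrative.
const messages = {
  en: {
    one: "{count} package was shipped.",
    other: "{count} packages were shipped.",
  },
};

function formatPackages(count, uiLanguage, formattingLocale) {
  // The plural category comes from the language the string is written in.
  const category = new Intl.PluralRules(uiLanguage).select(count);
  const table = messages[uiLanguage];
  const template = table[category] || table.other;
  // Digit grouping and decimal conventions come from the formatting locale.
  const formatted = new Intl.NumberFormat(formattingLocale).format(count);
  return template.replace("{count}", formatted);
}

const single = formatPackages(1, "en", "de-DE");
const many = formatPackages(12345, "en", "de-DE");
console.log(single);
console.log(many);
```

Using the formatting locale for plural selection instead would pick categories that may not exist in the English message table, which is why the two inputs are kept apart here.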
This position (English string, German date) is not universally agreed upon among this group. Please don't present it as if it were. In particular, our experience at Mozilla (see my comment above with the list of bugs) is that this is a very fragile and subjective area of UX where we likely cannot design a "perfect solution for everyone", and multilingual users (not even in fallback scenarios) will strongly differ in their preferred outcome. To illustrate it using your example, the difference between March 6th 2020 and June 3rd 2020 can be impossible to deduce from the formatted string ("03/06/2020" vs "06/03/2020") and depends on the language/locale. In such a case, users may attempt to deduce the element order from the surrounding string and misread the date. It happened to us in the context of error certificates UX. For that reason, I'd argue that at the edge there are two types of users - users who want Another area of possible confusion is unit names. I don't remember offhand, but there are some cases where a unit symbol in one locale is the same as another unit's symbol in another. In such a case, presenting And finally, short abbreviations of weekday names overlap very often, so
Agree. Hence my suggestion to always carry fallback chains so that on each level negotiation can be performed.
Also agree. An issue we currently face at Mozilla is that our JS engine has one "default locale" which has to be used for date format and for pluralization, and we are working our way toward separating those for that reason.
Agree. I think this is an area of customization that we should aim to provide good-enough defaults, but recognize our inability to provide perfect defaults, and allow for alternative models via options. |
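The date ambiguity raised earlier in this comment is easy to reproduce with `Intl.DateTimeFormat`. The snippet below is only an illustration: the choice of `en-US` and `en-GB` as the two locales is an assumption made for the example, and the exact strings depend on the locale data shipped with the runtime.

```javascript
// The same calendar date rendered with all-numeric patterns from two locales.
// Exact output depends on the ICU data available to the runtime.
const date = new Date(Date.UTC(2020, 2, 6)); // 6 March 2020 (months are zero-based)
const numeric = { year: "numeric", month: "2-digit", day: "2-digit", timeZone: "UTC" };

const us = new Intl.DateTimeFormat("en-US", numeric).format(date);
const gb = new Intl.DateTimeFormat("en-GB", numeric).format(date);

// Spelling out the month removes the ambiguity regardless of field order.
const unambiguous = new Intl.DateTimeFormat("en-GB", {
  dateStyle: "long",
  timeZone: "UTC",
}).format(date);

console.log(us, gb, unambiguous);
```

Both numeric strings are valid dates under the other locale's field order, so a reader who infers the order from the surrounding English text has no way to detect the mismatch.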
There's also the issue of websites presenting units according to local
custom, based on the location of the accessing device.
Weather forecast may be shown in Fahrenheit before going on a trip in
Europe, and in Celsius while there (same European website).
Can't remember whether that went with change in UI language for the
website or not.
|
I agree it's misleading. It's revised to "the German user may prefer: In some cases, it is desirable to use the same locale for both language and formatting locale, as Mihai raised in his example. On the other hand, there are other cases in which it is desirable to separate them, and I would like this standard to support all common cases, where the same locale is expected at times and different locales at other times. I agree to provide good-enough defaults with flexibility to use alternates via options. In my understanding, desired units are generally deducible from the user's home locale. The application's context could call for special handling, in which case the default convention should be overridden. For instance, it may be appropriate for a mobile weather forecast app to show both Fahrenheit and Celsius if the current location's convention is different from the one the user is accustomed to. As Zibi noted, the desired behavior may depend on the surrounding elements. I think the API should allow the application to optionally specify fine elements of the user locale to cater for the personal preferences that could be different from the locale defaults. For good user experience it is very important to use the local conventions that the user is accustomed to. Respecting the user's home timezone is particularly important (#25) because it is impossible to deduce it from the locale and it is often unacceptable to present a date or time in a wrong timezone. |
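The point that a time zone cannot be deduced from a locale can be shown with `Intl.DateTimeFormat` and its `timeZone` option. In this sketch the locale is held fixed at `en-US` and only the zone changes; the two IANA zone names and the `formatIn` helper are choices made for illustration.

```javascript
// One instant, one locale, two time zones: the locale alone cannot pick between them.
const instant = new Date(Date.UTC(2020, 1, 14, 23, 30)); // 2020-02-14T23:30Z

function formatIn(timeZone) {
  return new Intl.DateTimeFormat("en-US", {
    dateStyle: "medium",
    timeStyle: "short",
    timeZone, // IANA zone supplied by the application, not derived from the locale
  }).format(instant);
}

const newYork = formatIn("America/New_York");
const honolulu = formatIn("Pacific/Honolulu");
console.log(newYork);
console.log(honolulu);
```

Both results are correct `en-US` output, which suggests the zone has to travel alongside the locale as a separate user preference.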
Apache MyFaces is an excellent example of a web application framework that meets this requirement. |
This appears like it might be duplicated by #426 ? |
Closing in favor of #426 |
Is your feature request related to a problem? Please describe.
Some message formatting APIs only allow you to make a single choice with respect to language/locale. This single choice is used for both string retrieval and for the formatting of any locale sensitive placeholders within those strings.
I'd like to see the ability to provide two independent language/locale choices to the API - one used for string retrieval, the other for locale sensitive formatting.
Describe the solution you'd like
API accepts UI language & formatting locale as independent variables.
Describe why your solution should shape the standard
Architecturally, UI string retrieval, and placeholder formatting are completely separate functions.
We should implement it that way. If API consumers do not want to expose this flexibility to their users - that's fine, they don't have to.
Where a locale formatting choice has not been provided, we can fall back to the language choice.
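A rough sketch of an API that takes the UI language and the formatting locale as independent inputs, with the fallback described above, might look like this. Everything here is hypothetical: `createFormatter`, the `uiLanguage` and `formattingLocale` option names, the `{name}` placeholder syntax, and the sample string are assumptions for illustration only.

```javascript
// Hypothetical API shape: `createFormatter`, `uiLanguage` and `formattingLocale`
// are illustrative names, not part of any existing library.
function createFormatter({ uiLanguage, formattingLocale, strings }) {
  // Fall back to the UI language when no formatting locale is given.
  const effectiveLocale = formattingLocale || uiLanguage;
  const dateFormat = new Intl.DateTimeFormat(effectiveLocale, {
    dateStyle: "medium",
    timeZone: "UTC",
  });
  return {
    effectiveLocale,
    format(key, args) {
      // String retrieval depends only on the UI language.
      const template = strings[uiLanguage][key];
      // Placeholder formatting depends only on the effective formatting locale.
      return template.replace(/\{(\w+)\}/g, (_, name) => {
        const value = args[name];
        return value instanceof Date ? dateFormat.format(value) : String(value);
      });
    },
  };
}

const strings = { en: { shipped: "Your package was shipped on {date}." } };
const shippedOn = new Date(Date.UTC(2020, 2, 6));

// English strings with German date conventions.
const mixed = createFormatter({ uiLanguage: "en", formattingLocale: "de-DE", strings });
// No formatting locale given: dates follow the UI language.
const plain = createFormatter({ uiLanguage: "en", strings });

const mixedText = mixed.format("shipped", { date: shippedOn });
const plainText = plain.format("shipped", { date: shippedOn });
console.log(mixedText);
console.log(plainText);
```

Consumers who do not want to expose the second choice simply omit `formattingLocale`, and the behavior collapses to the single-locale case.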
Additional context or examples
This may result in mixed language UIs - where a string is translated per the language choice, and say a date placeholder is formatted according to the locale choice.
That's fine - so long as the user's expectations are properly managed, and they are not surprised.
Most OSs allow for this type of separation.