Symfony language code mangling makes it hard to reuse #2468
Comments
If we want to maintain BC, we could add an option that determines whether to return Unix or W3C "formatted" language codes.
It is not really easy to generate W3C formatted language codes BTW unless you have a database of components. Some examples of language codes as per the W3C standards (directly from the page I linked)
The capitalization depends on which component that part of the language code comes from, and components can be omitted. So you can only generate proper W3C language codes if you have a registry of which string belongs to which component. Drupal uses all-lowercase components but otherwise conforms to the W3C standard.
FYI, ZF2 uses Unix codes internally too. Seems natural to me. I agree with lsmith77 that this could be an option.
Some competitive analysis of systems mostly focusing on application level solutions. Ones using "web language tags":
Using Unix language codes:
Using a different approach:
Does this make the picture clearer? (I doubt it.)
For ICU: http://source.icu-project.org/repos/icu/icu/trunk/source/data/locales/ (used by PHP Intl).
Yeah, looking at the link, it looks like ICU uses a composition similar to the W3C's but with underscores. E.g. they use zh_Hant_HK, where the W3C would say zh-Hant-HK and Symfony would say zh_HANT_HK (note the case and underscore differences).
And ICU doesn't care about separators 👍
<?php
// Requires the intl extension. Both calls parse into the same components.
print_r(locale_parse('zh_Hant_HK'));
print_r(locale_parse('zh-Hant-HK'));
Two more data points from the two most popular mobile systems. All versions of iOS (and Mac OS X since Tiger (10.4), released 6 years ago) use BCP 47 language tags too (exactly as the W3C does): http://developer.apple.com/library/ios/#documentation/MacOSX/Conceptual/BPInternational/Articles/LanguageDesignations.html#//apple_ref/doc/uid/20002144-SW3 Android refers to ICU as its locale source, but it uses Unix language codes (e.g. zh_CN and zh_TW vs. ICU's zh_Hans and zh_Hant): http://developer.android.com/reference/java/util/Locale.html
stealth35: well, it's not just an underscore problem. If zh-Hant comes in, Symfony converts it to zh_HANT, but that is not a Unix language code. It should convert it to zh_TW if it used Unix language codes, right? (I also looked at the list of languages supported by Ubuntu at https://translations.launchpad.net/ubuntu.)
I would go with ICU only, as soon as possible. It is a standard used by many applications and won't change except to gain new codes. intl has been available by default since 5.3.0 and should be used for any work related to internationalization. (set|get)locale should be banned from any modern application: it is not portable, crashes more than it should, and is per process instead of per resource/object or at least per request.
@goba for ICU
stealth35: are you proposing that Symfony include the alias resolution process as part of the normalization? Currently Symfony would give you zh_CN or zh_HANS_CN depending on the incoming data and would not treat them as equal.
A few clarifications:
PHP's intl extension uses the ICU data. We've got pretty much built-in support for it, and I don't see why we should change (and break BC). This is an old issue and I'm gonna close it. I guess Drupal has solved this issue by now in their own way. If anyone's got more input we can always restart the discussion.
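To make the question concrete, here is a minimal sketch of what alias resolution on top of normalization could look like. The table and the function name `resolve_locale_alias` are hypothetical: the entries are only the pairs mentioned in this thread, not data shipped by Symfony or ICU.

```php
<?php
// Illustrative alias table. The entries are assumptions taken from the
// examples in this thread, not data shipped by Symfony or ICU.
const LOCALE_ALIASES = [
    'zh_HANS_CN' => 'zh_CN',
    'zh_HANT'    => 'zh_TW',
];

// Hypothetical helper: return the alias target if one is known,
// otherwise return the code unchanged.
function resolve_locale_alias(string $code): string
{
    return LOCALE_ALIASES[$code] ?? $code;
}

// With alias resolution, both spellings compare as equal.
var_dump(resolve_locale_alias('zh_HANS_CN') === resolve_locale_alias('zh_CN'));
```

Without such a step, zh_CN and zh_HANS_CN remain two distinct strings after normalization.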
Symfony mangles language codes for its internal use: it (a) uppercases all components but the first and (b) replaces the dashes that separate components with underscores. This may conform to some standard or common practice, but the W3C never suggested that people use language codes like that, and its recent recommendations with HTML5 are clearly very far from how Symfony treats language codes.
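The transformation described above can be sketched roughly as follows. This is an illustration of steps (a) and (b) only, not Symfony's actual code; the function name `mangle_language_code` is made up for this example, and leaving the first component exactly as given is an assumption.

```php
<?php
// Hypothetical helper illustrating the transformation described above;
// not Symfony's actual implementation.
function mangle_language_code(string $code): string
{
    // Split on either separator so dashed and underscored input behave alike.
    $parts = preg_split('/[-_]/', $code);

    // Assumption: the first component (the language) is kept as given.
    $language = array_shift($parts);

    // (a) Uppercase every remaining component.
    $rest = array_map('strtoupper', $parts);

    // (b) Join the components with underscores instead of dashes.
    return implode('_', array_merge([$language], $rest));
}

echo mangle_language_code('zh-Hant-HK'), "\n"; // zh_HANT_HK
```

Note that the script component comes out as HANT, which matches neither the W3C form (zh-Hant-HK) nor the ICU form (zh_Hant_HK).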
There is certainly nothing wrong with picking a language code format and converting incoming and outgoing data. We in Drupal try to avoid this by using a standard that is much closer to the W3C specs. We use all-lowercase codes (which is not strictly W3C) and we use dashes (like the W3C). Now that Drupal 8 is adopting Symfony in certain places, we cannot reuse the language handling code at all, since the language codes are so far from how the web expects them, and we don't want to convert back and forth if we can work with formats that exactly match, or at least much more closely resemble, web standards.
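For comparison, the Drupal convention described above (all lowercase, dash-separated) can be sketched in a couple of lines. The function name `to_drupal_langcode` is hypothetical and this is not Drupal's actual code. Because BCP 47 tags are compared case-insensitively, the lowercase form still matches the W3C spelling under a case-insensitive comparison.

```php
<?php
// Hypothetical helper: normalize a code to the lowercase, dash-separated
// form described above. Not Drupal's actual implementation.
function to_drupal_langcode(string $code): string
{
    return strtolower(str_replace('_', '-', $code));
}

echo to_drupal_langcode('zh_HANT_HK'), "\n"; // zh-hant-hk

// Case-insensitive comparison against the W3C spelling succeeds.
var_dump(strcasecmp(to_drupal_langcode('zh_HANT_HK'), 'zh-Hant-HK') === 0);
```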
More information on language codes on the web: http://www.w3.org/International/articles/language-tags/