Add :date and :time functions #570

Merged
merged 4 commits into from
Feb 19, 2024

eemeli
Collaborator

@eemeli eemeli commented Dec 17, 2023

Add :date and :time function aliases for the :datetime formatter, and map their style options correspondingly (requires a new <mapOption> element).

At the moment, to format a time, we require an expression

{$x :datetime timeStyle=short}

With the changes here, we introduce

{$x :time}

as a synonym for the above, and allow e.g.

{$x :time style=long}

as a synonym for

{$x :datetime timeStyle=long}

The :date function has a corresponding definition.

These functions provide three core benefits:

  1. Clarify, compared to :datetime, whether the placeholder formats only a date, only a time, or possibly both.
  2. Simplify the options presented for most date and time formatting.
  3. For the common cases, shorten expressions by 20 characters.
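
As a rough illustration of the mapping described above, the sketch below rewrites a :date or :time call into the equivalent :datetime call. The `Expression` shape, the `toDatetime` helper, and the `short` default for :date are assumptions made for this example; only the equivalence of `{$x :time}` with `timeStyle=short` comes from the description.

```typescript
// Hypothetical data model for a placeholder; not part of the registry or spec.
type Options = Record<string, string>;

interface Expression {
  fn: "date" | "time" | "datetime";
  options: Options;
}

// Style used when :date or :time carries no `style` option.
// `short` for :time follows the PR description; reusing it for :date is an assumption.
const DEFAULT_STYLE = "short";

// Rewrite :date / :time into the :datetime form with dateStyle / timeStyle.
function toDatetime(expr: Expression): Expression {
  if (expr.fn === "datetime") return expr;
  const { style = DEFAULT_STYLE, ...rest } = expr.options;
  const styleKey = expr.fn === "date" ? "dateStyle" : "timeStyle";
  return { fn: "datetime", options: { ...rest, [styleKey]: style } };
}

console.log(toDatetime({ fn: "time", options: {} }));
console.log(toDatetime({ fn: "time", options: { style: "long" } }));
console.log(toDatetime({ fn: "date", options: { style: "full" } }));
```

Any other options are passed through untouched, so what happens when field options are combined with a style is left to the :datetime side.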

Edit: Refactored as functions instead of aliases.

@eemeli eemeli added the functions Issue pertains to the default function set label Dec 17, 2023
@eemeli eemeli requested review from aphillips and stasm December 17, 2023 10:50
@stasm
Collaborator

stasm commented Dec 17, 2023

I'm not a fan of extending the default set of formatters too much too early. Could we reliably add these aliases in 2.x?

@stasm
Collaborator

stasm commented Dec 17, 2023

At the moment, to format a time, we require an expression

{$x :datetime timeStyle=short}

With the changes here, we introduce

{$x :time}

I don't see anything wrong with the first example. It's clear and consistent with the rest of the registry. In fact, the addition of mapOption increases the complexity of the registry definitions by effectively denormalizing the data to allow {$x :time style=long}.

@macchiati
Member

macchiati commented Dec 17, 2023 via email

@aphillips
Member

@stasm noted:

In fact, the addition of mapOption increases the complexity of the registry definitions by effectively denormalizing the data to allow {$x :time style=long}.

I agree that it does. I think @eemeli is right not to reuse dateStyle and timeStyle though. One weirdness of MF1 is that you can say {d,date,::jm} (it's formatting a time) or {t,time,::yMMMd} (it's formatting a date). Technically you could do the same thing with :date and :time, but you have to use the big bag of options (or :icu:skeleton 😍) to get to that.

I think it is also a useful test case for the registry's syntax. I don't mind the right level of complexity in the registry, if the result is a set of formatting and selector functions that feel natural and intuitive to users writing messages. I think that's the most important thing here.

@macchiati noted:

I agree with Stanisław; this is not necessary, and could be added later if needed.

I kind of disagree, or, more specifically... I agree that this is not required in order for us to ship. However, we have a good bit of experience with date and number formatting and we want our syntax to be usable and attractive.

In MF1's syntax, the {d, date} and {t,time} formats allow easy access to just-date and just-time formats. A date-time format, however, requires a picture string (yuck) or skeleton (ICU only). In MF2's syntax, :datetime provides date-only by default and requires one option to get only a time (to style the time and imply suppressing the date) and two options to get both a date and time (assuming that :datetime works exactly like Intl.DateTimeFormat).

{$d :datetime} => 12/17/2023
{$d :datetime timeStyle=short} => 10:44 a.m.
{$d :datetime dateStyle=short timeStyle=short} => 12/17/2023 10:44 a.m.
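
The Intl.DateTimeFormat behavior that these three lines assume can be checked directly. This is a minimal sketch, assuming an en-US locale and a fixed UTC instant chosen for the example; the exact strings depend on the CLDR data shipped with the runtime, so no output is promised here.

```typescript
// Fixed instant and time zone so the result does not depend on the host clock.
const instant = new Date(Date.UTC(2023, 11, 17, 10, 44));
const base = { timeZone: "UTC" } as const;

// No style or field options: Intl.DateTimeFormat formats a numeric date only.
const dateOnly = new Intl.DateTimeFormat("en-US", base).format(instant);

// timeStyle alone: only the time is formatted.
const timeOnly = new Intl.DateTimeFormat("en-US", { ...base, timeStyle: "short" }).format(instant);

// Both styles: date and time together.
const dateAndTime = new Intl.DateTimeFormat("en-US", {
  ...base,
  dateStyle: "short",
  timeStyle: "short",
}).format(instant);

console.log(dateOnly);
console.log(timeOnly);
console.log(dateAndTime);
```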

Dates by themselves and times by themselves are not uncommon. Shorthands for both will be useful. Wouldn't you rather start here:

{$d :datetime} => 12/17/2023 10:44 a.m.
{$d :date} => 12/17/2023
{$d :time} => 10:44 a.m.

To be honest, I adore date skeletons (to the point of telling people that they should always use them), but we already decided as a working group that imposing support for them is a bridge too far.


I semi-agree with @eemeli's comment on #558 that we need to have a specific discussion about the default registry. We need to focus on the 2.0 registry just now and that's why I created #564--so that we can discuss the default registry "all up" rather than piecemeal.

If we need extra calls or "fast track" design documents to accomplish this, I'm all for it. I particularly think folks should write a bunch of messages to get some "finger feel" for it. I also think this PR might be premature vs. having a design doc for date/time.

Finally, I am concerned that we also do some forward-looking thinking about functions that won't make the cut for 2.0 but which we don't want people to run off and innovate different syntaxes for. Skeletons is one. The flock of time and measurement related functions in ICU is another. And personal names. I think post-2.0 we should have a shadow registry of candidates for extensions to help folks do the right things there.

@mihnita
Collaborator

mihnita commented Dec 18, 2023

In MF2's syntax, :datetime provides date-only by default
{$d :datetime} => 12/17/2023

I don't think that's specified anywhere.

One weirdness of MF1 is that you can say {d,date,::jm} (it's formatting a time) or {t,time,::yMMMd} (it's formatting a date).
And it comes from the fact that it was designed with date + style, or time + style in mind.
The skeleton came later, and broke that.
But it was broken before: you could specify a pattern, and it was not possible to specify date + time. It was either date, or time.

So I think adding aliases here is not a good idea.
What happens if I say {$foo :time year=numeric month=full}? Error?
We only introduce complexity to save typing a few characters.

@aphillips
Member

@mihnita noted:

What happens if I say {$foo :time year=numeric month=full}? Error?
We only introduce complexity to save typing a few characters.

I think the same thing happens as in MF1. The function names date and time are just gateways to DateFormat, so the options override the default meaning of the keywords.

I don't think that's specified anywhere.

It is in Intl.DateTimeFormat. We don't have to follow them, but we probably should specify what happens by default. If we don't, messages will do different things in different runtimes (== developers have to keep track of what their local implementation does, which is bad... which is why we're having this conversation 😈 )
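
For comparison only, this is how Intl.DateTimeFormat itself treats a style option combined with field options: the constructor throws a TypeError. The sketch says nothing about what :date and :time should do in MF2. The `tryFormatter` helper is hypothetical, and `month: "long"` stands in for the `month=full` of the question, since Intl has no `full` month value.

```typescript
// Hypothetical helper: report whether Intl accepts a given option bag.
function tryFormatter(options: Intl.DateTimeFormatOptions): string {
  try {
    new Intl.DateTimeFormat("en-US", options);
    return "ok";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "other error";
  }
}

// A style on its own is accepted.
const styleOnly = tryFormatter({ timeStyle: "long" });

// A style mixed with explicit fields is rejected by the constructor.
const styleWithFields = tryFormatter({ timeStyle: "long", year: "numeric", month: "long" });

console.log(styleOnly, styleWithFields);
```

If MF2 wants the "options override the keyword" behavior described above, an implementation built on Intl would have to drop the style itself before calling the constructor.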

@eemeli eemeli changed the title Add :date and :time aliases Add :date and :time functions Jan 15, 2024
Comment on lines 205 to 209
<option name="hourCycle" values="h11 h12 h23 h24">
<description>
The hour cycle to use.
</description>
</option>
Member

This should probably be hour12 not hourCycle

Member

Why?

FWIW, I've always found 402's option name hour12 to be weird. It only describes one or maybe two of the options (and the more common override direction is from 12 to 24 hour)

Collaborator

hour12 sounds like a boolean.

CLDR and the -u- extension have 4 possible values: h11, h12, h23, h24.
So hour12 as the name of the option is problematic (can I do hour12=h11 ?)

Also hourCycle can be "default" or "auto" (meaning the default for the given locale).
Which again don't work well with a boolean.

Member

hour12 means you want AM/PM or not, but it doesn't express a preference between h11/h12 or between h23/h24, instead leaving that decision up to the locale. It could be the case that a locale prefers h11 for its AM/PM format, but it defaults to h23, for example.
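
The difference between the two options can be seen in Intl.DateTimeFormat, which exposes both. This is a small sketch, assuming the en-US and ja-JP locales purely as examples; `resolvedCycle` is a hypothetical helper, and which cycle a locale picks for `hour12` depends on the runtime's locale data.

```typescript
// Hypothetical helper: ask Intl which hour cycle it resolved for the given options.
function resolvedCycle(locale: string, options: Intl.DateTimeFormatOptions): string | undefined {
  return new Intl.DateTimeFormat(locale, { hour: "numeric", ...options }).resolvedOptions().hourCycle;
}

// hourCycle is an explicit override: the caller picks one of the four values.
const explicitCycle = resolvedCycle("en-US", { hourCycle: "h23" });

// hour12 is a boolean preference: the locale chooses between h11/h12 or h23/h24.
const prefers12 = resolvedCycle("ja-JP", { hour12: true });
const prefers24 = resolvedCycle("en-US", { hour12: false });

console.log(explicitCycle, prefers12, prefers24);
```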

Member

@aphillips aphillips Feb 14, 2024

hourCycle is an override option. The default is auto but there is no need to specify auto. You only need the cycle in order to change it.

I could see adding hour12 or hour24 as a shorthand to coerce locales from 11/12 to use 23/24. This has a nice "skeleton-y" quality about it, since it doesn't require users to pass a specific hc when it should partly depend on the locale when coercing the other behavior.

Comment on lines 152 to 160
<option name="calendar" values="buddhist chinese coptic dangi ethioaa ethiopic gregory hebrew indian islamic islamic-umalqura islamic-tbla islamic-civil islamic-rgsa iso8601 japanese persian roc">
<description>
Calendar to use.
</description>
</option>
<option name="numberingSystem" values="arab arabext bali beng deva fullwide gujr guru hanidec khmr knda laoo latn limb mlym mong mymr orya tamldec telu thai tibt">
<description>
Numbering system to use.
</description>
Member

While the numbering system and calendar are useful configurations, I think they should come from the locale and not from the API options.

Member

The numbering system and the calendar come from the locale (almost all of the settings for date/time come from the locale by default). These are override options.


<formatSignature>
<input validationRule="iso8601"/>
<option name="style" values="full long medium short">
Member

Are you going to describe in any amount of detail what the difference is between these four styles?

Member

No. 😉 🙈

This is what CLDR is for :-)

On a higher level, we do not prescribe any output from any function. An implementation is free to implement options in any way that it sees fit, including doing nothing at all. They just have to accept all of the options and values that they are required to accept by the spec.

They can also extend the options as they see fit (we recommend using namespaced options when doing so).

Users will complain if the options don't work or work badly, but that's a problem for the implementer, not MFv2.

Member

This general strategy seems problematic. It means that whatever ICU implements will become the de-facto standard behavior. If ICU4X or ECMA or Ruby or Scala implement a different semantic meaning for the same formatting options, users will file bugs that their behavior is different than on the canonical ICU implementation, even if those other implementations are spec-compliant. It is the spec's role to prevent semantic differences between implementations.

Member

This general strategy already exists. FWIW, different versions of CLDR change what these shorthands mean or what specific data values are. Different locales handle these differently. CLDR changes formatting details all the time.

I tend to tell developers not to worry about--or depend on--the specific output in any specific locale.

But this is one reason why I love date skeletons and actively told users for the past five or eight years never to use the shorthand keywords. Tell the system what you want and DateTimePatternGenerator will build you the appropriate pattern in every locale. The translators don't have to mess with it and you'll never have that weird bug in some language in which short has some non-numeric gunk in it or whatever. I would submit that a "medium" date has no semantic meaning (the fact that you want me to describe it says that to be true!), while skeleton=EEEyyyyMMMd (<= not meant to be a medium date, eh?) says what I want and the results will contain no surprises.
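
The "say which fields you want and let the locale build the pattern" idea can be approximated with Intl.DateTimeFormat field options. This is a sketch under the assumption that weekday/year/month/day options are a fair stand-in for the fields named by skeleton=EEEyyyyMMMd; it is not an ICU skeleton API, and the locales and the `partTypes` helper are illustrative choices.

```typescript
// Rough Intl counterpart of the fields in EEEyyyyMMMd (assumed mapping).
const skeletonLikeFields: Intl.DateTimeFormatOptions = {
  weekday: "short",
  year: "numeric",
  month: "short",
  day: "numeric",
  timeZone: "UTC",
};
const sampleDate = new Date(Date.UTC(2023, 11, 17));

// Hypothetical helper: list the field types the locale's pattern ended up with.
function partTypes(locale: string): string[] {
  return new Intl.DateTimeFormat(locale, skeletonLikeFields)
    .formatToParts(sampleDate)
    .map((part) => part.type)
    .filter((type) => type !== "literal");
}

// The same request yields a locale-appropriate order and punctuation each time.
for (const locale of ["en-US", "de-DE", "ja-JP"]) {
  console.log(locale, new Intl.DateTimeFormat(locale, skeletonLikeFields).format(sampleDate));
}
```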

I think I read into a lot of your comments this week a disconnect between MFWG's approach to formats and your expectations. MF2 does not implement any formatting at all. It delegates the formatting. Our options are user-facing and exist to let developers/translators communicate with the runtime what their formatting needs are. I hear the number or date formatter implementer in you--and value those comments hugely for keeping us honest. Many of our implementers, though, won't own the Intl equivalent code. They'll call platform I18N APIs using the options we provide (appropriately mapped) and their implementations should not require them to mirror ICU76 or a specific Intl version exactly. They can depend on system APIs for numbers and dates, at least mostly, I hope.

So... in this case, the keywords are widely recognized and CLDR even provides for them. But if Rust or Ruby want to impute different meanings or slightly different implementations, again, users will complain if the results are wacky or wholly unexpected--or if they like the innovation, they'll complain to Annemarie et al about CLDR! Even CLDR has bugs in these cases.

Member

@sffc sffc left a comment

^

@aphillips aphillips added the LDML45 LDML45 Release (Tech Preview) label Feb 15, 2024
Member

@aphillips aphillips left a comment

Per design doc accepted today 2024-02-15, needs to remove a bunch of options.

@eemeli eemeli requested review from aphillips and sffc February 16, 2024 19:27
Comment on lines +11 to +14
<validationRule id="xmlDate"
regex="-?([1-9][0-9]{3,}|0[0-9]{3})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"/>
<validationRule id="xmlTime"
regex="(([01][0-9]|2[0-3])(:[0-5][0-9]){2}(\.[0-9]+)?|24:00:00(\.0+)?)(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?"/>
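
The two patterns quoted above can be exercised as follows. This is a sketch in which the patterns are copied from the diff and wrapped in `^` and `$`; requiring a whole-string match is an assumption about how a validationRule is meant to be applied, and `matchesRule` is a hypothetical helper.

```typescript
// Patterns copied from the quoted validationRule elements, anchored for a full match.
const xmlDate =
  /^-?([1-9][0-9]{3,}|0[0-9]{3})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?$/;
const xmlTime =
  /^(([01][0-9]|2[0-3])(:[0-5][0-9]){2}(\.[0-9]+)?|24:00:00(\.0+)?)(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?$/;

// Hypothetical helper: check an operand string against one rule.
function matchesRule(rule: RegExp, operand: string): boolean {
  return rule.test(operand);
}

console.log(matchesRule(xmlDate, "2023-12-17"));          // date without zone
console.log(matchesRule(xmlDate, "2023-12-17T10:44:00")); // combined date and time
console.log(matchesRule(xmlTime, "10:44:00"));            // time with seconds
console.log(matchesRule(xmlTime, "10:44"));               // time without seconds
```

The patterns check lexical shape only: a string such as 2023-02-30 passes xmlDate, and neither rule accepts a combined date and time value.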
Member

You're missing at least one production here, unless you think iso8601 covers that? (using a different syntax??)

Is there some reason we can't reference XMLSchema directly?

Collaborator Author

In case you're referring to the :datetime operand, changing its validation rule here would be scope creep.

To refer to external specs, we need language for how to do that, and that's not currently supported by registry.dtd.

Member

Agreed. xsd recommends itself, except I'm not sure how compatible the number types are with JSON number (in fact, I'm sure they aren't compatible)

I suspect we should make the registry's format completely unpromised in Tech Preview--even the use of XML. The normative bits are the default registry prose right now anyway. Let's solve this good and proper--and not over the weekend. I don't think anyone's implementation depends on the XML file now. We shouldn't encourage anyone to depend on it being any particular way in the future.

Thoughts?

Member

@aphillips aphillips left a comment

... but look at linking schema instead of trying to profile it.

@macchiati
Member

macchiati commented Feb 16, 2024 via email

@macchiati
Member

macchiati commented Feb 17, 2024 via email

@aphillips aphillips merged commit bb93fa6 into unicode-org:main Feb 19, 2024
@eemeli eemeli deleted the date-time-aliases branch February 19, 2024 15:59
Labels
functions Issue pertains to the default function set LDML45 LDML45 Release (Tech Preview)
6 participants