Allow options on closing markup placeholders #582

Merged Feb 13, 2024 (3 commits)

stasm
Collaborator

stasm commented Jan 8, 2024

As a follow-up to #541 (review), we'd like to allow options on "close" markup placeholders.

The main use-case is tooling and XLIFF interchange. See also #450 which proposes a different mechanism for this (but hasn't yet been accepted).

stasm requested review from aphillips and eemeli January 8, 2024 16:50
Collaborator

eemeli left a comment

I do not find the use case for this compelling.

Having written, and now maintaining, a potential bidirectional MF2-XLIFF2 conversion, I find <ec startRef> sufficient on its own: it does not create a need for MF2 to include a feature like this, especially as the baskets of options used by MF2 will in any case need to be maintained separately from the <sc>/<ec> pair, or from a similar structure in other message formatting languages.

I believe that if we later do find from deeper work in such conversion tooling that we need something like this in MF2, then that conversation can and should be informed by example messages through which to consider this.

@aphillips
Member

@eemeli I think a question might be: do we have sufficient reason to prohibit it? Instead we might allow implementations to prohibit it or specific markup regimes to prohibit it (where "it" == options on close).

aphillips added the LDML45 LDML45 Release (Tech Preview) label Jan 8, 2024
Member

aphillips left a comment

Some tweaks.

@mihnita
Collaborator

mihnita commented Jan 29, 2024

Some use cases for options on closing markup placeholders:

In general, anything that allows "incorrect nesting" (which might not be incorrect on the given platform) and uses the same tag name. This is similar to <span> in HTML, which can be used to tag things with styles.


A real example would be certain Android spans.
StyleSpan is used to represent bold, italic, and bold_italic.
And since these are annotated ranges, not really open-close tags, something like this is perfectly valid: ...<b>...<i>...</b>...</i>...

Similar to this in HTML:
...<span class='foo'>....<span class='bar'>...</span>...</span>...

In HTML I have no way to say that the first </span> closes the foo span.
I have to refactor it into something like:
...<span class='foo'>....</span><span class='foo bar'>...</span><span class='bar'>...</span>...

Sure, one can say: then refactor the Android code to do the same.
But this is unfriendly.
The Android spans are legal as they are.
We should not say: you can't do this in Android / spans because it is illegal in HTML.
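
The HTML refactoring described above can be sketched in a few lines of Python. This is only an illustration of the workaround, not code from any MF2 or Android library: `flatten_spans` and `to_html` are hypothetical helper names, and the span format `(start, end, class_name)` with an exclusive end is an assumption made for the example.

```python
import html


def flatten_spans(text, spans):
    """Split `text` at every span boundary and return (segment, classes) pairs.

    `spans` is a list of (start, end, class_name) tuples, end exclusive.
    Overlapping spans are allowed; offsets are assumed to lie within `text`.
    """
    # Every position where the set of active classes can change.
    cuts = sorted({0, len(text), *(p for s, e, _ in spans for p in (s, e))})
    segments = []
    for lo, hi in zip(cuts, cuts[1:]):
        # A class is active on a segment if its span fully covers the segment.
        active = [name for s, e, name in spans if s <= lo and hi <= e]
        segments.append((text[lo:hi], active))
    return segments


def to_html(segments):
    """Emit one properly nested <span> per segment that has any class."""
    out = []
    for chunk, classes in segments:
        chunk = html.escape(chunk)
        if classes:
            out.append(f"<span class='{' '.join(classes)}'>{chunk}</span>")
        else:
            out.append(chunk)
    return "".join(out)


# "foo" covers offsets 2..6 and "bar" covers 4..8, so they overlap on 4..6.
segments = flatten_spans("aaBBccDDee", [(2, 6, "foo"), (4, 8, "bar")])
print(to_html(segments))
```

The two overlapping ranges become three adjacent spans, which is exactly the restructuring that a platform with annotated ranges does not otherwise need.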

This also happens in the iOS / macOS CFAttributedString


Another use case would be ANSI escapes.
For example (the string is inside the double quotes, but I am making it easier to copy/paste in terminal):

echo -e "\e[93m this is yellow \e[46m cyan background \e[1;3;4m bold and italic and underline \e[39m default foreground \e[24m end of underline \e[m reset all attributes"

Not only is there no requirement for correct nesting, but some "closing" escapes (like the reset one, \e[m) can close several open markers. Or the end-of-underline (24) can close just one part of the open bold-italic-underline (1;3;4).

So one might want to encode some info about this, some grouping tag, that informs how a remove or a clone might work.
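
To make the "one escape closes several markers" behaviour concrete, here is a small Python sketch that tracks which attributes are active after each SGR sequence in the example string. The helper names `apply_sgr` and `trace` are made up for this illustration, and only the handful of SGR codes used above (plus their obvious counterparts) are handled. Python has no `\e` escape, so the string is written with `\x1b`.

```python
import re

# Matches one SGR sequence such as ESC[93m, ESC[1;3;4m or ESC[m.
SGR = re.compile(r"\x1b\[([0-9;]*)m")


def apply_sgr(state, params):
    """Return a new attribute dict after applying one SGR parameter string."""
    state = dict(state)
    # An empty parameter (as in ESC[m) means 0, i.e. reset.
    for code in (int(p) if p else 0 for p in params.split(";")):
        if code == 0:
            state = {}  # a single reset closes everything that is open
        elif code == 1:
            state["bold"] = True
        elif code == 3:
            state["italic"] = True
        elif code == 4:
            state["underline"] = True
        elif code == 22:
            state.pop("bold", None)
        elif code == 23:
            state.pop("italic", None)
        elif code == 24:
            state.pop("underline", None)
        elif 30 <= code <= 37 or 90 <= code <= 97:
            state["fg"] = code
        elif code == 39:
            state.pop("fg", None)
        elif 40 <= code <= 47 or 100 <= code <= 107:
            state["bg"] = code
        elif code == 49:
            state.pop("bg", None)
    return state


def trace(s):
    """Return (text, sorted active attribute names) for each run of plain text."""
    state, pos, out = {}, 0, []
    for m in SGR.finditer(s):
        if m.start() > pos:
            out.append((s[pos:m.start()], sorted(state)))
        state = apply_sgr(state, m.group(1))
        pos = m.end()
    if pos < len(s):
        out.append((s[pos:], sorted(state)))
    return out


SAMPLE = ("\x1b[93m this is yellow \x1b[46m cyan background "
          "\x1b[1;3;4m bold and italic and underline \x1b[39m default foreground "
          "\x1b[24m end of underline \x1b[m reset all attributes")

for text, active in trace(SAMPLE):
    print(active, repr(text))
```

In the trace, `24` removes only `underline` from the group opened by `1;3;4`, and the final `\x1b[m` clears background, bold and italic at once.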

@mihnita
Collaborator

mihnita commented Jan 29, 2024

With a great sample of two :-), there are also people questioning the HTML decision:
https://stackoverflow.com/questions/4138006/can-i-have-attributes-on-closing-tags

There is the user asking the question, and the last commenter, who says "from this HTML end-user's view it seems absolutely absurd that a closing DIV can't be easily identified" (that is how I counted two).

Most people there said no and offered workarounds. I didn't see anyone saying "what you are asking is absurd".

Also
https://stackoverflow.com/questions/3599936/attributes-in-elements-closing-tag
https://stackoverflow.com/questions/8289137/html-id-attribute-in-close-tag
https://stackoverflow.com/questions/8887015/closing-tag-with-id-property

@eemeli
Collaborator

eemeli commented Feb 1, 2024

> A real example would be certain Android spans. StyleSpan is used to represent bold, italic, and bold_italic. And since these are annotated ranges, not really open-close tags, something like this is perfectly valid: ...<b>...<i>...</b>...</i>...

A message using such StyleSpans could easily be represented in MF2 as:

...{#StyleSpan:BOLD}...{#StyleSpan:ITALIC}...{/StyleSpan:BOLD}...{/StyleSpan:ITALIC}...

i.e. quite similarly to how it's represented in your example HTML.

> Another use case would be ANSI escapes. For example (the string is inside the double quotes, but I am making it easier to copy/paste in terminal):
>
> echo -e "\e[93m this is yellow \e[46m cyan background \e[1;3;4m bold and italic and underline \e[39m default foreground \e[24m end of underline \e[m reset all attributes"

I would represent this in MF2 as (hard-wrapped for legibility):

{#ansi:fg color=yellow style=bright} this is yellow {#ansi:bg color=cyan} cyan background
{#ansi:bold}{#ansi:italic}{#ansi:underline} bold and italic and underline
{#ansi:fg color=default} default foreground {/ansi:underline} end of underline
{/ansi:reset} reset all attributes

I agree that the above markup doesn't nest like HTML or XML, but I don't see a need here for closing-markup options. The reset in particular resets everything. Alternatively, it's also possible to express this with HTML-like nesting, provided that we assume the message's formatting starts in a reset state:

{#ansi:fg color=yellow style=bright} this is yellow {#ansi:bg color=cyan} cyan background
{#ansi:bold}{#ansi:italic}{#ansi:underline} bold and italic and underline
{/ansi:fg} default foreground {/ansi:underline} end of underline
{/ansi:italic}{/ansi:bold}{/ansi:bg} reset all attributes

@mihnita
Collaborator

mihnita commented Feb 5, 2024

> A message using such StyleSpans could easily be represented in MF2 as:

Think of HTML <span class='foo'> vs <span class='bar'>.
Such a span has an unlimited number of possible values for its class.
We can't cover that in a registry.

@stasm
Collaborator Author

stasm commented Feb 5, 2024

> > A real example would be certain Android spans. StyleSpan is used to represent bold, italic, and bold_italic. And since these are annotated ranges, not really open-close tags, something like this is perfectly valid: ...<b>...<i>...</b>...</i>...
>
> A message using such StyleSpans could easily be represented in MF2 as:
>
> ...{#StyleSpan:BOLD}...{#StyleSpan:ITALIC}...{/StyleSpan:BOLD}...{/StyleSpan:ITALIC}...

But this hardcodes additional information into the tag name. Instead, consider:

...{#StyleSpan id=1}...{#StyleSpan id=2}...{/StyleSpan id=1}...{/StyleSpan id=2}...

in which we only encode the identifier of the spannable, and then let the callsite add concrete formatting attributes (possibly more than one) to each of them, by id.
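
A rough Python sketch of how a callsite might consume such a message follows. It assumes a hypothetical, already-parsed representation (plain strings interleaved with `(kind, name, options)` tuples); this is not the data model of any actual MF2 implementation, and `resolve_ranges` is a name invented for the example.

```python
def resolve_ranges(parts):
    """Pair open/close markup by (name, id) and return (plain_text, ranges).

    `parts` is a list whose items are either literal strings or tuples
    ("open" | "close", name, options), where options is a dict holding "id".
    `ranges` maps (name, id) to (start, end) offsets in the plain text.
    """
    text = []
    length = 0
    open_at = {}
    ranges = {}
    for part in parts:
        if isinstance(part, str):
            text.append(part)
            length += len(part)
            continue
        kind, name, options = part
        key = (name, options["id"])
        if kind == "open":
            open_at[key] = length
        else:
            # Raises KeyError if a close has no matching open with that id.
            ranges[key] = (open_at.pop(key), length)
    return "".join(text), ranges


# Same shape as the id=1 / id=2 message above, with overlapping ranges.
parts = [
    "a ",
    ("open", "StyleSpan", {"id": "1"}),
    "b ",
    ("open", "StyleSpan", {"id": "2"}),
    "c ",
    ("close", "StyleSpan", {"id": "1"}),
    "d ",
    ("close", "StyleSpan", {"id": "2"}),
    "e",
]

plain, ranges = resolve_ranges(parts)

# The callsite, not the message, decides what each id means.
styles = {"1": ["bold"], "2": ["italic"]}
for (name, span_id), (start, end) in ranges.items():
    print(name, span_id, styles[span_id], repr(plain[start:end]))
```

Without the `id` on the close, a consumer seeing two open `StyleSpan` placeholders would have to guess which one each `{/StyleSpan}` ends, typically by assuming stack-like nesting, which is the very assumption that overlapping ranges break.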

@eemeli
Collaborator

eemeli commented Feb 5, 2024

> But this hardcodes additional information into the tag name.

Yes, matching the small set of StyleSpan styles:

> Possible styles are: Typeface#NORMAL, Typeface#BOLD, Typeface#ITALIC and Typeface#BOLD_ITALIC.

Is there any Android span that might make sense to have overlapping ranges, and which does not have such a hardcoded short list of variants? That might make this a more convincing example.

> Instead, consider:
>
> ...{#StyleSpan id=1}...{#StyleSpan id=2}...{/StyleSpan id=1}...{/StyleSpan id=2}...
>
> in which we only encode the identifier of the spannable, and then let the callsite add concrete formatting attributes (possibly more than one) to each of them, by id.

While I continue to not be convinced here, I would like to note that as far as I can tell, the only argument being made for options on markup-close is an equivalent of the XLIFF 2 <ec startRef>. That sounds to me more like a use case for an @id attribute than an id option.

Are there any other potential use cases for options on markup-close that should be accounted for?

@mihnita
Collaborator

mihnita commented Feb 7, 2024

> Is there any Android span that might make sense to have overlapping ranges, and which does not have such a hardcoded short list of variants? That might make this a more convincing example.

Of course.
For example TextAppearanceSpan

See here for complete list (click "and 10 others" for a summary table):
https://developer.android.com/reference/android/text/ParcelableSpan

There are many in that list with "an unlimited number of options", but TextAppearanceSpan is probably the closest one to the <span> in HTML.

And one can implement its own spans: https://developer.android.com/develop/ui/views/text-and-emoji/spans#custom-spans


iOS and macOS have similar attributes, including the ability to create custom ones.


> That sounds to me more like a use case for an @id attribute than an id option.

Potatoes / potato ;-)
I don't think there is a fundamental difference between the two.
Except that @ attributes are "universal", don't have to be registered for a certain function, they work everywhere.

One can register markup in the html namespace that behaves exactly like HTML, and say that the closing tag takes no parameters.
That does not get in the way of anything.
Allowing @id on the closing of such markup makes things less like HTML.


> That sounds to me more like a use case for an @id attribute than an id option.

And also, note the ANSI escapes.


I think we already have some decent use cases.
Compelling for some, maybe not for others, that's ok.

Our lack of imagination should not prevent people from doing what they need, now or 5 years from now.
Allowing these attributes does not prevent one from creating markup that does not take such attributes.

I think that in general our philosophical position was to not take away freedoms unless we have reasonable arguments that allowing something is bad i18n.

aphillips merged commit e55f91e into main Feb 13, 2024
aphillips deleted the markup-close-options branch February 13, 2024 16:40
Labels
LDML45 LDML45 Release (Tech Preview)

4 participants