Left-to-right mark: Difference between revisions

From Wikipedia, the free encyclopedia
Revision as of 19:17, 20 May 2015

The left-to-right mark (LRM) is a control character (an invisible formatting character) used in computerized typesetting (including word processing in a program like Microsoft Word) of text that contains a mixture of left-to-right text (such as English and Russian) and right-to-left text (such as Arabic, Persian and Hebrew). It is used to set the way adjacent characters are grouped with respect to text direction.

Unicode

In Unicode, the LRM is encoded as U+200E LEFT-TO-RIGHT MARK (HTML entity &lrm;). Its UTF-8 encoding is the byte sequence E2 80 8E. Its usage is prescribed by the Unicode Bidirectional Algorithm (the "bidi" algorithm).
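
The encoding facts above can be checked with a short script. This is only an illustrative sketch using Python's standard `unicodedata` module; the variable names are chosen here for the example and are not part of any standard.

```python
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK

# Official Unicode character name.
lrm_name = unicodedata.name(LRM)

# UTF-8 encoding, formatted as space-separated uppercase hex bytes.
lrm_utf8 = " ".join(f"{byte:02X}" for byte in LRM.encode("utf-8"))

# Bidirectional class: "L" means strongly left-to-right.
lrm_bidi_class = unicodedata.bidirectional(LRM)

print(lrm_name)        # LEFT-TO-RIGHT MARK
print(lrm_utf8)        # E2 80 8E
print(lrm_bidi_class)  # L
```

The bidirectional class "L" is what lets the invisible character act as an anchor of left-to-right text.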

Example of use in HTML

Suppose a writer wishes to insert some English text (left-to-right) into a paragraph written in Arabic or Hebrew (right-to-left), with non-alphabetic characters at the right end of the English text. For example, the writer wants to translate "The language C++ is a programming language used..." into Arabic while keeping "C++" in Latin script. Without an LRM control character, the result looks like this:

‫ لغة C++ هي لغة برمجة تستخدم...

With an LRM entered in the HTML after the ++, it looks like this, as the writer intends:

‫ لغة C++‎ هي لغة برمجة تستخدم...

In the first example, without an LRM control character, a web browser renders the ++ to the left of the "C". The browser recognizes that the paragraph is in a right-to-left script (Arabic), and punctuation such as "+" has no inherent direction of its own: it takes its direction from the adjacent text. Here the ++ sits between left-to-right text (the "C") and right-to-left text (the Arabic), so it follows the right-to-left direction of the paragraph. In the second example, the LRM control character makes the punctuation adjacent only to left-to-right text (the "C" and the LRM), so it is positioned as if it were in left-to-right text, i.e., to the right of the preceding text.
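
The reasoning above can be sketched in code. The following is a deliberately simplified model, not the full Unicode bidi algorithm: it ignores embedding levels, numbers, and explicit formatting characters, and only asks which strongly directional characters surround a run of punctuation. The helper names (`strong_direction`, `run_direction`, `plus_direction`) are hypothetical and exist only for this illustration.

```python
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK


def strong_direction(ch):
    """Return 'ltr' or 'rtl' for a strongly directional character, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "ltr"
    if bidi_class in ("R", "AL"):
        return "rtl"
    return None  # punctuation, spaces, etc. have no strong direction


def run_direction(text, start, end, paragraph_dir):
    """Simplified rule: a non-strong run text[start:end] takes the direction
    of its strong neighbours if they agree, else the paragraph direction."""
    def nearest(chars):
        for ch in chars:
            direction = strong_direction(ch)
            if direction is not None:
                return direction
        return paragraph_dir  # no strong character found on this side

    before = nearest(reversed(text[:start]))
    after = nearest(text[end:])
    return before if before == after else paragraph_dir


def plus_direction(text):
    """Direction assigned to the '++' run in a right-to-left paragraph."""
    start = text.index("++")
    return run_direction(text, start, start + 2, "rtl")


plain = "لغة C++ هي لغة برمجة"
with_lrm = plain.replace("C++", "C++" + LRM)

print(plus_direction(plain))     # rtl
print(plus_direction(with_lrm))  # ltr
```

Without the LRM, the "++" has a left-to-right neighbour on one side and a right-to-left neighbour on the other, so it falls back to the paragraph direction. With the LRM, both neighbours are left-to-right, so the "++" stays attached to the "C".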

Some software requires the HTML character references &#8206; or &lrm; instead of the invisible Unicode control character itself. Entering the invisible control character directly can also make copy editing difficult, because an editor cannot see where it is.
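
One way to ease copy editing is to convert between the invisible character and its visible HTML reference. The sketch below assumes Python's standard `html` module; the function name `make_lrm_visible` is hypothetical and used only for this example.

```python
import html

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK


def make_lrm_visible(text):
    """Replace each invisible LRM character with the visible reference &lrm;."""
    return text.replace(LRM, "&lrm;")


# Both HTML character references decode to the same invisible character.
decoded_named = html.unescape("&lrm;")
decoded_numeric = html.unescape("&#8206;")

visible = make_lrm_visible("C++" + LRM)

print(decoded_named == LRM)    # True
print(decoded_numeric == LRM)  # True
print(visible)                 # C++&lrm;
```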

See also