
Left-to-right mark: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
No edit summary
Tags: Reverted Visual edit
m Reverted edits by 142.116.187.97 (talk) to last version by 49.198.51.54
Line 1: Line 1:
{{short description|Bidirectional control character}}
{{short description|Bidirectional control character}}
{{More citations needed|date=January 2019}}
{{More citations needed|date=January 2019}}
The '''left-to-right mark''' ('''LRM''') is a [[control character]] (an invisible formatting character) used in computerized [[typesetting]] (including [[word processor|word processing]] in a program like [[Microsoft Word]]). of text containing a mix of left-to-right scripts (such as [[Latin script|Latin]] and [[Cyrillic script|Cyrillic]]) and right-to-left scripts (such as [[Arabic script|Arabic]], [[Syriac alphabet|Syriac]], and [[Hebrew alphabet|Hebrew]]). It is used to set the way adjacent characters are grouped with respect to text direction.
The '''left-to-right mark''' ('''LRM''') is a [[control character]] (an invisible formatting character) used in computerized [[typesetting]] (including [[word processor|word processing]] in a program like [[Microsoft Word]]) of text containing a mix of left-to-right scripts (such as [[Latin script|Latin]] and [[Cyrillic script|Cyrillic]]) and right-to-left scripts (such as [[Arabic script|Arabic]], [[Syriac alphabet|Syriac]], and [[Hebrew alphabet|Hebrew]]). It is used to set the way adjacent characters are grouped with respect to text direction.


== Unicode ==
== Unicode ==
Line 8: Line 8:


==Example of use in HTML==
==Example of use in HTML==
Suppose the writer wishes to use some English text (a left-to-right script) in a paragraph written in Arabic or Hebrew (a right-to-left script) that includes symbol characters to the right of an alphabetic letter. For example, the Arabic for “The language C++ is a programming language used... contains the English character string “C++”. Without an LRM control character, it looks like this:
Suppose the writer wishes to use some English text (a left-to-right script) into a paragraph written in Arabic or Hebrew (a right-to-left script) with non-alphabetic characters to the right of the English text. For example, the writer wants to translate, "The language C++ is a programming language used..." into Arabic. Without an LRM control character, the result looks like this:


<span dir="rtl">لغة C<span style="color:red">++</span> هي لغة برمجة تستخدم...</span>
<span dir="rtl">لغة C<span style="color:red">++</span> هي لغة برمجة تستخدم...</span>

Revision as of 15:27, 4 November 2022

The left-to-right mark (LRM) is a control character (an invisible formatting character) used in computerized typesetting (including word processing in a program like Microsoft Word) of text containing a mix of left-to-right scripts (such as Latin and Cyrillic) and right-to-left scripts (such as Arabic, Syriac, and Hebrew). It is used to set the way adjacent characters are grouped with respect to text direction.

Unicode

In Unicode, the LRM character is encoded at U+200E LEFT-TO-RIGHT MARK (&lrm;). In UTF-8 it is encoded as the byte sequence E2 80 8E. Its usage is prescribed in the Unicode bidirectional (Bidi) algorithm.[1]
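As a quick check of the code point and byte sequence quoted above, the following minimal Python sketch looks the character up with the standard `unicodedata` module. The variable names are illustrative only and do not come from the article.

```python
import unicodedata

# U+200E LEFT-TO-RIGHT MARK, written as an escape so it stays visible in source.
LRM = "\u200e"

# Official Unicode character name.
lrm_name = unicodedata.name(LRM)

# Bidirectional class: "L" means a strong left-to-right character.
lrm_bidi_class = unicodedata.bidirectional(LRM)

# UTF-8 encoding, shown as space-separated upper-case hex bytes.
lrm_utf8_hex = LRM.encode("utf-8").hex(" ").upper()

print(lrm_name, lrm_bidi_class, lrm_utf8_hex)
```

The mark has no visible glyph but is classified as a strong left-to-right character, which is the property the Bidi algorithm relies on.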

Example of use in HTML

Suppose the writer wishes to insert some English text (a left-to-right script) into a paragraph written in Arabic or Hebrew (a right-to-left script), with non-alphabetic characters to the right of the English text. For example, the writer wants to translate "The language C++ is a programming language used..." into Arabic. Without an LRM control character, the result looks like this:

لغة C++ هي لغة برمجة تستخدم...

With an LRM entered in the HTML after the ++, it looks like this, as the writer intends:

لغة C++‎ هي لغة برمجة تستخدم...

In the first example, without an LRM control character, a web browser will render the ++ to the left of the "C" because the browser recognizes that the paragraph is right-to-left text (Arabic) and places punctuation, which is neutral as to its direction, according to the direction of the adjacent text. The LRM control character causes the punctuation to be adjacent only to left-to-right text – the "C" and the LRM – so that it is positioned as if it were in left-to-right text, i.e., to the right of the preceding text.
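The effect described above can be illustrated with a small Python sketch. It does not implement the Unicode Bidi algorithm; it only reports which strongly directional characters lie on either side of the "++" run, with and without a trailing LRM. The helper names `nearest_strong` and `surrounding_strong` are hypothetical, and the Arabic words are written as escapes so that the source itself is not reordered by an editor.

```python
import unicodedata

LRM = "\u200e"               # U+200E LEFT-TO-RIGHT MARK
STRONG = {"L", "R", "AL"}    # strongly directional bidi classes


def nearest_strong(text, start, step):
    """Walk from index `start` in direction `step` (+1 or -1) and return
    the bidi class of the first strongly directional character, or None."""
    i = start
    while 0 <= i < len(text):
        bidi_class = unicodedata.bidirectional(text[i])
        if bidi_class in STRONG:
            return bidi_class
        i += step
    return None


def surrounding_strong(text, run):
    """Return the strong bidi classes just before and just after `run`."""
    begin = text.index(run)
    end = begin + len(run)
    return nearest_strong(text, begin - 1, -1), nearest_strong(text, end, +1)


# First and following words of the Arabic sentence from the example.
ARABIC_PREFIX = "\u0644\u063a\u0629 "
ARABIC_SUFFIX = " \u0647\u064a"

without_lrm = ARABIC_PREFIX + "C++" + ARABIC_SUFFIX
with_lrm = ARABIC_PREFIX + "C++" + LRM + ARABIC_SUFFIX

print(surrounding_strong(without_lrm, "++"))  # ('L', 'AL')
print(surrounding_strong(with_lrm, "++"))     # ('L', 'L')
```

Without the mark, the "++" sits between a left-to-right letter and an Arabic letter, so its placement follows the right-to-left paragraph. With the mark, both neighbours are left-to-right, which corresponds to the intended rendering.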

Some software requires using the HTML code &#8206; or &lrm; instead of the invisible Unicode control character itself.[citation needed] Using the invisible control character directly can also make copy editing difficult, because the character cannot be seen in the source text.
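A minimal sketch of how the two HTML forms relate to the raw character, using Python's standard `html` module. The helper `show_lrm` is a hypothetical name for illustration; it only makes the mark visible and does not escape any other HTML-significant characters.

```python
import html

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK


def show_lrm(text):
    """Replace each raw LRM with the visible named entity '&lrm;'."""
    return text.replace(LRM, "&lrm;")


# Both the named and the decimal numeric reference decode to U+200E.
from_named = html.unescape("&lrm;")
from_numeric = html.unescape("&#8206;")

# Making the invisible character visible for copy editing.
visible = show_lrm("C++" + LRM)

print(from_named == LRM, from_numeric == LRM, visible)
```

Writing the entity rather than the raw character keeps the mark visible in the source, at the cost of a few extra bytes.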

See also

References