This repository was archived by the owner on Jan 31, 2023. It is now read-only.

Commit 978e17c

Also link to Unicode Standard
1 parent 400ac75 commit 978e17c

1 file changed: +1 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ Encoding | Immediate value | Encoding unit
 [WTF-8](https://simonsapin.github.io/wtf-8/) | 0x0 | u8 / 8 bits
 [WTF-16](https://simonsapin.github.io/wtf-8/#wtf-16) | 0x1 | u16 / 16 bits

-The WTF family of encodings has been chosen over the respective UTF family of encodings because it is more lenient, i.e. does not introduce trapping behavior but defers sanitization to modules and APIs requiring it. JavaScript and most of its related APIs are effectively designed for WTF-16, not UTF-16, for example. Or as the Unicode Standard, Version 13.0 states in section 2.7 Unicode Strings:
+The WTF family of encodings has been chosen over the respective UTF family of encodings because it is more lenient, i.e. does not introduce trapping behavior but defers sanitization to modules and APIs requiring it. JavaScript and most of its related APIs are effectively designed for WTF-16, not UTF-16, for example. Or as [The Unicode Standard, Version 13.0 – Core Specification](http://www.unicode.org/versions/Unicode13.0.0/ch02.pdf) states in section 2.7 Unicode Strings:

 > Depending on the programming environment, a Unicode string may or may not be required to be in the corresponding Unicode encoding form. For example, strings in Java, C#, or ECMAScript are Unicode 16-bit strings, but are not necessarily well-formed UTF16 sequences. In normal processing, it can be far more efficient to allow such strings to contain code unit sequences that are not well-formed UTF-16—that is, isolated surrogates. Because strings are such a fundamental component of every program, checking for isolated surrogates in every operation that modifies strings can create significant overhead, especially because supplementary characters are extremely rare as a percentage of overall text in programs worldwide.
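The difference between trapping and lenient behavior described in the changed paragraph can be illustrated with a small sketch. It is not part of the README or the proposal. The function names `has_isolated_surrogate`, `encode_strict`, and `encode_lenient` are hypothetical, and Python's `surrogatepass` error handler is used only as a stand-in: for a lone surrogate it yields the same three bytes that WTF-8 specifies, but it is not a complete WTF-8 implementation.

```python
def has_isolated_surrogate(units):
    """Return True if a sequence of 16-bit code units contains an unpaired surrogate."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate is well-formed only if a low surrogate follows it.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return True
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate without a preceding high surrogate is isolated.
            return True
        i += 1
    return False


def encode_strict(text):
    """UTF-8 style: raises UnicodeEncodeError on a lone surrogate (the 'trapping' case)."""
    return text.encode("utf-8")


def encode_lenient(text):
    """WTF-8 style stand-in: lets a lone surrogate through as a 3-byte sequence."""
    return text.encode("utf-8", errors="surrogatepass")
```

Under these assumptions, a consumer that needs well-formed text would call something like `has_isolated_surrogate` once at its own boundary, instead of every string operation paying for the check.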
