Commit d2f2a3b

Merge pull request MicrosoftDocs#3532 from corob-msft/docs/corob/issue-3100
Fix MicrosoftDocs/cpp-docs/issues/3100 `char` neither signed nor unsigned
2 parents 5366ab6 + 549d3e8 commit d2f2a3b

2 files changed: +11 -9 lines changed

Lines changed: 10 additions & 8 deletions
@@ -1,12 +1,12 @@
 ---
-description: "Learn more about: char, wchar_t, char16_t, char32_t"
-title: "char, wchar_t, char16_t, char32_t"
-ms.date: "02/14/2018"
+description: "Learn more about: char, wchar_t, char8_t, char16_t, char32_t"
+title: "char, wchar_t, char8_t, char16_t, char32_t"
+ms.date: 04/23/2021
 ms.assetid: 6b33e9f5-455b-4e49-8f12-a150cbfe2e5b
 ---
-# char, wchar_t, char16_t, char32_t
+# char, wchar_t, char8_t, char16_t, char32_t
 
-The types **`char`**, **`wchar_t`**, **`char16_t`** and **`char32_t`** are built-in types that represent alphanumeric characters as well as non-alphanumeric glyphs and non-printing characters.
+The types **`char`**, **`wchar_t`**, **`char8_t`**, **`char16_t`**, and **`char32_t`** are built-in types that represent alphanumeric characters, non-alphanumeric glyphs, and non-printing characters.
 
 ## Syntax
 
@@ -19,10 +19,12 @@ char32_t ch4{ U'a' };
 
 ## Remarks
 
-The **`char`** type was the original character type in C and C++. The type **`unsigned char`** is often used to represent a *byte*, which is not a built-in type in C++. The **`char`** type can be used to store characters from the ASCII character set or any of the ISO-8859 character sets, and individual bytes of multi-byte characters such as Shift-JIS or the UTF-8 encoding of the Unicode character set. Strings of **`char`** type are referred to as *narrow* strings, even when used to encode multi-byte characters. In the Microsoft compiler, **`char`** is an 8-bit type.
+The **`char`** type was the original character type in C and C++. The **`char`** type can be used to store characters from the ASCII character set or any of the ISO-8859 character sets, and individual bytes of multi-byte characters such as Shift-JIS or the UTF-8 encoding of the Unicode character set. In the Microsoft compiler, **`char`** is an 8-bit type. It's a distinct type from both **`signed char`** and **`unsigned char`**. By default, variables of type **`char`** get promoted to **`int`** as if from type **`signed char`** unless the [`/J`](../build/reference/j-default-char-type-is-unsigned.md) compiler option is used. Under **`/J`**, they're treated as type **`unsigned char`** and get promoted to **`int`** without sign extension.
+
+The type **`unsigned char`** is often used to represent a *byte*, which isn't a built-in type in C++.
 
 The **`wchar_t`** type is an implementation-defined wide character type. In the Microsoft compiler, it represents a 16-bit wide character used to store Unicode encoded as UTF-16LE, the native character type on Windows operating systems. The wide character versions of the Universal C Runtime (UCRT) library functions use **`wchar_t`** and its pointer and array types as parameters and return values, as do the wide character versions of the native Windows API.
 
-The **`char16_t`** and **`char32_t`** types represent 16-bit and 32-bit wide characters, respectively. Unicode encoded as UTF-16 can be stored in the **`char16_t`** type, and Unicode encoded as UTF-32 can be stored in the **`char32_t`** type. Strings of these types and **`wchar_t`** are all referred to as *wide* strings, though the term often refers specifically to strings of **`wchar_t`** type.
+The **`char8_t`**, **`char16_t`**, and **`char32_t`** types represent 8-bit, 16-bit, and 32-bit wide characters, respectively. (**`char8_t`** is new in C++20 and requires the [`/std:c++latest`](../build/reference/std-specify-language-standard-version.md) compiler option.) Unicode encoded as UTF-8 can be stored in the **`char8_t`** type. Strings of **`char8_t`** and **`char`** type are referred to as *narrow* strings, even when used to encode Unicode or multi-byte characters. Unicode encoded as UTF-16 can be stored in the **`char16_t`** type, and Unicode encoded as UTF-32 can be stored in the **`char32_t`** type. Strings of these types and **`wchar_t`** are all referred to as *wide* strings, though the term often refers specifically to strings of **`wchar_t`** type.
 
-In the C++ standard library, the `basic_string` type is specialized for both narrow and wide strings. Use `std::string` when the characters are of type **`char`**, `std::u16string` when the characters are of type **`char16_t`**, `std::u32string` when the characters are of type **`char32_t`**, and `std::wstring` when the characters are of type **`wchar_t`**. Other types that represent text, including `std::stringstream` and `std::cout` have specializations for narrow and wide strings.
+In the C++ standard library, the `basic_string` type is specialized for both narrow and wide strings. Use `std::string` when the characters are of type **`char`**, `std::u8string` when the characters are of type **`char8_t`**, `std::u16string` when the characters are of type **`char16_t`**, `std::u32string` when the characters are of type **`char32_t`**, and `std::wstring` when the characters are of type **`wchar_t`**. Other types that represent text, including `std::stringstream` and `std::cout` have specializations for narrow and wide strings.
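
The new Remarks text makes two claims that can be checked at compile time: `char` is a distinct type from `signed char` and `unsigned char`, and each standard string alias is a `basic_string` over one character type. The following is a minimal sketch of such a check and is not part of this commit. The helper names `promote` and `as_byte` are hypothetical, the byte range in the comments assumes an 8-bit `char`, and the `char8_t` check is guarded by the `__cpp_char8_t` feature-test macro so the code still compiles before C++20.

```cpp
#include <string>
#include <type_traits>

// char is its own type: it is never the same type as signed char or
// unsigned char, whichever signedness the implementation gives it.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");

// Hypothetical helper: integral promotion of a plain char to int.
// For bytes above 0x7F the result depends on whether char is signed
// (the default) or unsigned (for example, under /J).
constexpr int promote(char c) { return c; }

// Hypothetical helper: convert to unsigned char first, so the result is
// the byte value (0..255 for an 8-bit char) regardless of signedness.
constexpr int as_byte(char c) { return static_cast<unsigned char>(c); }

// Each standard string alias is basic_string over one character type.
static_assert(std::is_same<std::string::value_type, char>::value,
              "std::string stores char");
static_assert(std::is_same<std::wstring::value_type, wchar_t>::value,
              "std::wstring stores wchar_t");
static_assert(std::is_same<std::u16string::value_type, char16_t>::value,
              "std::u16string stores char16_t");
static_assert(std::is_same<std::u32string::value_type, char32_t>::value,
              "std::u32string stores char32_t");

#ifdef __cpp_char8_t
// Only checked when the compiler provides char8_t (C++20).
static_assert(std::is_same<std::u8string::value_type, char8_t>::value,
              "std::u8string stores char8_t");
#endif
```

Calling `promote('\xE9')` would be expected to give a negative value where `char` is signed and 233 where it is unsigned, while `as_byte('\xE9')` gives 233 in both cases. That difference is why code that indexes tables by byte value usually converts through `unsigned char` first.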

docs/cpp/toc.yml

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@
 href: ../cpp/false-cpp.md
 - name: true
 href: ../cpp/true-cpp.md
-- name: char, wchar_t, char16_t, char32_t
+- name: char, wchar_t, char8_t, char16_t, char32_t
 href: ../cpp/char-wchar-t-char16-t-char32-t.md
 - name: __int8, __int16, __int32, __int64
 href: ../cpp/int8-int16-int32-int64.md
