diff --git a/book/en-us/01-intro.md b/book/en-us/01-intro.md index 0b203dda..03ecd88c 100644 --- a/book/en-us/01-intro.md +++ b/book/en-us/01-intro.md @@ -43,7 +43,7 @@ Before learning modern C++, let's take a look at the main features that have dep - **C language style type conversion is deprecated (ie using `(convert_type)`) before variables, and `static_cast`, `reinterpret_cast`, `const_cast` should be used for type conversion.** -- **In particular, some of the C standard libraries that can be used are deprecated in the latest C++17 standard, such as ``, ``, `` and `` Wait** +- **In particular, some of the C standard libraries that can be used are deprecated in the latest C++17 standard, such as ``, ``, `` and `` etc.** - ... and many more diff --git a/book/en-us/02-usability.md b/book/en-us/02-usability.md index ee908378..eb8a21aa 100644 --- a/book/en-us/02-usability.md +++ b/book/en-us/02-usability.md @@ -18,20 +18,21 @@ which refers to the language behavior that occurred before the runtime. ### nullptr -The purpose of `nullptr` appears to replace `NULL`. In a sense, -traditional C++ treats `NULL` and `0` as the same thing, -depending on how the compiler defines NULL, -and some compilers define NULL as `((void*)0)` Some will define it directly as `0`. +The purpose of `nullptr` is to replace `NULL`. There are **null pointer constants** in the C and C++ languages, +which can be implicitly converted to the null pointer value of any pointer type, +or the null member pointer value of any pointer-to-member type in C++. +`NULL` is provided by the standard library implementation and defined as an implementation-defined null pointer constant. +In C, some standard libraries define `NULL` as `((void*)0)` and some define it as `0`. -C++ **does not allow** to implicitly convert `void *` to other types. 
-But if the compiler tries to define `NULL` as `((void*)0)`, then in the following code: +C++ **does not allow** implicit conversion of `void *` to other pointer types, and thus `((void*)0)` is not a valid implementation +of `NULL`. If the standard library tries to define `NULL` as `((void*)0)`, then a compilation error would occur in the following code: ```cpp char *ch = NULL; ``` C++ without the `void *` implicit conversion has to define `NULL` as `0`. -This still creates a new problem. Defining `NULL` to 0 will cause the overloading feature in `C++` to be confusing. +This still creates a new problem. Defining `NULL` as `0` will cause the overloading feature in C++ to be confusing. Consider the following two `foo` functions: ```cpp @@ -41,7 +42,7 @@ void foo(int); Then the `foo(NULL);` statement will call `foo(int)`, which will cause the code to be counterintuitive. -To solve this problem, C++11 introduced the `nullptr` keyword, which is specifically used to distinguish null pointers, 0. The type of `nullptr` is `nullptr_t`, which can be implicitly converted to any pointer or member pointer type, and can be compared equally or unequally with them. +To solve this problem, C++11 introduced the `nullptr` keyword, which is specifically used to distinguish null pointers from `0`. The type of `nullptr` is `nullptr_t`, which can be implicitly converted to any pointer or member pointer type, and can be compared for equality or inequality with them. You can try to compile the following code using clang++: @@ -153,7 +154,7 @@ actually returns a constant at runtime, which causes illegal production. C++11 provides `constexpr` to let the user explicitly declare that the function or object constructor will become a constant expression at compile time. This keyword explicitly tells the compiler that it should verify that `len_foo` -should be a compile-time constant expression. Constant expression. +should be a compile-time constant expression. 
In addition, the function of `constexpr` can use recursion: @@ -214,14 +215,15 @@ int main() { } // should output: 1, 4, 3, 4. can be simplified using `auto` - for (std::vector::iterator element = vec.begin(); element != vec.end(); ++element) + for (std::vector::iterator element = vec.begin(); element != vec.end(); + ++element) std::cout << *element << std::endl; } ``` In the above code, we can see that the `itr` variable is defined in the scope of the entire `main()`, which causes us to rename the other when a variable need to traverse -the entire `std::vectors` again. C++17 eliminates this limitation so that +the entire `std::vector` again. C++17 eliminates this limitation so that we can do this in if(or switch): ```cpp @@ -285,6 +287,8 @@ such as: ```cpp #include #include +#include + class MagicFoo { public: std::vector vec; @@ -299,7 +303,8 @@ int main() { MagicFoo magicFoo = {1, 2, 3, 4, 5}; std::cout << "magicFoo: "; - for (std::vector::iterator it = magicFoo.vec.begin(); it != magicFoo.vec.end(); ++it) + for (std::vector::iterator it = magicFoo.vec.begin(); + it != magicFoo.vec.end(); ++it) std::cout << *it << std::endl; } ``` @@ -374,8 +379,8 @@ One of the most common and notable examples of type derivation using `auto` is t ```cpp // before C++11 // cbegin() returns vector::const_iterator -// and therefore itr is type vector::const_iterator -for(vector::const_iterator it = vec.cbegin(); itr != vec.cend(); ++it) +// and therefore it is type vector::const_iterator +for(vector::const_iterator it = vec.cbegin(); it != vec.cend(); ++it) ``` When we have `auto`: @@ -413,17 +418,23 @@ auto i = 5; // i as int auto arr = new auto(10); // arr as int * ``` -Since C++ 20, `auto` can even be used as function arguments. Consider -the following example: +Since C++ 14, `auto` can even be used as function arguments in generic lambda expressions, +and such functionality is generalized to normal functions in C++ 20. 
+Consider the following example: ```cpp -int add(auto x, auto y) { +auto add14 = [](auto x, auto y) -> int { + return x+y; +}; + +int add20(auto x, auto y) { return x+y; } auto i = 5; // type int auto j = 6; // type int -std::cout << add(i, j) << std::endl; +std::cout << add14(i, j) << std::endl; +std::cout << add20(i, j) << std::endl; ``` > **Note**: `auto` cannot be used to derive array types yet: @@ -476,7 +487,7 @@ type z == type x ### tail type inference -You may think that when we introduce `auto`, we have already mentioned that `auto` cannot be used for function arguments for type derivation. Can `auto` be used to derive the return type of a function? Still consider an example of an add function, which we have to write in traditional C++: +You may wonder whether `auto` can be used to deduce the return type of a function. Consider again an example of an add function, which we have to write in traditional C++: ```cpp template @@ -544,7 +555,7 @@ std::cout << "q: " << q << std::endl; > To understand it you need to know the concept of parameter forwarding > in C++, which we will cover in detail in the -> [Language Runtime Hardening](./03-runtime.md) chapter, +> [Language Runtime Enhancements](./03-runtime.md) chapter, > and you can come back to the contents of this section later. 
In simple terms, `decltype(auto)` is mainly used to derive @@ -967,7 +978,7 @@ C++11 introduces the two keywords `override` and `final` to prevent this from ha ### override -When overriding a virtual function, introducing the `override` keyword will explicitly tell the compiler to overload, and the compiler will check if the base function has such a virtual function, otherwise it will not compile: +When overriding a virtual function, introducing the `override` keyword explicitly tells the compiler that the function is an override, and the compiler will check whether the base class has such a virtual function with a consistent signature; otherwise the code will not compile: ```cpp struct Base { diff --git a/book/en-us/03-runtime.md b/book/en-us/03-runtime.md index d75f1bc4..8ecd2956 100644 --- a/book/en-us/03-runtime.md +++ b/book/en-us/03-runtime.md @@ -87,8 +87,8 @@ capture lists can be: - \[\] empty capture list - \[name1, name2, ...\] captures a series of variables -- \[&\] reference capture, let the compiler deduce the reference list by itself -- \[=\] value capture, let the compiler deduce the value list by itself +- \[&\] reference capture, determine the reference capture list from the uses in the function body +- \[=\] value capture, determine the value capture list from the uses in the function body #### 4. Expression capture @@ -126,13 +126,9 @@ initialize it in the expression. In the previous section, we mentioned that the `auto` keyword cannot be used in the parameter list because it would conflict with the functionality of the template. -But Lambda expressions are not ordinary functions, so Lambda expressions are not templated. -This has caused us some trouble: the parameter table cannot be generalized, -and the parameter table type must be clarified. - -Fortunately, this trouble only exists in C++11, starting with C++14. 
-The formal parameters of the Lambda function can use the `auto` keyword -to generate generic meanings: +But lambda expressions are not regular functions: without further specification of the parameter types, they cannot utilize templates. Fortunately, this trouble +only exists in C++11. Starting with C++14, the formal parameters of a lambda expression +can use the `auto` keyword to utilize template generics: ```cpp void lambda_generic() { @@ -221,8 +217,8 @@ int foo(int a, int b, int c) { ; } int main() { - // bind parameter 1, 2 on function foo, and use std::placeholders::_1 as placeholder - // for the first parameter. + // bind parameter 1, 2 on function foo, + // and use std::placeholders::_1 as placeholder for the first parameter. auto bindFoo = std::bind(foo, std::placeholders::_1, 1,2); // when call bindFoo, we only need one param left bindFoo(1); @@ -263,22 +259,34 @@ Temporary variables returned by non-references, temporary variables generated by operation expressions, original literals, and Lambda expressions are all pure rvalue values. -Note that a string literal became rvalue in a class, and remains an lvalue in other cases (e.g., in a function): +Note that a literal (except a string literal) is a prvalue. However, a string +literal is an lvalue whose type is an array of `const char`. Consider the following examples: ```cpp -class Foo { - const char*&& right = "this is a rvalue"; -public: - void bar() { - right = "still rvalue"; // the string literal is a rvalue - } -}; +#include <type_traits> int main() { - const char* const &left = "this is an lvalue"; // the string literal is an lvalue + // Correct. The type of "01234" is const char [6], so it is an lvalue + const char (&left)[6] = "01234"; + + // Assert success. It is a const char [6] indeed. Note that decltype(expr) + // yields lvalue reference if expr is an lvalue and neither an unparenthesized + // id-expression nor an unparenthesized class member access expression. 
+ static_assert(std::is_same<decltype("01234"), const char(&)[6]>::value, ""); + + // Error. "01234" is an lvalue, which cannot be referenced by an rvalue reference + // const char (&&right)[6] = "01234"; } ``` +However, an array can be implicitly converted to a corresponding pointer. The result, if not an lvalue reference, is an rvalue (xvalue if the result is an rvalue reference, prvalue otherwise): + +```cpp +const char* p = "01234"; // Correct. "01234" is implicitly converted to const char* +const char*&& pr = "01234"; // Correct. "01234" is implicitly converted to const char*, which is a prvalue. +// const char*& pl = "01234"; // Error. There is no lvalue of type const char* +``` + **xvalue, expiring value** is the concept proposed by C++11 to introduce rvalue references (so in traditional C++, pure rvalue and rvalue are the same concepts), a value that is destroyed but can be moved. @@ -339,11 +347,12 @@ void reference(std::string&& str) { int main() { std::string lv1 = "string,"; // lv1 is a lvalue - // std::string&& r1 = s1; // illegal, rvalue can't ref to lvalue + // std::string&& r1 = lv1; // illegal, an rvalue reference can't bind to an lvalue std::string&& rv1 = std::move(lv1); // legal, std::move can convert lvalue to rvalue std::cout << rv1 << std::endl; // string, - const std::string& lv2 = lv1 + lv1; // legal, const lvalue reference can extend temp variable's lifecycle + const std::string& lv2 = lv1 + lv1; // legal, const lvalue reference can + // extend temp variable's lifecycle // lv2 += "Test"; // illegal, const ref can't be modified std::cout << lv2 << std::endl; // string,string, @@ -374,7 +383,7 @@ int main() { } ``` -The first question, why not allow non-linear references to bind to non-lvalues? +The first question: why not allow non-constant references to bind to non-lvalues? This is because there is a logic error in this approach: ```cpp @@ -470,8 +479,9 @@ int main() { // "str: Hello world." 
std::cout << "str: " << str << std::endl; - // use push_back(const T&&), no copy - // the string will be moved to vector, and therefore std::move can reduce copy cost + // use push_back(T&&): no copy, + // the string will be moved into the vector, + // and therefore std::move can reduce copy cost v.push_back(std::move(str)); // str is empty now std::cout << "str: " << str << std::endl; @@ -529,7 +539,7 @@ both lvalue and rvalue. But follow the rules below: | T&& | rvalue ref | T&& | Therefore, the use of `T&&` in a template function may not be able to make an rvalue reference, and when a lvalue is passed, a reference to this function will be derived as an lvalue. -More precisely, ** no matter what type of reference the template parameter is, the template parameter can be derived as a right reference type** if and only if the argument type is a right reference. +More precisely, **no matter what type of reference the template parameter is, the template parameter can be derived as an rvalue reference type** if and only if the argument type is an rvalue reference. This makes `v` successful delivery of lvalues. Perfect forwarding is based on the above rules. The so-called perfect forwarding is to let us pass the parameters, @@ -586,7 +596,7 @@ static_cast param passing: lvalue reference Regardless of whether the pass parameter is an lvalue or an rvalue, the normal pass argument will forward the argument as an lvalue. So `std::move` will always accept an lvalue, which forwards the call to `reference(int&&)` to output the rvalue reference. -Only `std::forward` does not cause any extra copies and ** perfectly forwards ** (passes) the arguments of the function to other functions that are called internally. +Only `std::forward` does not cause any extra copies and **perfectly forwards** (passes) the arguments of the function to other functions that are called internally. `std::forward` is the same as `std::move`, and nothing is done. 
`std::move` simply converts the lvalue to the rvalue. `std::forward` is just a simple conversion of the parameters. From the point of view of the phenomenon, diff --git a/book/en-us/05-pointers.md b/book/en-us/05-pointers.md index 38c167cc..dab6df95 100644 --- a/book/en-us/05-pointers.md +++ b/book/en-us/05-pointers.md @@ -57,27 +57,27 @@ And see the reference count of an object by `use_count()`. E.g: auto pointer = std::make_shared(10); auto pointer2 = pointer; // reference count+1 auto pointer3 = pointer; // reference count+1 -int *p = pointer.get(); // no increase of reference count +int *p = pointer.get(); // no increase of reference count -std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 3 +std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 3 std::cout << "pointer2.use_count() = " << pointer2.use_count() << std::endl; // 3 std::cout << "pointer3.use_count() = " << pointer3.use_count() << std::endl; // 3 pointer2.reset(); std::cout << "reset pointer2:" << std::endl; -std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 2 +std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 2 std::cout << "pointer2.use_count() = " - << pointer2.use_count() << std::endl; // 0, pointer2 has reset + << pointer2.use_count() << std::endl; // pointer2 has reset, 0 std::cout << "pointer3.use_count() = " << pointer3.use_count() << std::endl; // 2 pointer3.reset(); std::cout << "reset pointer3:" << std::endl; -std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 1 +std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 1 std::cout << "pointer2.use_count() = " << pointer2.use_count() << std::endl; // 0 std::cout << "pointer3.use_count() = " - << pointer3.use_count() << std::endl; // 0, pointer3 has reset + << pointer3.use_count() << std::endl; // pointer3 has reset, 0 ``` ## 5.3 `std::unique_ptr` @@ -85,7 +85,7 @@ std::cout << 
"pointer3.use_count() = " `std::unique_ptr` is an exclusive smart pointer that prohibits other smart pointers from sharing the same object, thus keeping the code safe: ```cpp -std::unique_ptr pointer = std::make_unique(10); // make_unique was introduced in C++14 +std::unique_ptr pointer = std::make_unique(10); // make_unique, from C++14 std::unique_ptr pointer2 = pointer; // illegal ``` diff --git a/book/en-us/06-regex.md b/book/en-us/06-regex.md index dd2963fb..e8b98547 100644 --- a/book/en-us/06-regex.md +++ b/book/en-us/06-regex.md @@ -34,32 +34,32 @@ and lowercase letters, all numbers, all punctuation, and some other symbols. A special character is a character with special meaning in a regular expression and is also the core matching syntax of a regular expression. See the table below: -| Special characters | Description | -| :----------------: | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `$` | Matches the end position of the input string. | -| `(`,`)` | Marks the start and end of a subexpression. Subexpressions can be obtained for later use. | -| `*` | Matches the previous subexpression zero or more times. | -| `+` | Matches the previous subexpression one or more times. | -| `.` | Matches any single character except the newline character `\n`. | -| `[` | Marks the beginning of a bracket expression. | -| `?` | Matches the previous subexpression zero or one time, or indicates a non-greedy qualifier. | -| `\` | Marks the next character as either a special character, or a literal character, or a backward reference, or an octal escape character. For example, `n` Matches the character `n`. `\n` matches newline characters. The sequence `\\` Matches the `'\'` character, while `\(` matches the `'('` character. 
| -| `^` | Matches the beginning of the input string, unless it is used in a square bracket expression, at which point it indicates that the set of characters is not accepted. | -| `{` | Marks the beginning of a qualifier expression. | -| `\|` | Indicates a choice between the two. | +| Symbol | Description | +|:----------------:|:---| +| `$` | Matches the end position of the input string.| +| `(`,`)` | Marks the start and end of a subexpression. Subexpressions can be obtained for later use.| +| `*` | Matches the previous subexpression zero or more times. | +| `+` | Matches the previous subexpression one or more times.| +| `.` | Matches any single character except the newline character `\n`.| +| `[` | Marks the beginning of a bracket expression.| +| `?` | Matches the previous subexpression zero or one time, or indicates a non-greedy qualifier.| +| `\` | Marks the next character as either a special character, or a literal character, or a backward reference, or an octal escape character. For example, `n` Matches the character `n`. `\n` matches newline characters. The sequence `\\` Matches the `'\'` character, while `\(` matches the `'('` character. | +| `^` | Matches the beginning of the input string, unless it is used in a square bracket expression, at which point it indicates that the set of characters is not accepted.| +| `{` | Marks the beginning of a qualifier expression.| +| `\|` | Indicates a choice between the two.| ### Quantifiers The qualifier is used to specify how many times a given component of a regular expression must appear to satisfy the match. 
See the table below: -| Character | Description | -| :-------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `*` | matches the previous subexpression zero or more times. For example, `foo*` matches `fo` and `foooo`. `*` is equivalent to `{0,}`. | -| `+` | matches the previous subexpression one or more times. For example, `foo+` matches `foo` and `foooo` but does not match `fo`. `+` is equivalent to `{1,}`. | -| `?` | matches the previous subexpression zero or one time. For example, `Your(s)?` can match `Your` in `Your` or `Yours`. `?` is equivalent to `{0,1}`. | -| `{n}` | `n` is a non-negative integer. Matches the determined `n` times. For example, `o{2}` cannot match `o` in `for`, but can match two `o` in `foo`. | -| `{n,}` | `n` is a non-negative integer. Match at least `n` times. For example, `o{2,}` cannot match `o` in `for`, but matches all `o` in `foooooo`. `o{1,}` is equivalent to `o+`. `o{0,}` is equivalent to `o*`. | -| `{n,m}` | `m` and `n` are non-negative integers, where `n` is less than or equal to `m`. Matches at least `n` times and matches up to `m` times. For example, `o{1,3}` will match the first three `o` in `foooooo`. `o{0,1}` is equivalent to `o?`. Note that there can be no spaces between the comma and the two numbers. | +| Symbol | Description | +|:-------:|:-----| +| `*` | matches the previous subexpression zero or more times. For example, `foo*` matches `fo` and `foooo`. `*` is equivalent to `{0,}`.| +| `+` | matches the previous subexpression one or more times. For example, `foo+` matches `foo` and `foooo` but does not match `fo`. `+` is equivalent to `{1,}`.| +| `?` | matches the previous subexpression zero or one time. 
For example, `Your(s)?` can match `Your` in `Your` or `Yours`. `?` is equivalent to `{0,1}`.| +| `{n}` | `n` is a non-negative integer. Matches the determined `n` times. For example, `o{2}` cannot match `o` in `for`, but can match two `o` in `foo`.| +| `{n,}` | `n` is a non-negative integer. Match at least `n` times. For example, `o{2,}` cannot match `o` in `for`, but matches all `o` in `foooooo`. `o{1,}` is equivalent to `o+`. `o{0,}` is equivalent to `o*`.| +| `{n,m}` | `m` and `n` are non-negative integers, where `n` is less than or equal to `m`. Matches at least `n` times and matches up to `m` times. For example, `o{1,3}` will match the first three `o` in `foooooo`. `o{0,1}` is equivalent to `o?`. Note that there can be no spaces between the comma and the two numbers. | With these two tables, we can usually read almost all regular expressions. @@ -84,8 +84,9 @@ We use a simple example to briefly introduce the use of this library. Consider t int main() { std::string fnames[] = {"foo.txt", "bar.txt", "test", "a0.txt", "AAA.txt"}; - // In C++, `\` will be used as an escape character in the string. In order for `\.` - // to be passed as a regular expression, it is necessary to perform second escaping of `\`, thus we have `\\.` + // In C++, `\` will be used as an escape character in the string. 
+ // In order for `\.` to be passed as a regular expression, + // it is necessary to perform second escaping of `\`, thus we have `\\.` std::regex txt_regex("[a-z]+\\.txt"); for (const auto &fname: fnames) std::cout << fname << ": " << std::regex_match(fname, txt_regex) << std::endl; @@ -104,7 +105,8 @@ std::smatch base_match; for(const auto &fname: fnames) { if (std::regex_match(fname, base_match, base_regex)) { // the first element of std::smatch matches the entire string - // the second element of std::smatch matches the first expression with brackets + // the second element of std::smatch matches the first expression + // with brackets if (base_match.size() == 2) { std::string base = base_match[1].str(); std::cout << "sub-match[0]: " << base_match[0].str() << std::endl; @@ -187,32 +189,36 @@ Please implement the member functions `start()` and `parse_request`. Enable serv template void start_server(SERVER_TYPE &server) { - // process GET request for /match/[digit+numbers], e.g. - // GET request is /match/abc123, will return abc123 - server.resource["fill_your_reg_ex"]["GET"] = [](ostream& response, Request& request) { + // process GET request for /match/[digit+numbers], + // e.g. 
GET request is /match/abc123, will return abc123 + server.resource["fill_your_reg_ex"]["GET"] = + [](ostream& response, Request& request) + { string number=request.path_match[1]; - response << "HTTP/1.1 200 OK\r\nContent-Length: " << number.length() + response << "HTTP/1.1 200 OK\r\nContent-Length: " << number.length() << "\r\n\r\n" << number; }; - // peocess default GET request; anonymous function will be called if no other matches - // response files in folder web/ + // process default GET request; + // the anonymous function will be called + // if no other resource matches; responds with files in folder web/ // default: index.html - server.default_resource["fill_your_reg_ex"]["GET"] = - [](ostream& response, Request& request) { - string filename = "www/"; - - string path = request.path_match[1]; - - // forbidden use `..` access content outside folder web/ - size_t last_pos = path.rfind("."); - size_t current_pos = 0; - size_t pos; - while((pos=path.find('.', current_pos)) != string::npos && pos != last_pos) { - current_pos = pos; - path.erase(pos, 1); - last_pos--; - } + server.default_resource["fill_your_reg_ex"]["GET"] = + [](ostream& response, Request& request) + { + string filename = "www/"; + + string path = request.path_match[1]; + + // forbid using `..` to access content outside folder web/ + size_t last_pos = path.rfind("."); + size_t current_pos = 0; + size_t pos; + while((pos=path.find('.', current_pos)) != string::npos && pos != last_pos) { + current_pos = pos; + path.erase(pos, 1); + last_pos--; + } // (...) 
}; diff --git a/book/en-us/07-thread.md b/book/en-us/07-thread.md index 8c5d0333..a9f41162 100644 --- a/book/en-us/07-thread.md +++ b/book/en-us/07-thread.md @@ -31,11 +31,11 @@ int main() { We have already learned the basics of concurrency technology in the operating system, or the database, and `mutex` is one of the cores. C++11 introduces a class related to `mutex`, with all related functions in the `` header file. 
-`std::mutex` is the most basic `mutex` class in C++11, and you can create a mutex by instantiating `std::mutex`. +`std::mutex` is the most basic mutex class in C++11, and a mutex can be created by constructing a `std::mutex` object. It can be locked by its member function `lock()`, and `unlock()` can be unlocked. But in the process of actually writing the code, it is best not to directly call the member function, Because calling member functions, you need to call `unlock()` at the exit of each critical section, and of course, exceptions. -At this time, C++11 also provides a template class `std::lock_guard` for the RAII syntax for the mutex. +At this time, C++11 also provides a template class `std::lock_guard` that applies the RAII mechanism to the mutex. RAII guarantees the exceptional security of the code while keeping the simplicity of the code. @@ -67,7 +67,9 @@ int main() { ``` Because C++ guarantees that all stack objects will be destroyed at the end of the declaration period, such code is also extremely safe. -Whether `critical_section()` returns normally or if an exception is thrown in the middle, a stack rollback is thrown, and `unlock()` is automatically called. +Whether `critical_section()` returns normally or an exception is thrown in the middle, stack unwinding takes place and `unlock()` is called automatically. + +> Note: if an exception is thrown and not caught, it is implementation-defined whether any stack unwinding is done. `std::unique_lock` is more flexible than `std::lock_guard`. Objects of `std::unique_lock` manage the locking and unlocking operations on the `mutex` object with exclusive ownership (no other `unique_lock` objects owning the ownership of a `mutex` object). So in concurrent programming, it is recommended to use `std::unique_lock`. @@ -145,7 +147,8 @@ int main() { std::cout << "waiting..."; result.wait(); // block until future has arrived // output result - std::cout << "done!" 
<< std:: endl << "future result is " << result.get() << std::endl; + std::cout << "done!" << std:: endl << "future result is " + << result.get() << std::endl; return 0; } ``` @@ -157,7 +160,7 @@ After encapsulating the target to be called, you can use `get_future()` to get a The condition variable `std::condition_variable` was born to solve the deadlock and was introduced when the mutex operation was not enough. For example, a thread may need to wait for a condition to be true to continue execution. A dead wait loop can cause all other threads to fail to enter the critical section so that when the condition is true, a deadlock occurs. -Therefore, the `condition_variable` instance is created primarily to wake up the waiting thread and avoid deadlocks. +Therefore, the `condition_variable` object is created primarily to wake up the waiting thread and avoid deadlocks. `notify_one()` of `std::condition_variable` is used to wake up a thread; `notify_all()` is to notify all threads. Below is an example of a producer and consumer model: @@ -196,7 +199,8 @@ int main() { // temporal unlock to allow producer produces more rather than // let consumer hold the lock until its consumed. lock.unlock(); - std::this_thread::sleep_for(std::chrono::milliseconds(1000)); // consumer is slower + // consumer is slower + std::this_thread::sleep_for(std::chrono::milliseconds(1000)); lock.lock(); if (!produced_nums.empty()) { std::cout << "consuming " << produced_nums.front() << std::endl; @@ -226,7 +230,7 @@ We simply can't expect multiple consumers to be able to produce content in a par ## 7.5 Atomic Operation and Memory Model Careful readers may be tempted by the fact that the example of the producer-consumer model in the previous section may have compiler optimizations that cause program errors. -For example, the boolean `notified` is not modified by `volatile`, and the compiler may have optimizations for this variable, such as the value of a register. 
+For example, the compiler may have optimizations for the variable `notified`, such as the value of a register. As a result, the consumer thread can never observe the change of this value. This is a good question. To explain this problem, we need to further discuss the concept of the memory model introduced from C++11. Let's first look at a question. What is the output of the following code? ```cpp @@ -272,8 +276,8 @@ This is a very strong set of synchronization conditions, in other words when it This seems too harsh for a variable that requires only atomic operations (no intermediate state). The research on synchronization conditions has a very long history, and we will not go into details here. Readers should understand that under the modern CPU architecture, atomic operations at the CPU instruction level are provided. -Therefore, in the C++11 multi-threaded shared variable reading and writing, the introduction of the `std::atomic` template, so that we instantiate an atomic type, will be a -Atomic type read and write operations are minimized from a set of instructions to a single CPU instruction. E.g: +Therefore, the `std::atomic` template is introduced in C++11 for the topic of multi-threaded shared variable reading and writing, which enables us to instantiate atomic types, +and minimize an atomic read or write operation from a set of instructions to a single CPU instruction. E.g: ```cpp std::atomic counter; @@ -417,8 +421,11 @@ Weakening the synchronization conditions between processes, usually we will cons ``` 3 4 4 4 // The write operation of x was quickly observed 0 3 3 4 // There is a delay in the observed time of the x write operation - 0 0 0 4 // The last read read the final value of x, but the previous changes were not observed. - 0 0 0 0 // The write operation of x is not observed in the current time period, but the situation that x is 4 can be observed at some point in the future. 
+    0 0 0 4 // The last read read the final value of x,
+            // but the previous changes were not observed.
+    0 0 0 0 // The write operation of x is not observed in the current time period,
+            // but the situation that x is 4 can be observed
+            // at some point in the future.
     ```
 ### Memory Orders
@@ -468,7 +475,7 @@ To achieve the ultimate performance and achieve consistency of various strength
     As you can see, `std::memory_order_release` ensures that a write before a release does not occur after the release operation, which is a **backward barrier**, and `std::memory_order_acquire` ensures that a subsequent read or write after a acquire does not occur before the acquire operation, which is a **forward barrier**.
     For the `std::memory_order_acq_rel` option, combines the characteristics of the two barriers and determines a unique memory barrier, such that reads and writes of the current thread will not be rearranged across the barrier.
-
+
     Let's check an example:
     ```cpp
@@ -480,9 +487,8 @@ To achieve the ultimate performance and achieve consistency of various strength
     });
     std::thread acqrel([&]() {
         int expected = 1; // must before compare_exchange_strong
-        while(!flag.compare_exchange_strong(expected, 2, std::memory_order_acq_rel)) {
+        while(!flag.compare_exchange_strong(expected, 2, std::memory_order_acq_rel))
             expected = 1; // must after compare_exchange_strong
-        }
         // flag has changed to 2
     });
     std::thread acquire([&]() {
@@ -495,7 +501,7 @@ To achieve the ultimate performance and achieve consistency of various strength
     acquire.join();
     ```
-    In this case we used `compare_exchange_strong`, which is the Compare-and-swap primitive, which has a weaker version, `compare_exchange_weak`, which allows a failure to be returned even if the exchange is successful. The reason is due to a false failure on some platforms, specifically when the CPU performs a context switch, another thread loads the same address to produce an inconsistency. In addition, the performance of `compare_exchange_strong` may be slightly worse than `compare_exchange_weak`, but in most cases, `compare_exchange_strong` should be limited.
+    In this case we used `compare_exchange_strong`, which is the Compare-and-swap primitive, which has a weaker version, `compare_exchange_weak`, which allows a failure to be returned even if the exchange is successful. The reason is due to a false failure on some platforms, specifically when the CPU performs a context switch, another thread loads the same address to produce an inconsistency. In addition, the performance of `compare_exchange_strong` may be slightly worse than `compare_exchange_weak`. However, in most cases, `compare_exchange_weak` is discouraged due to the complexity of its usage.
 4. Sequential Consistent Model: Under this model, atomic operations satisfy sequence consistency, which in turn can cause performance loss. It can be specified explicitly by `std::memory_order_seq_cst`. Let's look at a final example:
diff --git a/book/zh-cn/00-preface.md b/book/zh-cn/00-preface.md
index ec04257d..5e26f70e 100644
--- a/book/zh-cn/00-preface.md
+++ b/book/zh-cn/00-preface.md
@@ -11,7 +11,7 @@ order: 0
 ## 引言
 C++ 是一个用户群体相当大的语言。从 C++98 的出现到 C++11 的正式定稿经历了长达十年多之久的积累。C++14/17 则是作为对 C++11 的重要补充和优化,C++20 则将这门语言领进了现代化的大门,所有这些新标准中扩充的特性,给 C++ 这门语言注入了新的活力。
-那些还在坚持使用**传统 C++**(本书把 C++98 及其之前的 C++ 特性均称之为传统 C++)而未接触过现代 C++ 的 C++ 程序员在见到诸如 Lambda 表达式这类全新特性时,甚至会流露出『学的不是同一门语言』的惊叹之情。
+那些还在坚持使用**传统 C++** (本书把 C++98 及其之前的 C++ 特性均称之为传统 C++)而未接触过现代 C++ 的 C++ 程序员在见到诸如 Lambda 表达式这类全新特性时,甚至会流露出『学的不是同一门语言』的惊叹之情。
 **现代 C++** (本书中均指 C++11/14/17/20) 为传统 C++ 注入的大量特性使得整个 C++ 变得更加像一门现代化的语言。现代 C++ 不仅仅增强了 C++ 语言自身的可用性,`auto` 关键字语义的修改使得我们更加有信心来操控极度复杂的模板类型。同时还对语言运行期进行了大量的强化,Lambda 表达式的出现让 C++ 具有了『匿名函数』的『闭包』特性,而这一特性几乎在现代的编程语言(诸如 Python/Swift/...
)中已经司空见惯,右值引用的出现解决了 C++ 长期以来被人诟病的临时对象效率问题等等。 diff --git a/book/zh-cn/02-usability.md b/book/zh-cn/02-usability.md index a12bacb4..e9c84e9a 100644 --- a/book/zh-cn/02-usability.md +++ b/book/zh-cn/02-usability.md @@ -14,9 +14,9 @@ order: 2 ### nullptr -`nullptr` 出现的目的是为了替代 `NULL`。在某种意义上来说,传统 C++ 会把 `NULL`、`0` 视为同一种东西,这取决于编译器如何定义 `NULL`,有些编译器会将 `NULL` 定义为 `((void*)0)`,有些则会直接将其定义为 `0`。 +`nullptr` 出现的目的是为了替代 `NULL`。 C 与 C++ 语言中有**空指针常量**,它们能被隐式转换成任何指针类型的空指针值,或 C++ 中的任何成员指针类型的空成员指针值。 `NULL` 由标准库实现提供,并被定义为实现定义的空指针常量。在 C 中,有些标准库会把 `NULL` 定义为 `((void*)0)` 而有些将它定义为 `0`。 -C++ **不允许**直接将 `void *` 隐式转换到其他类型。但如果编译器尝试把 `NULL` 定义为 `((void*)0)`,那么在下面这句代码中: +C++ **不允许**直接将 `void *` 隐式转换到其他类型,从而 `((void*)0)` 不是 `NULL` 的合法实现。如果标准库尝试把 `NULL` 定义为 `((void*)0)`,那么下面这句代码中会出现编译错误: ```cpp char *ch = NULL; @@ -176,12 +176,13 @@ int main() { } // 将输出 1, 4, 3, 4 - for (std::vector::iterator element = vec.begin(); element != vec.end(); ++element) + for (std::vector::iterator element = vec.begin(); element != vec.end(); + ++element) std::cout << *element << std::endl; } ``` -在上面的代码中,我们可以看到 `itr` 这一变量是定义在整个 `main()` 的作用域内的,这导致当我们需要再次遍历整个 `std::vectors` 时,需要重新命名另一个变量。C++17 消除了这一限制,使得我们可以在 `if`(或 `switch`)中完成这一操作: +在上面的代码中,我们可以看到 `itr` 这一变量是定义在整个 `main()` 的作用域内的,这导致当我们需要再次遍历整个 `std::vector` 时,需要重新命名另一个变量。C++17 消除了这一限制,使得我们可以在 `if`(或 `switch`)中完成这一操作: ```cpp // 将临时变量放到 if 语句内 @@ -228,11 +229,13 @@ int main() { } ``` -为了解决这个问题,C++11 首先把初始化列表的概念绑定到了类型上,并将其称之为 `std::initializer_list`,允许构造函数或其他函数像参数一样使用初始化列表,这就为类对象的初始化与普通数组和 POD 的初始化方法提供了统一的桥梁,例如: +为解决这个问题,C++11 首先把初始化列表的概念绑定到类型上,称其为 `std::initializer_list`,允许构造函数或其他函数像参数一样使用初始化列表,这就为类对象的初始化与普通数组和 POD 的初始化方法提供了统一的桥梁,例如: ```cpp #include #include +#include + class MagicFoo { public: std::vector vec; @@ -247,7 +250,9 @@ int main() { MagicFoo magicFoo = {1, 2, 3, 4, 5}; std::cout << "magicFoo: "; - for (std::vector::iterator it = magicFoo.vec.begin(); it != magicFoo.vec.end(); ++it) std::cout << *it << std::endl; + for (std::vector::iterator it = 
magicFoo.vec.begin(); + it != magicFoo.vec.end(); ++it) + std::cout << *it << std::endl; } ``` @@ -309,8 +314,8 @@ C++11 引入了 `auto` 和 `decltype` 这两个关键字实现了类型推导, ```cpp // 在 C++11 之前 // 由于 cbegin() 将返回 vector::const_iterator -// 所以 itr 也应该是 vector::const_iterator 类型 -for(vector::const_iterator it = vec.cbegin(); itr != vec.cend(); ++it) +// 所以 it 也应该是 vector::const_iterator 类型 +for(vector::const_iterator it = vec.cbegin(); it != vec.cend(); ++it) ``` 而有了 `auto` 之后可以: @@ -349,17 +354,22 @@ auto i = 5; // i 被推导为 int auto arr = new auto(10); // arr 被推导为 int * ``` -从 C++ 20 起,`auto` 甚至能用于函数传参,考虑下面的例子: +从 C++ 14 起,`auto` 能用于 lambda 表达式中的函数传参,而 C++ 20 起该功能推广到了一般的函数。考虑下面的例子: ```cpp -int add(auto x, auto y) { +auto add14 = [](auto x, auto y) -> int { + return x+y; +} + +int add20(auto x, auto y) { return x+y; } -auto i = 5; // 被推导为 int -auto j = 6; // 被推导为 int -std::cout << add(i, j) << std::endl; +auto i = 5; // type int +auto j = 6; // type int +std::cout << add14(i, j) << std::endl; +std::cout << add20(i, j) << std::endl; ``` > @@ -408,7 +418,7 @@ type z == type x ### 尾返回类型推导 -你可能会思考,在介绍 `auto` 时,我们已经提过 `auto` 不能用于函数形参进行类型推导,那么 `auto` 能不能用于推导函数的返回类型呢?还是考虑一个加法函数的例子,在传统 C++ 中我们必须这么写: +你可能会思考, `auto` 能不能用于推导函数的返回类型呢?还是考虑一个加法函数的例子,在传统 C++ 中我们必须这么写: ```cpp template @@ -875,7 +885,7 @@ C++11 引入了 `override` 和 `final` 这两个关键字来防止上述情形 #### override -当重载虚函数时,引入 `override` 关键字将显式的告知编译器进行重载,编译器将检查基函数是否存在这样的虚函数,否则将无法通过编译: +当重载虚函数时,引入 `override` 关键字将显式的告知编译器进行重载,编译器将检查基函数是否存在这样的其函数签名一致的虚函数,否则将无法通过编译: ```cpp struct Base { @@ -953,7 +963,7 @@ enum class new_enum : unsigned int { ```cpp if (new_enum::value3 == new_enum::value4) { - // 会输出 + // 会输出true std::cout << "new_enum::value3 == new_enum::value4" << std::endl; } ``` diff --git a/book/zh-cn/03-runtime.md b/book/zh-cn/03-runtime.md index 857d8adc..94517dc7 100644 --- a/book/zh-cn/03-runtime.md +++ b/book/zh-cn/03-runtime.md @@ -69,15 +69,15 @@ void lambda_reference_capture() { #### 3. 
隐式捕获 -手动书写捕获列表有时候是非常复杂的,这种机械性的工作可以交给编译器来处理,这时候可以在捕获列表中写一个 +手动书写捕获列表有时候是非常复杂的,这种机械性的工作可以交给编译器来处理,这时候可以在捕获列表中写一个 `&` 或 `=` 向编译器声明采用引用捕获或者值捕获. 总结一下,捕获提供了 Lambda 表达式对外部值进行使用的功能,捕获列表的最常用的四种形式可以是: - \[\] 空捕获列表 - \[name1, name2, ...\] 捕获一系列变量 -- \[&\] 引用捕获, 让编译器自行推导引用列表 -- \[=\] 值捕获, 让编译器自行推导值捕获列表 +- \[&\] 引用捕获, 从函数体内的使用确定引用捕获列表 +- \[=\] 值捕获, 从函数体内的使用确定值捕获列表 #### 4. 表达式捕获 @@ -107,11 +107,9 @@ void lambda_expression_capture() { ### 泛型 Lambda 上一节中我们提到了 `auto` 关键字不能够用在参数表里,这是因为这样的写法会与模板的功能产生冲突。 -但是 Lambda 表达式并不是普通函数,所以 Lambda 表达式并不能够模板化。 -这就为我们造成了一定程度上的麻烦:参数表不能够泛化,必须明确参数表类型。 - -幸运的是,这种麻烦只存在于 C++11 中,从 C++14 开始, -Lambda 函数的形式参数可以使用 `auto` 关键字来产生意义上的泛型: +但是 Lambda 表达式并不是普通函数,所以在没有明确指明参数表类型的情况下,Lambda 表达式并不能够模板化。 +幸运的是,这种麻烦只存在于 C++11 中,从 C++14 开始,Lambda 函数的形式参数可以使用 `auto` +关键字来产生意义上的泛型: ```cpp auto add = [](auto x, auto y) { @@ -136,7 +134,7 @@ Lambda 表达式的本质是一个和函数对象类型相似的类类型(称 #include using foo = void(int); // 定义函数类型, using 的使用见上一节中的别名语法 -void functional(foo f) { // 定义在参数列表中的函数类型 foo 被视为退化后的函数指针类型 foo* +void functional(foo f) { // 参数列表中定义的函数类型 foo 被视为退化后的函数指针类型 foo* f(1); // 通过函数指针调用函数 } @@ -193,7 +191,8 @@ int foo(int a, int b, int c) { ; } int main() { - // 将参数1,2绑定到函数 foo 上,但是使用 std::placeholders::_1 来对第一个参数进行占位 + // 将参数1,2绑定到函数 foo 上, + // 但使用 std::placeholders::_1 来对第一个参数进行占位 auto bindFoo = std::bind(foo, std::placeholders::_1, 1,2); // 这时调用 bindFoo 时,只需要提供第一个参数即可 bindFoo(1); @@ -213,34 +212,44 @@ int main() { 要弄明白右值引用到底是怎么一回事,必须要对左值和右值做一个明确的理解。 -**左值(lvalue, left value)**,顾名思义就是赋值符号左边的值。准确来说, +**左值** (lvalue, left value),顾名思义就是赋值符号左边的值。准确来说, 左值是表达式(不一定是赋值表达式)后依然存在的持久对象。 -**右值(rvalue, right value)**,右边的值,是指表达式结束后就不再存在的临时对象。 +**右值** (rvalue, right value),右边的值,是指表达式结束后就不再存在的临时对象。 而 C++11 中为了引入强大的右值引用,将右值的概念进行了进一步的划分,分为:纯右值、将亡值。 -**纯右值(prvalue, pure rvalue)**,纯粹的右值,要么是纯粹的字面量,例如 `10`, `true`; +**纯右值** (prvalue, pure rvalue),纯粹的右值,要么是纯粹的字面量,例如 `10`, `true`; 要么是求值结果相当于字面量或匿名临时对象,例如 `1+2`。非引用返回的临时变量、运算表达式产生的临时变量、 原始字面量、Lambda 表达式都属于纯右值。 -需要注意的是,字符串字面量只有在类中才是右值,当其位于普通函数中是左值。例如: 
+需要注意的是,字面量除了字符串字面量以外,均为纯右值。而字符串字面量是一个左值,类型为 `const char` 数组。例如: ```cpp -class Foo { - const char*&& right = "this is a rvalue"; // 此处字符串字面量为右值 -public: - void bar() { - right = "still rvalue"; // 此处字符串字面量为右值 - } -}; +#include int main() { - const char* const &left = "this is an lvalue"; // 此处字符串字面量为左值 + // 正确,"01234" 类型为 const char [6],因此是左值 + const char (&left)[6] = "01234"; + + // 断言正确,确实是 const char [6] 类型,注意 decltype(expr) 在 expr 是左值 + // 且非无括号包裹的 id 表达式与类成员表达式时,会返回左值引用 + static_assert(std::is_same::value, ""); + + // 错误,"01234" 是左值,不可被右值引用 + // const char (&&right)[6] = "01234"; } ``` -**将亡值(xvalue, expiring value)**,是 C++11 为了引入右值引用而提出的概念(因此在传统 C++ 中, +但是注意,数组可以被隐式转换成相对应的指针类型,而转换表达式的结果(如果不是左值引用)则一定是个右值(右值引用为将亡值,否则为纯右值)。例如: + +```cpp +const char* p = "01234"; // 正确,"01234" 被隐式转换为 const char* +const char*&& pr = "01234"; // 正确,"01234" 被隐式转换为 const char*,该转换的结果是纯右值 +// const char*& pl = "01234"; // 错误,此处不存在 const char* 类型的左值 +``` + +**将亡值** (xvalue, expiring value),是 C++11 为了引入右值引用而提出的概念(因此在传统 C++ 中, 纯右值和右值是同一个概念),也就是即将被销毁、却能够被移动的值。 将亡值可能稍有些难以理解,我们来看这样的代码: @@ -352,19 +361,19 @@ void foo() { class A { public: int *pointer; - A():pointer(new int(1)) { - std::cout << "构造" << pointer << std::endl; + A():pointer(new int(1)) { + std::cout << "构造" << pointer << std::endl; } - A(A& a):pointer(new int(*a.pointer)) { - std::cout << "拷贝" << pointer << std::endl; + A(A& a):pointer(new int(*a.pointer)) { + std::cout << "拷贝" << pointer << std::endl; } // 无意义的对象拷贝 - A(A&& a):pointer(a.pointer) { + A(A&& a):pointer(a.pointer) { a.pointer = nullptr; - std::cout << "移动" << pointer << std::endl; + std::cout << "移动" << pointer << std::endl; } - ~A(){ - std::cout << "析构" << pointer << std::endl; - delete pointer; + ~A(){ + std::cout << "析构" << pointer << std::endl; + delete pointer; } }; // 防止编译器优化 @@ -397,8 +406,8 @@ int main() { int main() { -std::string str = "Hello world."; -std::vector v; + std::string str = "Hello world."; + std::vector v; // 将使用 push_back(const T&), 即产生拷贝行为 
v.push_back(str); @@ -514,8 +523,8 @@ static_cast 传参: 右值引用 static_cast 传参: 左值引用 ``` -无论传递参数为左值还是右值,普通传参都会将参数作为左值进行转发, -所以 `std::move` 总会接受到一个左值,从而转发调用了`reference(int&&)` 输出右值引用。 +无论传递参数为左值还是右值,普通传参都会将参数作为左值进行转发; +由于类似的原因,`std::move` 总会接受到一个左值,从而转发调用了`reference(int&&)` 输出右值引用。 唯独 `std::forward` 即没有造成任何多余的拷贝,同时**完美转发**(传递)了函数的实参给了内部调用的其他函数。 @@ -541,7 +550,7 @@ constexpr _Tp&& forward(typename std::remove_reference<_Tp>::type&& __t) noexcep ``` 在这份实现中,`std::remove_reference` 的功能是消除类型中的引用, -而 `std::is_lvalue_reference` 用于检查类型推导是否正确,在 `std::forward` 的第二个实现中 +`std::is_lvalue_reference` 则用于检查类型推导是否正确,在 `std::forward` 的第二个实现中 检查了接收到的值确实是一个左值,进而体现了坍缩规则。 当 `std::forward` 接受左值时,`_Tp` 被推导为左值,所以返回值为左值;而当其接受右值时, diff --git a/book/zh-cn/05-pointers.md b/book/zh-cn/05-pointers.md index b88428a9..ef7467f7 100644 --- a/book/zh-cn/05-pointers.md +++ b/book/zh-cn/05-pointers.md @@ -19,15 +19,15 @@ order: 5 也就是我们常说的 RAII 资源获取即初始化技术。 凡事都有例外,我们总会有需要将对象在自由存储上分配的需求,在传统 C++ 里我们只好使用 `new` 和 `delete` 去 -『记得』对资源进行释放。而 C++11 引入智能指针的概念,使用引用计数的想法,让程序员不再需要关心手动释放内存。 -这些智能指针就包括 `std::shared_ptr`/`std::unique_ptr`/`std::weak_ptr`,使用它们需要包含头文件 ``。 +『记得』对资源进行释放。而 C++11 引入了智能指针的概念,使用了引用计数的想法,让程序员不再需要关心手动释放内存。 +这些智能指针包括 `std::shared_ptr`/`std::unique_ptr`/`std::weak_ptr`,使用它们需要包含头文件 ``。 > 注意:引用计数不是垃圾回收,引用计数能够尽快收回不再被使用的对象,同时在回收的过程中也不会造成长时间的等待, > 更能够清晰明确的表明资源的生命周期。 ## 5.2 `std::shared_ptr` -`std::shared_ptr` 是一种智能指针,它能够记录多少个 `shared_ptr` 共同指向一个对象,从而消除显式的调用 +`std::shared_ptr` 是一种智能指针,它能够记录多少个 `shared_ptr` 共同指向一个对象,从而消除显式的调用 `delete`,当引用计数变为零的时候就会将对象自动删除。 但还不够,因为使用 `std::shared_ptr` 仍然需要使用 `new` 来调用,这使得代码出现了某种程度上的不对称。 @@ -59,21 +59,23 @@ int main() { auto pointer = std::make_shared(10); auto pointer2 = pointer; // 引用计数+1 auto pointer3 = pointer; // 引用计数+1 -int *p = pointer.get(); // 这样不会增加引用计数 -std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 3 +int *p = pointer.get(); // 这样不会增加引用计数 +std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 3 std::cout << 
"pointer2.use_count() = " << pointer2.use_count() << std::endl; // 3 std::cout << "pointer3.use_count() = " << pointer3.use_count() << std::endl; // 3 pointer2.reset(); std::cout << "reset pointer2:" << std::endl; -std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 2 -std::cout << "pointer2.use_count() = " << pointer2.use_count() << std::endl; // 0, pointer2 已 reset +std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 2 +std::cout << "pointer2.use_count() = " + << pointer2.use_count() << std::endl; // pointer2 已 reset; 0 std::cout << "pointer3.use_count() = " << pointer3.use_count() << std::endl; // 2 pointer3.reset(); std::cout << "reset pointer3:" << std::endl; -std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 1 +std::cout << "pointer.use_count() = " << pointer.use_count() << std::endl; // 1 std::cout << "pointer2.use_count() = " << pointer2.use_count() << std::endl; // 0 -std::cout << "pointer3.use_count() = " << pointer3.use_count() << std::endl; // 0, pointer3 已 reset +std::cout << "pointer3.use_count() = " + << pointer3.use_count() << std::endl; // pointer3 已 reset; 0 ``` ## 5.3 `std::unique_ptr` diff --git a/book/zh-cn/06-regex.md b/book/zh-cn/06-regex.md index b24da50b..a4c6597e 100644 --- a/book/zh-cn/06-regex.md +++ b/book/zh-cn/06-regex.md @@ -94,7 +94,8 @@ C++11 提供的正则表达式库操作 `std::string` 对象, int main() { std::string fnames[] = {"foo.txt", "bar.txt", "test", "a0.txt", "AAA.txt"}; - // 在 C++ 中 \ 会被作为字符串内的转义符,为使 \. 作为正则表达式传递进去生效,需要对 \ 进行二次转义,从而有 \\. + // 在 C++ 中 \ 会被作为字符串内的转义符, + // 为使 \. 作为正则表达式传递进去生效,需要对 \ 进行二次转义,从而有 \\. 
std::regex txt_regex("[a-z]+\\.txt"); for (const auto &fname: fnames) std::cout << fname << ": " << std::regex_match(fname, txt_regex) << std::endl; @@ -103,7 +104,7 @@ int main() { 另一种常用的形式就是依次传入 `std::string`/`std::smatch`/`std::regex` 三个参数, 其中 `std::smatch` 的本质其实是 `std::match_results`。 -在标准库中, `std::smatch` 被定义为了 `std::match_results`, +故而在标准库的实现中, `std::smatch` 被定义为了 `std::match_results`, 也就是一个子串迭代器类型的 `match_results`。 使用 `std::smatch` 可以方便的对匹配的结果进行获取,例如: @@ -193,16 +194,23 @@ protected: template void start_server(SERVER_TYPE &server) { - // process GET request for /match/[digit+numbers], e.g. GET request is /match/abc123, will return abc123 - server.resource["fill_your_reg_ex"]["GET"] = [](ostream& response, Request& request) { + // process GET request for /match/[digit+numbers], + // e.g. GET request is /match/abc123, will return abc123 + server.resource["fill_your_reg_ex"]["GET"] = + [](ostream& response, Request& request) + { string number=request.path_match[1]; - response << "HTTP/1.1 200 OK\r\nContent-Length: " << number.length() << "\r\n\r\n" << number; + response << "HTTP/1.1 200 OK\r\nContent-Length: " + << number.length() << "\r\n\r\n" << number; }; - // peocess default GET request; anonymous function will be called if no other matches - // response files in folder web/ + // peocess default GET request; + // anonymous function will be called + // if no other matches response files in folder web/ // default: index.html - server.default_resource["fill_your_reg_ex"]["GET"] = [](ostream& response, Request& request) { + server.default_resource["fill_your_reg_ex"]["GET"] = + [](ostream& response, Request& request) + { string filename = "www/"; string path = request.path_match[1]; diff --git a/book/zh-cn/07-thread.md b/book/zh-cn/07-thread.md index 6a402bc9..3c59a72d 100644 --- a/book/zh-cn/07-thread.md +++ b/book/zh-cn/07-thread.md @@ -11,7 +11,7 @@ order: 7 ## 7.1 并行基础 `std::thread` 用于创建一个执行的线程实例,所以它是一切并发编程的基础,使用时需要包含 `` 头文件, -它提供了很多基本的线程操作,例如 `get_id()` 
来获取所创建线程的线程 ID,使用 `join()` 来加入一个线程等等,例如: +它提供了很多基本的线程操作,例如 `get_id()` 来获取所创建线程的线程 ID,使用 `join()` 来等待一个线程结束(与该线程汇合)等等,例如: ```cpp #include @@ -31,11 +31,12 @@ int main() { 我们在操作系统、亦或是数据库的相关知识中已经了解过了有关并发技术的基本知识,`mutex` 就是其中的核心之一。 C++11 引入了 `mutex` 相关的类,其所有相关的函数都放在 `` 头文件中。 -`std::mutex` 是 C++11 中最基本的 `mutex` 类,通过实例化 `std::mutex` 可以创建互斥量, +`std::mutex` 是 C++11 中最基本的互斥量类,可以通过构造 `std::mutex` 对象创建互斥量, 而通过其成员函数 `lock()` 可以进行上锁,`unlock()` 可以进行解锁。 但是在实际编写代码的过程中,最好不去直接调用成员函数, 因为调用成员函数就需要在每个临界区的出口处调用 `unlock()`,当然,还包括异常。 -这时候 C++11 还为互斥量提供了一个 RAII 语法的模板类 `std::lock_guard`。 +而 C++11 为互斥量提供了一个 RAII 机制的模板类 `std::lock_guard`。 + RAII 在不失代码简洁性的同时,很好的保证了代码的异常安全性。 在 RAII 用法下,对于临界区的互斥量的创建只需要在作用域的开始部分,例如: @@ -68,9 +69,11 @@ int main() { ``` 由于 C++ 保证了所有栈对象在生命周期结束时会被销毁,所以这样的代码也是异常安全的。 -无论 `critical_section()` 正常返回、还是在中途抛出异常,都会引发堆栈回退,也就自动调用了 `unlock()`。 +无论 `critical_section()` 正常返回、还是在中途抛出异常,都会引发栈回溯,也就自动调用了 `unlock()`。 + +> 没有捕获抛出的异常(此时由实现定义是否进行栈回溯)。 -而 `std::unique_lock` 则相对于 `std::lock_guard` 出现的,`std::unique_lock` 更加灵活, +而 `std::unique_lock` 则是相对于 `std::lock_guard` 出现的,`std::unique_lock` 更加灵活, `std::unique_lock` 的对象会以独占所有权(没有其他的 `unique_lock` 对象同时拥有某个 `mutex` 对象的所有权) 的方式管理 `mutex` 对象上的上锁和解锁的操作。所以在并发编程中,推荐使用 `std::unique_lock`。 @@ -147,7 +150,8 @@ int main() { std::cout << "waiting..."; result.wait(); // 在此设置屏障,阻塞到期物的完成 // 输出执行结果 - std::cout << "done!" << std:: endl << "future result is " << result.get() << std::endl; + std::cout << "done!" 
<< std:: endl << "future result is " + << result.get() << std::endl; return 0; } ``` @@ -159,7 +163,7 @@ int main() { 条件变量 `std::condition_variable` 是为了解决死锁而生,当互斥操作不够用而引入的。 比如,线程可能需要等待某个条件为真才能继续执行, 而一个忙等待循环中可能会导致所有其他线程都无法进入临界区使得条件为真时,就会发生死锁。 -所以,`condition_variable` 实例被创建出现主要就是用于唤醒等待线程从而避免死锁。 +所以,`condition_variable` 对象被创建出现主要就是用于唤醒等待线程从而避免死锁。 `std::condition_variable`的 `notify_one()` 用于唤醒一个线程; `notify_all()` 则是通知所有线程。下面是一个生产者和消费者模型的例子: @@ -198,7 +202,8 @@ int main() { } // 短暂取消锁,使得生产者有机会在消费者消费空前继续生产 lock.unlock(); - std::this_thread::sleep_for(std::chrono::milliseconds(1000)); // 消费者慢于生产者 + // 消费者慢于生产者 + std::this_thread::sleep_for(std::chrono::milliseconds(1000)); lock.lock(); while (!produced_nums.empty()) { std::cout << "consuming " << produced_nums.front() << std::endl; @@ -230,7 +235,7 @@ int main() { ## 7.5 原子操作与内存模型 细心的读者可能会对前一小节中生产者消费者模型的例子可能存在编译器优化导致程序出错的情况产生疑惑。 -例如,布尔值 `notified` 没有被 `volatile` 修饰,编译器可能对此变量存在优化,例如将其作为一个寄存器的值, +例如,编译器可能对变量 `notified` 存在优化,例如将其作为一个寄存器的值, 从而导致消费者线程永远无法观察到此值的变化。这是一个好问题,为了解释清楚这个问题,我们需要进一步讨论 从 C++ 11 起引入的内存模型这一概念。我们首先来看一个问题,下面这段代码输出结果是多少? 
@@ -240,7 +245,7 @@ int main() { int main() { int a = 0; - int flag = 0; + volatile int flag = 0; std::thread t1([&]() { while (flag != 1); @@ -260,7 +265,7 @@ int main() { } ``` -从直观上看,`t2` 中 `a = 5;` 这一条语句似乎总在 `flag = 1;` 之前得到执行,而 `t1` 中 `while (flag != 1)` +从直观上看,`t2` 中 `a = 5;` 这一条语句似乎总在 `flag = 1;` 之前得到执行,而 `t1` 中 `while (flag != 1)` 似乎保证了 `std::cout << "b = " << b << std::endl;` 不会再标记被改变前执行。从逻辑上看,似乎 `b` 的值应该等于 5。 但实际情况远比此复杂得多,或者说这段代码本身属于未定义的行为,因为对于 `a` 和 `flag` 而言,他们在两个并行的线程中被读写, 出现了竞争。除此之外,即便我们忽略竞争读写,仍然可能受 CPU 的乱序执行,编译器对指令的重排的影响, @@ -277,9 +282,8 @@ int main() { 这是一组非常强的同步条件,换句话说当最终编译为 CPU 指令时会表现为非常多的指令(我们之后再来看如何实现一个简单的互斥锁)。 这对于一个仅需原子级操作(没有中间态)的变量,似乎太苛刻了。 -关于同步条件的研究有着非常久远的历史,我们在这里不进行赘述。读者应该明白,在现代 CPU 体系结构下提供了 CPU 指令级的原子操作, -因此在 C++11 中多线程下共享变量的读写这一问题上,还引入了 `std::atomic` 模板,使得我们实例化一个原子类型,将一个 -原子类型读写操作从一组指令,最小化到单个 CPU 指令。例如: +关于同步条件的研究有着非常久远的历史,我们在这里不进行赘述。读者应该明白,现代 CPU 体系结构提供了 CPU 指令级的原子操作, +因此在多线程下共享变量的读写这一问题上, C++11 中还引入了 `std::atomic` 模板,使得我们能实例化原子类型,并将一个原子写操作从一组指令,最小化到单个 CPU 指令。例如: ```cpp std::atomic counter; @@ -311,7 +315,7 @@ int main() { } ``` -当然,并非所有的类型都能提供原子操作,这是因为原子操作的可行性取决于 CPU 的架构以及所实例化的类型结构是否满足该架构对内存对齐 +当然,并非所有的类型都能提供原子操作,这是因为原子操作的可行性取决于具体的 CPU 架构,以及所实例化的类型结构是否能够满足该 CPU 架构对内存对齐 条件的要求,因而我们总是可以通过 `std::atomic::is_lock_free` 来检查该原子类型是否需支持原子操作,例如: ```cpp @@ -408,7 +412,7 @@ int main() { y.load() c = a + b x.store(3) ``` - 上面给出的三种例子都是属于因果一致的,因为整个过程中,只有 `c` 对 `a` 和 `b` 产生依赖,而 `x` 和 `y` + 上面给出的三种例子都是属于因果一致的,因为整个过程中,只有 `c` 对 `a` 和 `b` 产生依赖,而 `x` 和 `y` 在此例子中表现为没有关系(但实际情况中我们需要更详细的信息才能确定 `x` 与 `y` 确实无关) 4. 
最终一致性:是最弱的一致性要求,它只保障某个操作在未来的某个时间节点上会被观察到,但并未要求被观察到的时间。因此我们甚至可以对此条件稍作加强,例如规定某个操作被观察到的时间总是有界的。当然这已经不在我们的讨论范围之内了。 @@ -419,7 +423,7 @@ int main() { T2 ---------+------------+--------------------+--------+--------> - x.read() x.read() x.read() x.read() + x.read x.read() x.read() x.read() ``` 在上面的情况中,如果我们假设 x 的初始值为 0,则 `T2` 中四次 `x.read()` 结果可能但不限于以下情况: @@ -428,7 +432,8 @@ int main() { 3 4 4 4 // x 的写操作被很快观察到 0 3 3 4 // x 的写操作被观察到的时间存在一定延迟 0 0 0 4 // 最后一次读操作读到了 x 的最终值,但此前的变化并未观察到 - 0 0 0 0 // 在当前时间段内 x 的写操作均未被观察到,但未来某个时间点上一定能观察到 x 为 4 的情况 + 0 0 0 0 // 在当前时间段内 x 的写操作均未被观察到, + // 但未来某个时间点上一定能观察到 x 为 4 的情况 ``` ### 内存顺序 @@ -489,9 +494,8 @@ int main() { }); std::thread acqrel([&]() { int expected = 1; // must before compare_exchange_strong - while(!flag.compare_exchange_strong(expected, 2, std::memory_order_acq_rel)) { + while(!flag.compare_exchange_strong(expected, 2, std::memory_order_acq_rel)) expected = 1; // must after compare_exchange_strong - } // flag has changed to 2 }); std::thread acquire([&]() { @@ -504,7 +508,7 @@ int main() { acquire.join(); ``` - 在此例中我们使用了 `compare_exchange_strong`,它便是比较交换原语(Compare-and-swap primitive),它有一个更弱的版本,即 `compare_exchange_weak`,它允许即便交换成功,也仍然返回 `false` 失败。其原因是因为在某些平台上虚假故障导致的,具体而言,当 CPU 进行上下文切换时,另一线程加载同一地址产生的不一致。除此之外,`compare_exchange_strong` 的性能可能稍差于 `compare_exchange_weak`,但大部分情况下,`compare_exchange_strong` 应该被有限考虑。 + 在此例中我们使用了 `compare_exchange_strong` 比较交换原语(Compare-and-swap primitive),它有一个更弱的版本,即 `compare_exchange_weak`,它允许即便交换成功,也仍然返回 `false` 失败。其原因是因为在某些平台上虚假故障导致的,具体而言,当 CPU 进行上下文切换时,另一线程加载同一地址产生的不一致。除此之外,`compare_exchange_strong` 的性能可能稍差于 `compare_exchange_weak`,但大部分情况下,鉴于其使用的复杂度而言,`compare_exchange_weak` 应该被有限考虑。 4. 
顺序一致模型:在此模型下,原子操作满足顺序一致性,进而可能对性能产生损耗。可显式的通过 `std::memory_order_seq_cst` 进行指定。最后来看一个例子:
@@ -553,7 +557,7 @@ C++11 语言层提供了并发编程的相关支持,本节简单的介绍了 `
 ## 进一步阅读的参考资料
-- [C++ 并发编程\(中文版\)](https://www.gitbook.com/book/chenxiaowei/cpp_concurrency_in_action/details)
+- [C++ 并发编程\(中文版\)](https://book.douban.com/subject/26386925/)
 - [线程支持库文档](https://en.cppreference.com/w/cpp/thread)
 - Herlihy, M. P., & Wing, J. M. (1990). Linearizability: a correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3), 463–492. https://doi.org/10.1145/78969.78972
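
A supplementary note on the `compare_exchange_weak` wording touched by the `07-thread.md` hunks above: because the weak form is allowed to fail spuriously, it is normally called inside a retry loop, which is part of what makes it harder to use than `compare_exchange_strong`. The following is only a minimal sketch under stated assumptions; the helper name `fetch_multiply` is hypothetical and is neither part of the standard library nor taken from the book.

```cpp
#include <atomic>

// Hypothetical helper (an assumption for illustration, not a standard API):
// atomically replaces `value` with `value * factor` and returns the old value.
int fetch_multiply(std::atomic<int>& value, int factor) {
    // Start from the currently stored value.
    int expected = value.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail even when `value == expected`
    // (a spurious failure), so it is retried in a loop.
    // On every failure it writes the value it observed into `expected`,
    // so the next attempt recomputes the product from fresh data.
    while (!value.compare_exchange_weak(expected, expected * factor,
                                        std::memory_order_acq_rel,
                                        std::memory_order_relaxed)) {
        // Nothing to do here: `expected` has already been refreshed.
    }
    // `expected` now holds the value that was successfully replaced.
    return expected;
}
```

A caller might write `std::atomic<int> v{3}; fetch_multiply(v, 4);`, after which `v` would hold 12. With `compare_exchange_strong` the same loop would still be correct; the weak form merely permits the extra spurious retries in exchange for potentially cheaper code on some platforms.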