Skip to content

[conv.lval] Make note and generalize comment on UB CWG2899 #7051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Eisenwave
Copy link
Contributor

The last sentence ([conv.lval] p3.4) is redundant at best, and defective at worst:

Otherwise, the object indicated by the glvalue is read ([defns.access]), and the value contained in the object is the prvalue result. If the result is an erroneous value ([basic.indet]) and the bits in the value representation are not valid for the object's type, the behavior is undefined.

It is misleading to single out erroneous values because this happens with any value (or lack thereof) in general. Consider the following example:

char c = 2;
bool b;
std::memcpy(&b, &c, 1);
bool u = b;

We could consider memcpy to implicitly begin the lifetime of a new bool object within the storage of b, and because b is transparently replaceable by such an object, this is almost valid. However, because no value of bool corresponds to a value representation of 0x02 (at least in the Itanium ABI), the behavior is undefined because the lvalue-to-rvalue conversion of b reads the value of b, but no such value exists.

The current wording only mentions erroneous values, and leads the reader to believe that only erroneous values can lead to value representations that don't correspond to any value, but this is not accurate.

To fix this, I've generalized the wording so that it doesn't single out erroneous values anymore, and made it into a note, since it's already UB based on the wording in [defns.access].

At worst, the current wording is defective because it talks about the "result" of the lvalue-to-rvalue conversion, but there cannot possibly be a result if the access is UB already.

@jensmaurer
Copy link
Member

jensmaurer commented Jun 5, 2024

"We could consider memcpy to implicitly begin the lifetime of a new bool..."

memcpy does not implicitly create objects. (corrected by Jan below) You're scribbling over the object representation of "b", which ends the lifetime of "b", and any access to "b" is thus undefined behavior.

Unless you show a real example where the situation can happen outside of erroneous values, this change is rejected. When CWG reviewed the wording not too long ago, we could not come up with an example. We believe we have sufficient wording elsewhere so that we cannot otherwise be in a valid situation where we try to read a bool from random garbage bytes.

Also, this changes the normative content of the standard (thus can't be an editorial change); I'm not seeing that [defns.access] by itself would somehow cause undefined behavior.

@Eisenwave
Copy link
Contributor Author

Eisenwave commented Jun 5, 2024

memcpy does not implicitly create objects.

[cstring.syn] p3 literally says:

Both functions [memcpy and memmove] implicitly create objects [...]

Unless you show a real example where the situation can happen outside of erroneous values

I have shown that example. The only reason why that code is UB is because 0x02 may not be a valid representation that corresponds to true or false. If it was, then the example would be well-defined. This is exactly the same reason why erroneous valued value representations may or may not correspond to a value for the type.

@jensmaurer
Copy link
Member

If we have another situation where a bad bit pattern can end up in a value representation and we can read from it before hitting undefined behavior earlier, the right fix is to strike the "erroneous value" part of the normative phrasing. That's not editorial, of course.

@Eisenwave
Copy link
Contributor Author

Eisenwave commented Jun 5, 2024

That's not editorial, of course.

I see it as editorial because it is already undefined behavior given that there is no value that can be read by lvalue-to-rvalue conversion in the first place. If the second sentence is redundant and any such case is already UB, it could be deleted (or turned into a note) and it wouldn't change the semantics of the standard or impact any implementation.

I guess this is getting too controversial for CWG not to see it though, and I should open a CWG issue about this.

@frederick-vs-ja
Copy link
Contributor

Even memcpy can implicitly create objects, per [intro.object] p11 it may create zero object and leave existing objects living. And the UB doesn't seem to depend on whether a bool object is (re-)created or the existing bool object merely gets its value representation replaced. It does only depend on a (probably) wrong object/value representation, IIUC.

FYI P2624 seemingly wants to eliminate such UB.

@jensmaurer
Copy link
Member

CWG2899

@jensmaurer jensmaurer changed the title [conv.lval] Make note and generalize comment on UB [conv.lval] Make note and generalize comment on UB CWG2899 Jun 6, 2024
@tkoeppe
Copy link
Contributor

tkoeppe commented Dec 15, 2024

The core issue has been approved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants