Skip to content

bpo-40686: Fix compiler warnings on _zoneinfo.c #23614

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 16, 2020
Merged

bpo-40686: Fix compiler warnings on _zoneinfo.c #23614

merged 1 commit into from
Dec 16, 2020

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Dec 2, 2020

"uint8_t day" is unsigned and so "day < 0" test is always true.
Remove the test to fix compiler the following warnings on Windows:

modules_zoneinfo.c(1224): warning C4068: unknown pragma
modules_zoneinfo.c(1225): warning C4068: unknown pragma
modules_zoneinfo.c(1227): warning C4068: unknown pragma

https://bugs.python.org/issue40686

@vstinner vstinner changed the title bpo-40686! Fix compiler warnings on _zoneinfo.c bpo-40686: Fix compiler warnings on _zoneinfo.c Dec 2, 2020
@vstinner
Copy link
Member Author

vstinner commented Dec 2, 2020

Here is my very pragmatic solution to fix _zoneinfo.c warnings.

The purpose of this change is to fix the 3 last compielr warnings on Windows, so it will be easier to spot new warnings.

I plan to merge this PR at the end of the week, unless someone really wants a more complex solution for a non-problem (day parameter is unsigned, is cannot be negative...).

cc @pganssle @corona10 @serhiy-storchaka

@vstinner
Copy link
Member Author

vstinner commented Dec 2, 2020

I plan to merge this PR at the end of the week, unless someone really wants a more complex solution for a non-problem (day parameter is unsigned, is cannot be negative...).

Hum. Well, as I explained in previous PRs, I understand the problem, but the C language doesn't provide an easy way to check if a type is unsigned or not. The main issue is to get the type of a variable, there is no simple solution for that: see PR #20624. Python tries to support all C compilers, so we cannot easily rely on "recent" (!) C language features, or compiler extensions.

I consider that adding 20 to 30 lines of complex macros just in case if someone changes the day parameter type for fun, without checking if day is negative, is not worth it.

I replaced the complex compiler pragmas and the long comment with:

    // If the 'day' parameter type is changed to a signed type,
    // "day < 0" check must be added.
    if (day > 6) {
        PyErr_Format(PyExc_ValueError, "Day must be in [0, 6]");
        return -1;
    }

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix also the declaration of zone_from_strong_cache as in #20619.

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a pragmatic solution because it's not a solution. It doesn't solve the core problem, which is that logically these are not the right bounds to check. The only reason you are doing this is because some compilers emit erroneous warnings. Please solve the erroneous warnings problem rather than changing the code.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

"uint8_t day" is unsigned and so "day < 0" test is always true.
Remove the test to fix compiler the following warnings on Windows:

modules\_zoneinfo.c(1224): warning C4068: unknown pragma
modules\_zoneinfo.c(1225): warning C4068: unknown pragma
modules\_zoneinfo.c(1227): warning C4068: unknown pragma
@vstinner
Copy link
Member Author

vstinner commented Dec 3, 2020

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@pganssle: please review the changes made to this pull request.

@bedevere-bot bedevere-bot requested a review from pganssle December 3, 2020 08:14
@vstinner
Copy link
Member Author

vstinner commented Dec 3, 2020

It doesn't solve the core problem, which is that logically these are not the right bounds to check.

In C, when a type is unsigned, it cannot be negative. So day < 0 condition is always true. Compilers emit a warning on such condiiton, and such false alarm is annoying since there is a risk of missing real bugs if Python emits too many compiler warnings.

Let's agree to disagree that no checking explicitly that day type is signed or ensure that day >= 0 is not an issue according to me, but it matters to you. C is not Rust, the language is old and as I explained, we cannot use recent C features because we care of maximum platform compatibility.

Different solutions have been discussed, but there is a disagreement on them.

@vstinner
Copy link
Member Author

vstinner commented Dec 3, 2020

@serhiy-storchaka: If you like this approach, would you mind to "officially" approve my PR? There is a disagreement and I would prefer to get a least one approval on my PR before merging it.

@vstinner
Copy link
Member Author

vstinner commented Dec 3, 2020

I changed my PR to keep the day < 0 check... as a comment, to make it even more explicit:

    // If the 'day' parameter type is changed to a signed type,
    // "day < 0" check must be added.
    if (/* day < 0 || */ day > 6) {

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +1 on Victor's approach
I want to know there is any possibility of uint8_t can be a negative value.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Let's get this over with!

@serhiy-storchaka serhiy-storchaka added the needs backport to 3.9 only security fixes label Dec 3, 2020
@serhiy-storchaka
Copy link
Member

I feel very uncomfortable that we are going to commit changes to which one of the core developers and the author of this code is in opposition. But the existing code contains obvious errors and we need to fix them. I hope Paul understands our motives.

@corona10
Copy link
Member

corona10 commented Dec 3, 2020

I hope Paul understands our motives.

Same here ;) I always respect Paul's insight and contribution!

@pganssle
Copy link
Member

pganssle commented Dec 3, 2020

I already had a working solution months ago and after going through several rounds of review, Victor suddenly changed his mind about it. It seemed clear at that time that this was not important.

This is not an "obvious error", it is simply an overzealous linter. If we're going to impose the condition that any compiler cannot generate any warning, we need a way to disable those warnings that is considerably less burdensome than this one, in my opinion.

Copy link
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are useless for this, for reasons I've been very clear about.

Obviously this won't be backported, which means committing this will make it harder to maintain the backport. If you are going to remove the lower bounds check, please do it in a way that is robust to changes in the code.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@vstinner
Copy link
Member Author

vstinner commented Dec 3, 2020

If you are going to remove the lower bounds check, please do it in a way that is robust to changes in the code.

Why are you making the assumption that developers are stupid and change randomly types of a parameter without thinking at the impact of their change? Why not making the assumption that if someone decides to change the type, they will review the code and check if it can introduce a change? Well, I don't see why someone would like to change the parameter type in the first place.

We have reviews on our workflow. If a bug is not noticed during, it can be fixed later. I don't think that it's a big deal.

Obviously this won't be backported

Why not? I can backport it to 3.9.

@serhiy-storchaka
Copy link
Member

The current code uses the compiler-specific pragma without checking the compiler. This is an error which should be fixed. After fixing this error we can discuss how to guard against hypothetical future errors.

@serhiy-storchaka
Copy link
Member

The alternative is reverting changes which introduced that error until we find a compromise. I.e. removing the _zoneinfo module. But I think it is too late for this.

@pganssle
Copy link
Member

pganssle commented Dec 3, 2020

The current code uses the compiler-specific pragma without checking the compiler. This is an error which should be fixed. After fixing this error we can discuss how to guard against hypothetical future errors.

I already have working code in #20624 that both removes the pragma and guards against the relevant errors. We should either use that or @corona10's solution with the absurdly long stack of pragmas. It did not seem a high priority when I tried to fix this months ago and was given the runaround, so I don't see why the big rush now.

Why are you making the assumption that developers are stupid and change randomly types of a parameter without thinking at the impact of their change? Why not making the assumption that if someone decides to change the type, they will review the code and check if it can introduce a change? Well, I don't see why someone would like to change the parameter type in the first place.

I could easily imagine myself causing this problem, so I don't love the implication that you think I'd have to be "stupid" to cause it.

The reason why we shouldn't change this is that this is one line that is very disconnected from the places you'd change this parameter type. Even if you only changed the parameter type in this function alone (and its callers), it wouldn't even show up on the diff, which means it's much less likely that someone would go through and notice that the type itself makes up part of the logical bounds of this particular check.

And why should they need to? We have a way to tell the compiler exactly what the logical bounds are —- by including them in the if clause. We also have the ability to tell the compiler that the code in question relies on the fact that it's an unsigned type and immediately alert any human changing the types that there will be a problem. There is no reason to rely on a human to do work a computer can do.

Also, commonly types will not change "randomly", but as part of some larger refactoring of the code. It is not at all unreasonable to imagine a future where we switch to all signed types for one of these structs, which means that not only would you need to personally know in detail all the ways that changing the types could break in one function, but potentially in all the code. This is one reason why we have tests and build redundancy in our code — so that we can refactor with more confidence knowing that we don't actually understand the full implications of our changes, but we've hopefully engineered things so that it won't be a problem (or it will fail loudly before it hits the users).

You are asking to make our code worse and increase the burden for checking thousands of lines of C code because this particular right thing to do kinda looks like a thing that is often the wrong thing to do. I think that is unreasonable.

Why not? I can backport it to 3.9.

I meant it won't go into backports.zoneinfo, which is already difficult enough to maintain.

@corona10
Copy link
Member

corona10 commented Dec 3, 2020

@pganssle @vstinner @serhiy-storchaka

I compared the assembly code by using the part of the code.
But it looks like there is no difference between negative checking and nonchecking code on GCC/clang with optimization option.
(I didn't check all of the optimization options)

https://godbolt.org/z/Wqz9hh

@pganssle
Copy link
Member

pganssle commented Dec 3, 2020

I compared the assembly code by using the part of the code.
But it looks like there is no difference between negative checking and nonchecking code on GCC/clang with optimization option.
(I didn't check all of the optimization options)

Yes, I would be surprised if there were any difference, because I expect the compiler to know that in this case it's not possible for the value to be <0 and elide that check. Even if it didn't elide the check, I don't think it would matter anyway, since this would not be anywhere close to the bottleneck in performance here.

The difference between including the < 0 and not including it is that the actual logical bounds here are day is expected to be in [0, 6], which is properly captured by erroring on day < 0 || day > 6. Because day happens to be a uint8_t, day < 0 cannot occur, and so day > 6 is sufficient. If, however, we ever switched to int8_t or some other signed type for some reason (e.g. to add error codes or something), suddenly it would not be appropriate to elide the day < 0 portion and the same code would start emitting different ouputs. Only the one with day < 0 would be correct.

I understand the value of keeping the code warning-free. The warnings are usually valuable. Even this warning about not comparing < 0 is valuable, because frequently it occurs when someone doesn't realize that the variable they are testing is unsigned, and they are relying on it being signed for some reason. However, I think we need a way to indicate that "Yes, I acknowledge that bugs like that exist, but this is not one of them."

Long term, I'd like to explore options for us actually disabling errors in sections of the code in a way that's not horribly cumbersome, because I think that's really the only reasonable way to use a linter like that. In the short term, I'm satisfied with adding static assertions about the signedness of types if we're going to be relying on them (or, to a lesser extent, adding tests that would hit the error condition if it's possible to do so from the public API).

@serhiy-storchaka
Copy link
Member

Maybe merge #20619 as compromise?

#20624 is more complex, and I am not sure about its portability.

@vstinner
Copy link
Member Author

vstinner commented Dec 3, 2020

Serhiy:

Maybe merge #20619 as compromise?

PR #20619 fix the warning on Windows: https://bugs.python.org/issue40686 It disables the warning when using GCC or clang.

But other C compiler (Visual Studio: MSC, XLC? ICC?) that Python supports may also emit a compiler warning (today or tomorrow). This approach is not future-proof, Python may support more C (or C++?) compilers tomorrow, or the pragma might need to be adapted.

The pragmas are used to hide this GCC warning:

Modules/_zoneinfo.c: In function ‘calendarrule_new’:
Modules/_zoneinfo.c:1222:13: warning: comparison is always false due to limited range of data type [-Wtype-limits]

My PR removes the test which causes the warning.

@pganssle
Copy link
Member

pganssle commented Dec 3, 2020

The pragmas are used to hide this GCC warning:
...
My PR removes the test which causes the warning.

Yes, but the problem is that the warning is erroneous. There is nothing actually wrong with the check, because the test accurately describes the logical bounds of the check in a way the compiler can understand. The compiler can realize that one of the preconditions is already met because of the nature of the type and optimize it away if necessary.

The warning exists because these "practically unnecessary" tests are often a sign that you've made a mistake because you are expecting a signed value and not an unsigned value, like this:

void f(uint8_t value) {
    if (value < 0) {
        // This indicates that the interface may be expecting -1 for an error or something,
        // but may not realize that due to underflow `value < 0` will never match.
        printf("An error has occurred!\n");
        return;
    }
    printf("The value is %d\n", value);
}

It's useful to have the warning in cases where the conditional is pointless, but when it's simply redundant and particularly when it's deliberate, these warnings are actively encouraging bad code. This is one reason why they are warnings and not errors. I personally prefer the static assertion approach to the one with all the excess pragmas, but the only one I'm actively opposed to is the one where we make the code worse to shut up the linter.

#20624 is more complex, and I am not sure about its portability.

I think it doesn't work on every platform, but IMO it doesn't need to work on every platform as long as it doesn't generate errors or warnings on those platforms. IIRC I wrote it in such a way that on platforms where the static assertion mechanism I used is unsupported, it falls back to a no-op (effectively equivalent to Victor's code). I'm fine with that because as long as the compilation fails CI on some supported platforms, we'll be alerted to any bugs automatically, which is the whole point of the static assertion code.

The "huge pile of pragmas" approach makes the code hard to read, which is the downside there. If we could move the pragmas into a macro somewhere (like I was able to do with the static assertion code), that would be my preferred solution (since it would presumably be reusable), but I think the issue is with portability and the fact that not all platforms are likely to support pragmas in macros. Unlike the static assertion code, where we only really need the static assertion code to work properly on one platform that we test on, the "suppress the error" code needs to work on all the platforms.

@vstinner vstinner merged commit aefb69b into python:master Dec 16, 2020
@miss-islington
Copy link
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@vstinner vstinner deleted the fix_zoneinfo_warn branch December 16, 2020 15:26
@bedevere-bot
Copy link

GH-23804 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Dec 16, 2020
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Dec 16, 2020
"uint8_t day" is unsigned and so "day < 0" test is always true.
Remove the test to fix the following warnings on Windows:

modules\_zoneinfo.c(1224): warning C4068: unknown pragma
modules\_zoneinfo.c(1225): warning C4068: unknown pragma
modules\_zoneinfo.c(1227): warning C4068: unknown pragma
(cherry picked from commit aefb69b)

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner
Copy link
Member Author

vstinner commented Dec 16, 2020

While Paul, the author of _zoneinfo.c, disagrees with this solution, bpo-40686 is open since May and no better solution was approved.

This PR was approved by two other core developers. As @serhiy-storchaka wrote, there is still room for enhancement, but this change simply fix https://bugs.python.org/issue40686 : fix last Python compiler warnings, so we can focus on other warnings which can be real bugs.

I wrote some notes about this issue at https://bugs.python.org/issue40686#msg383175

IMO we should just give us, C language is not designed for check static checks and it's ok to have such limitation.

@pganssle
Copy link
Member

To be clear, I think it was entirely inappropriate of @vstinner to merge this without further discussion.

I consider this a regression in the code.

vstinner added a commit that referenced this pull request Dec 16, 2020
"uint8_t day" is unsigned and so "day < 0" test is always true.
Remove the test to fix the following warnings on Windows:

modules\_zoneinfo.c(1224): warning C4068: unknown pragma
modules\_zoneinfo.c(1225): warning C4068: unknown pragma
modules\_zoneinfo.c(1227): warning C4068: unknown pragma
(cherry picked from commit aefb69b)

Co-authored-by: Victor Stinner <vstinner@python.org>

Co-authored-by: Victor Stinner <vstinner@python.org>
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
"uint8_t day" is unsigned and so "day < 0" test is always true.
Remove the test to fix the following warnings on Windows:

modules\_zoneinfo.c(1224): warning C4068: unknown pragma
modules\_zoneinfo.c(1225): warning C4068: unknown pragma
modules\_zoneinfo.c(1227): warning C4068: unknown pragma
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants