Skip to content

Tracker: protect macros expansions using do { ... } while (0) constructions or via static inline equivalents when possible #124044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
picnixz opened this issue Sep 13, 2024 · 8 comments
Assignees
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@picnixz
Copy link
Member

picnixz commented Sep 13, 2024

Fortifying Macros Expansion

The Problem

In C, using a macro allows to reduce code duplication. However, depending on the macro's body, the moment they are expanded by the C preprocessor, the obtained C code may not be syntactally correct (or could have an unexpected execution flow).

In #123842, I saw that there are many files that have macros that could wrongly expand at compile time depending on how they are being used. Most of the time it's harmless but we never know how contributors use the macros themselves (they may not be aware of PEP-7 and could end up with dangling else in some cases). I will continue working on that draft PR but it's really hard to find a good regex for finding the macros to protect :') (well, I could do something with some analyzer but I'm too lazy).

Since I'm tired of creating separate issues for each module and macro to protect, I will use this single issue to organize and track the progress.

Important

We only touch source files (.c). We do NOT change the macros in header files (.h). Those files can be included by anyone and the macros could be used by some downstream projects. Since we do not know how they actually use them, we don't want to break them (and there is no way we can notify them of the change except via commits). We will have dedicated issues to fix those macros as a wider discussion could be needed, but this specific issue only focuses on source files.

The Good News

Good news is that we have not one but two approaches for fixing this.

Solution 1: converting the macro into a static inline function

When possible, prefer converting macros into static inline functions. Similar work has been achieved in PEP-670 for the C API and the overhead is not that much. Smart compilers would anyway inline the function, making it (almost) assembly-equivalent to a macro.

Not all macros can be converted into static inline functions. Macros that simply compute, say the addition of two double as ADD_DOUBLE(x, y) (x) + (y) should be rewritten as

static inline double
add_double(double x, double y)
{ 
    return x + y;
}

There are various pitfalls when using macros (those pitfalls are explained in detail in PEP-670) so converting macros into static inline functions should be preferred. On the other hand, macros that have a return or a goto instruction cannot be easily converted into static inline functions.

Important

I'd advise to carefully read PEP-670 to see when to convert a macro into a static inline function.

Note

When converting macros into static inline functions, the case of the name can be changed. Most macros have uppercase names, but functions are lowercased. I'd suggest having a different commit for the rename so that it could easily be reverted if needed.

Solution 2: using do { ... } while (0) constructions

In addition to the arguments exposed in https://www.quora.com/What-is-the-purpose-of-using-do-while-0-in-macros, the reason why we use do { <body> } while (0) is that the compiler is smart enough to optimize it as { <body> } but after having expanded the macro (we can neither use #define MACRO(...) if (1) { <body> } nor #define MACRO(...) { <body> } because we could create dangling else if we were to use MACRO(); instead of MACRO()). Now, I will lay down the convention I took for solving those issues.

When possible, apply PEP 7, namely, align line continuations and put do on a separate column. For instance, we change:

#define MACRO(...) \
    <FIRST LINE> \
    <SECOND LINE> \
    <LAST LINE>
// OR
#define MACRO(...) { \
    <FIRST LINE> \
    <SECOND LINE> \
    <LAST LINE> \
}
// OR
#define MACRO(...) \
{ \
    <FIRST LINE> \
    <SECOND LINE> \
    <LAST LINE> \
}

into

#define MACRO(...)      \
    do {                \
        <FIRST LINE>    \
        <SECOND LINE>   \
        <LAST LINE>     \
    } while (0)

By convention, I put do on a separate line and indent it so that we don't have } while (0) stick to left margin. If other macros in the file are already using the do { ... } while (0) construction, either I keep the same style (which is probably preferred) or I change them if they are in minority (but using a separate commit so that I can easily revert it if a core-dev does not want this kind of cosmetic change).1

Tip

To align line continuations easily, take the longest line after adding the do { ... } while (0) construction and press TAB. Most of the time, this will align the \ on the next multiple of 4. Then either enter vertical mode to edit the line continuations or put tabs before every other \ until getting aligned line continuations.

I personally avoid adding a semicolon after while (0). One reason is that it convenes the intention for the usage to behave like a C statement. This is also in line with what PEP 7 suggests (but does not enforce). We could add it but then both MACRO(...) and MACRO(...); could be found in the code and I think it's better to have something uniform (ideally MACRO(...);).

Note

The modules that will be touched may be very old and they could have their own coding style. I don't have a strong opinion on whether we should PEP-7-ize the macro body if the rest of the code is not PEP-7 compliant, but I think that aligning the line continuations should be preferred because it makes the macro easier to read.

Important

Avoid touching other places of the code. If we believe there is a bug in the macro, we open a separate issue. In addition, each fix should only focus on a single file (or a single macro that is duplicated across modules). Ideally, we should only make PRs one by one because this could create conflicts on other branches and our work usually has a lower priority. For that reason, we also don't backports the changes.

Linked PRs

Footnotes

  1. Consistency beats purity but sometimes consistency could be forced (if there are 50 macros for which we add do { ... } while (0) but there is only 1 macro that is putting the do at the same level as the #define, we simply change that single macro).

@ZeroIntensity
Copy link
Member

We do NOT change the macros in header files (.h). Those files can be included by anyone and the macros could be used by some downstream projects.

Wouldn't we want to make sure that macros are safe for downstream users? I can't think of a case where switching to do { ... } while (0) would break use of macros downstream (in fact, some users might be broken by not have that there already).

@picnixz
Copy link
Member Author

picnixz commented Sep 13, 2024

Wouldn't we want to make sure that macros are safe for downstream users?

I don't know if people are using it wrongly or not and I don't want (yet) to break anything.

I can't think of a case where switching to do { ... } while (0) would break use of macros downstream

One possibility is if they actually patch the macro in such a way that the construction wouldn't be syntactically correct anymore.

(in fact, some users might be broken by not have that there already).

You're right but changing on our side may not necessarily help them more. If possible, I want to address each separate macro in .h using a separate issue so that we can discuss and make reporting downstream issues more easily. Also, I'm not even sure that there are macros that actually need this kind of protection in .h files (I didn't really search for them but I think it's quite rare to have macros in the (exposed) header files, except those in py_macro.h (or something like that)).

@vstinner
Copy link
Member

@erlend-aasland and me converted a bunch of macros to static inline functions in PEP 670. For some macros, it means that they cannot be used in expressions anymore, only as statements. For example, PyList_SET_ITEM() has no value anymore (the return type of the static inline function is void).

See also rejected PEP 674.

@vstinner
Copy link
Member

Ah, see also _Py_RVALUE(expr) which prevents using an expression as an l-value.

@picnixz
Copy link
Member Author

picnixz commented Sep 13, 2024

Ah, then I'll add a paragraph tomorrow saying that, if possible, macros should be converted into static inline functions.

@picnixz picnixz changed the title Tracker: protect macros expansions using do { ... } while (0) constructions Tracker: protect macros expansions using do { ... } while (0) constructions or via static inline equivalents Sep 13, 2024
@vstinner
Copy link
Member

Not all macros can be converted. Like macros containing return or goto.

@picnixz picnixz changed the title Tracker: protect macros expansions using do { ... } while (0) constructions or via static inline equivalents Tracker: protect macros expansions using do { ... } while (0) constructions or via static inline equivalents when possible Sep 14, 2024
@picnixz
Copy link
Member Author

picnixz commented Sep 14, 2024

I added the clarification in the title. I'll modify the body's issue now.

@picnixz picnixz removed the easy label Sep 14, 2024
@picnixz
Copy link
Member Author

picnixz commented Mar 16, 2025

I think I'm going to close this one. No need to keep it opened as it's mostly cosmetics. However, if I encounter a case where it could cause an issue, I'll reuse this issue number.

@picnixz picnixz closed this as completed Mar 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants