ctypes: clearly document how structure bit fields are allocated #57089

Closed
meadori opened this issue Sep 2, 2011 · 11 comments
Labels
docs Documentation in the Doc dir topic-ctypes type-feature A feature request or enhancement

Comments

@meadori
Member

meadori commented Sep 2, 2011

BPO 12880
Nosy @terryjreedy, @meadori
Files
  • bitfield_doc.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2011-09-02.02:34:11.223>
    labels = ['ctypes', 'type-feature', 'docs']
    title = 'ctypes: clearly document how structure bit fields are allocated'
    updated_at = <Date 2011-10-04.23:51:15.205>
    user = 'https://github.com/meadori'

    bugs.python.org fields:

    activity = <Date 2011-10-04.23:51:15.205>
    actor = 'terry.reedy'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation', 'ctypes']
    creation = <Date 2011-09-02.02:34:11.223>
    creator = 'meador.inge'
    dependencies = []
    files = ['23275']
    hgrepos = []
    issue_num = 12880
    keywords = ['patch']
    message_count = 10.0
    messages = ['143369', '143846', '144696', '144854', '144855', '144894', '144911', '144913', '144919', '144932']
    nosy_count = 4.0
    nosy_names = ['terry.reedy', 'meador.inge', 'docs@python', 'vladris']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12880'
    versions = ['Python 3.3']

    @meadori
    Member Author

    meadori commented Sep 2, 2011

    As issues like bpo-6069 and bpo-11920 show, it is not clear how 'ctypes' allocates bit fields. The documentation should describe this in more detail. As an example, Microsoft documents the VC++ implementation in a reasonably clear manner ( http://msdn.microsoft.com/en-us/library/ewwyfdbe(v=vs.71).aspx ).

    @meadori meadori added docs Documentation in the Doc dir type-feature A feature request or enhancement labels Sep 2, 2011
    @meadori
    Member Author

    meadori commented Sep 11, 2011

    Another example of desperately needed documentation: bpo-12945.

    @vladris
    Mannequin

    vladris mannequin commented Sep 30, 2011

    Attached doc update against tip (though I still hope my patch for configurable allocation strategies will make it in).

    This is my first doc patch so let me know if I missed something. I am basically explaining that bit field allocation is compiler-specific and no assumptions should be made about how a bitfield is allocated.

    I believe this is the better thing to do rather than detailing how GCC and MSVC allocated their bitfields because that would just encourage people to use this feature incorrectly. Most bugs opened on bit fields are because people are toying with the underlying buffer and get other results than what they expect. IMO, when using bitfields one should only access the structure members at a high level and not go read/write the raw memory underneath.
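
A minimal sketch of the by-name access recommended above, assuming a hypothetical `Flags` structure that is not part of the attached patch. Reading the members back by name gives the same values everywhere; the raw bytes are printed only to show the part that should not be relied on.

```python
import ctypes


class Flags(ctypes.Structure):
    # Hypothetical structure for illustration: three bit fields in one c_uint.
    _fields_ = [
        ("ready", ctypes.c_uint, 1),
        ("mode", ctypes.c_uint, 3),
        ("count", ctypes.c_uint, 12),
    ]


flags = Flags()
flags.ready = 1
flags.mode = 5
flags.count = 1000

# Portable: members are read back by name.
print(flags.ready, flags.mode, flags.count)  # prints: 1 5 1000

# Not portable: the byte pattern depends on platform, endianness and layout.
print(bytes(flags).hex())
```

Code that only uses the first form does not need to know where the bits live in memory.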

    @meadori
    Member Author

    meadori commented Oct 4, 2011

    On Fri, Sep 30, 2011 at 12:19 PM, Vlad Riscutia <report@bugs.python.org> wrote:

    > I believe this is the better thing to do rather than detailing how GCC and MSVC allocated their bitfields because that would just
    > encourage people to use this feature incorrectly.

    So clearly documenting how a feature works will cause people to use
    the feature incorrectly? I think not. In any case, I agree that
    documenting the low-level specifics of each compiler's algorithm is too much.

    > Most bugs opened on bit fields are because people are toying with the underlying buffer and get other results than what they expect.

    The issues that I have looked at (bpo-6069, bpo-11920, and
    bpo-11920) all involve fundamental misunderstandings of *how* the
    structure layout is determined. I don't know if I would generalize
    these misunderstandings as "toying with the underlying buffer". Sometimes
    people need to know the exact layout for proper C interop. In
    some of the bugs reported folks are casting buffers in an attempt
    to discover the structure layout since it is not clearly documented.

    The general content of your patch seems reasonable. I will provide
    more specific comments shortly.

    @meadori
    Member Author

    meadori commented Oct 4, 2011

    Added some comments in rietveld. P.S. watch out for trailing whitespace
    when writing patches. Use 'make patchcheck' to help find bad whitespace
    formatting.

    @vladris
    Mannequin

    vladris mannequin commented Oct 4, 2011

    Thanks for the "make patchcheck" tip, I didn't know about that. I will update the patch soon.

    In the meantime, I want to point out a couple of things:
    First, I'm saying "toying with the underlying buffer" because none of the bugs are actual issues of the form "I created this bitfield structure with Python, passed it to C function but C structure was different". That would be a bitfield bug. All of these bugs are people setting raw memory to some bytes, then looking at bitfield members and not seeing what they expect.

    Since this is platform dependent, they shouldn't worry about the raw memory as long as C interop works fine. Bitfield layout is complex, as it involves both the allocation algorithm and structure packing, and the same Python code will work differently on Windows and Unix.

    My point is that documenting all this low-level stuff will encourage people to work with the raw memory which will open the door for other issues. I believe it would be better to encourage users to stick to declaring members and accessing them by name as raw memory WILL be different for the same code on different OSes.

    Second, one of your review comments is: "GCC is used for most Unix systems and Microsoft VC++ is used on Windows.". This is not how ctypes works. Ctypes implements the bitfield allocation algorithm itself, it doesn't use the compiler with which it is built. Basically it says #ifdef WIN32 - allocate like VC++ - #else - allocate like GCC. So it doesn't really matter with which compiler you are building Python. It will still do GCC style allocation on Solaris.
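
The platform switch described here can be observed with a structure whose bit fields have different base types. This is only a sketch: `Mixed` is a hypothetical example, and the comments reflect my understanding that VC++-style allocation starts a new storage unit when the base type size changes, while GCC-style allocation may let both fields share one. No specific size is assumed.

```python
import ctypes
import sys


class Mixed(ctypes.Structure):
    # Hypothetical structure: two bit fields with different base types.
    _fields_ = [
        ("a", ctypes.c_uint8, 4),
        ("b", ctypes.c_uint32, 4),
    ]


# ctypes picks the allocation style from the platform, not from the
# compiler that a given C library happened to be built with.
expected_style = "VC++-style" if sys.platform == "win32" else "GCC-style"
print(expected_style, "allocation; sizeof(Mixed) =", ctypes.sizeof(Mixed))

# Access by name behaves the same under either style.
m = Mixed(a=9, b=6)
print(m.a, m.b)  # prints: 9 6
```

Comparing the printed size with `sizeof` of the equivalent C struct, built with the compiler actually used for interop, is one way to check that the two layouts agree.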

    @meadori
    Member Author

    meadori commented Oct 4, 2011

    On Tue, Oct 4, 2011 at 10:21 AM, Vlad Riscutia <report@bugs.python.org> wrote:

    > First, I'm saying "toying with the underlying buffer" because none of the bugs are actual issues of the form "I created this bitfield
    > structure with Python, passed it to C function but C structure was different". That would be a bitfield bug. All of these bugs are people
    > setting raw memory to some bytes, then looking at bitfield members and not seeing what they expect.

    Please qualify "all" instead of generalizing. I can point to two
    issues (bpo-11990 "I'm generating python code from real c code.",
    bpo-12945 "We have raw data packages from some tools. These packages
    contains bitfields, arrays, simple data and so on.") where C
    code or raw data was, in fact, involved and the reporters just don't
    understand what layout algorithm is being used. They may not need
    to know the specifics of the algorithm, but they *do* need to know if
    it matches the compiler they are using to do interop or the one
    that generated the raw data.

    The reason that we are seeing folks cast raw memory into a ctypes
    bitfield structure is because they do not understand how the structure
    layout algorithm works and are trying to figure it out via these
    examples.
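
As an alternative to casting raw memory, the layout that ctypes chose can be read from the field descriptors. The sketch below assumes a hypothetical `Packet` structure and a hypothetical helper `describe`. It also assumes that, for bit fields, the descriptor's `size` attribute packs the bit width into the high 16 bits and a bit shift into the low 16 bits; that matches the CPython implementation as far as I know, but treat the decoding as an assumption.

```python
import ctypes


class Packet(ctypes.Structure):
    # Hypothetical structure used only to demonstrate layout inspection.
    _fields_ = [
        ("kind", ctypes.c_uint8, 4),
        ("flags", ctypes.c_uint8, 4),
        ("length", ctypes.c_uint16, 10),
    ]


def describe(struct_type):
    """Return (name, byte_offset, bit_shift, bit_width) for each field."""
    rows = []
    for name, *_ in struct_type._fields_:
        field = getattr(struct_type, name)  # the field descriptor
        bit_width = field.size >> 16        # 0 for ordinary (non-bit) fields
        bit_shift = field.size & 0xFFFF if bit_width else 0
        rows.append((name, field.offset, bit_shift, bit_width))
    return rows


for row in describe(Packet):
    print("%-6s byte offset=%d bit shift=%d bits=%d" % row)
print("sizeof(Packet) =", ctypes.sizeof(Packet))
```

The byte offsets and shifts are deliberately not listed here, since they are the platform-dependent part; only the bit widths are fixed by the declaration.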

    > Second, one of your review comments is: "GCC is used for most Unix systems and Microsoft VC++ is used on Windows.". This is not
    > how ctypes works. Ctypes implements the bitfield allocation algorithm itself, it doesn't use the compiler with which it is built. Basically
    > it says #ifdef WIN32 - allocate like VC++ - #else - allocate like GCC. So it doesn't really matter with which compiler you are building
    > Python. It will still do GCC style allocation on Solaris.

    I understand how it works. This quote is taken somewhat out of
    context as the preceding sentence is important. Perhaps saying GCC-
    style and VC++-style would have been more clear. The reason that I
    mentioned the compiler used to build Python is that it is an easy
    reference point and more times than not the bitfield allocation and
    layout *do* match that of the compiler used to build the interpreter.
    Anyway, I am fine with dropping the "used to build the Python
    interpreter" and going with something similar to what you originally
    had.

    Also, in general, the compiler used to build the ctypes extension
    *does* matter. Look in 'cfield.c' where all of the native alignments
    are computed at compile time. These alignments affect the structure
    layout and are defined by the compiler building the ctypes extension.

    @meadori
    Member Author

    meadori commented Oct 4, 2011

    > Look in 'cfield.c' where all of the native alignments

    Well, not *all* the native alignments, but many of them.

    @vladris
    Mannequin

    vladris mannequin commented Oct 4, 2011

    I agree compiler matters for alignment but if you look at PyCField_FromDesc, you will see the layout is pretty much #ifdef MS_WIN32 - #else.

    Sorry for generalizing, "all" indeed is not the right word. My point is that we should set expectations correctly - VC++-style on Windows, GCC-style everywhere else - and encourage users to access structure members by name, not raw memory. Issues opened for bitfields *usually* are of the form I mentioned - setting raw memory to some bytes then seeing that members are not what the user expected, even if the ctypes algorithm works correctly.

    As I said, I will revise the patch and maybe make it more clear that users should look up how bitfield allocation works for their compiler instead of trying to understand this via structure raw memory.

    @terryjreedy
    Member

    If I understand correctly, this doc patch would apply to 2.7 and 3.2 also. I have two style comments. I believe

    "It is important to note that bit field allocation and layout in memory is not defined as a standard, rather its implementation is compiler-specific."

    could be shortened to

    "Bit field allocation and memory layout is compiler-specific."

    To me, this leads nicely into the proposed sentence that follows.

    "it is recommended that no assumptions are made about the structure size and layout."

    I do not like 'it is recommended'. Let us state the fact.

    "any assumptions about the structure size and layout may be wrong."

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    encukou added a commit to encukou/cpython that referenced this issue May 17, 2025
    Co-Authored-By: Meador Inge <meadori@gmail.com>
    @encukou
    Member

    encukou commented May 17, 2025

    You can now change the behavior with _layout_, and its docs talk about the default.
    I adjusted the patch to point the reader there: #134148
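
A sketch of selecting a layout explicitly with `_layout_`. It assumes a Python version that supports the attribute (3.14 or later, as far as I know) and the values `"ms"` and `"gcc-sysv"` from its documentation. On older versions the attribute is expected to be ignored, in which case both classes end up with the platform default. The class names are hypothetical.

```python
import ctypes

# Two bit fields with different base types, where the two styles may differ.
FIELDS = [
    ("a", ctypes.c_uint8, 4),
    ("b", ctypes.c_uint32, 4),
]


class MixedMS(ctypes.Structure):
    _layout_ = "ms"        # assumption: MSVC-compatible layout
    _fields_ = FIELDS


class MixedGCC(ctypes.Structure):
    _layout_ = "gcc-sysv"  # assumption: GCC (System V) compatible layout
    _fields_ = FIELDS


print("ms:      ", ctypes.sizeof(MixedMS))
print("gcc-sysv:", ctypes.sizeof(MixedGCC))

# By-name access is unaffected by the layout choice.
s = MixedMS(a=3, b=7)
print(s.a, s.b)  # prints: 3 7
```

Stating the layout explicitly documents which compiler's rules the structure is meant to match, rather than leaving that to the platform default.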

    encukou added a commit that referenced this issue Jun 6, 2025
    Co-authored-by: Meador Inge <meadori@gmail.com>
    Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jun 6, 2025
    (cherry picked from commit b22b964)
    
    Co-authored-by: Petr Viktorin <encukou@gmail.com>
    Co-authored-by: Meador Inge <meadori@gmail.com>
    Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
    @encukou encukou closed this as completed Jun 6, 2025
    encukou added a commit that referenced this issue Jun 6, 2025
    …35216)
    
    (cherry picked from commit b22b964)
    
    Co-authored-by: Petr Viktorin <encukou@gmail.com>
    Co-authored-by: Meador Inge <meadori@gmail.com>
    Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>