-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
py: Cast mp_obj_t to concrete types explicitly. #1166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ok, catching it while it hot - following comment #1161 (comment) . @dpgeorge , how do you like this? |
I was independently thinking about this in the following context: one may want to box types in a different way to the current implementation. Eg nan-boxing, or, more useful, having 64-bit objects on a 32-bit system, such that a pointer would take up only half the object. The reason for the latter would be to have 63-bit integers on a 32-bit system, and 31-bit integers on a 16-bit system. |
@@ -156,7 +156,7 @@ STATIC mp_obj_t list_subscr(mp_obj_t self_in, mp_obj_t index, mp_obj_t value) { | |||
// delete | |||
#if MICROPY_PY_BUILTINS_SLICE | |||
if (MP_OBJ_IS_TYPE(index, &mp_type_slice)) { | |||
mp_obj_list_t *self = self_in; | |||
mp_obj_list_t *self = MP_OBJ_CAST(self_in); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If MP_OBJ_CAST is something like (self_in+4) and self_in is never referenced again in this function, then the compiler may optimise it such that self reuses the stack/register that self_in previously used. Thus, no pointer to the original self_in exists any more and the object may be GC'd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is bigger issue, as mentioned in #1161. The general direction of handling it should be that GC is not async, but may happen only in specific checkpoints (like when allocations is called). And past those checkpoints, MP_OBJ_CAST should be called again. And yes, at the worst case, we will need to add (configurable) volatile there. But let's wait first till it happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this code example here does exactly a sync allocation between CAST and UNCAST! (list_new calls m_new_obj). There are going to be lots of non-obvious bugs introduced by such a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is null (CAST and UNCAST will be identity for the current codebase of course). This change is intended to allow to experiment with other allocation/gc schemes and identify issues with them, instead of speculating what they may be. It would be nice to have just drop-in patch which at once implements concurrent compacting gc, but I don't think that's realistic (and even if it happens - somebody does it as part of dayjob for example - it may turn out that such patch is not easily mergeable because reshuffles a lot of core code in unpredictable way). I also have a lot of experimental stuff locally, but it's at least clustered. These crumb-like changes are almost impossible to maintain in a branch, so I'd like to have them in mainline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it may turn out that such patch is not easily mergeable because reshuffles a lot of core code in unpredictable way
I've had a thought about how to deal with this and related issues of configurability for a while, and now seems like a good time to mention it: ultimately I would rewrite uPy at a higher level so that it becomes a description of Python rather than an implementation. Then also write a translator to translate this description to C (or any other language). This allows to you describe exactly what you want to do (eg get a pointer for writing, critical section for GC, write barrier for GC, critical section for multiprocessing, global state access, etc etc) and would allow many implementations to be generated, each with their own compromises/features. It would allow to have highly configurable features, retain current implementation, and allow to experiment with lots of different ways of doing things. This is my dream :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a really neat idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is my dream :)
Sounds nice for a dream, but this would be really long way. And all efficient things appear to be hand-crafted. Also, isn't PyPy does just that - a description of language (not necessarily Python) which can be parametrized, and for which efficient interpreter/compiler are generated (taking several gigabytes and dozen of minutes in the process).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To start with such a meta-python doesn't have to be anything more than the current uPy source code which is preprocessed, with the preprocessing doing almost nothing to begin with. Then we add things as needed. Eg we can resolve #704 by translating all external API names in the processor. And we can add a first argument to all functions that is the global state. Sooner or later we need to do these things and a preprocessor is the only way I can think of doing it properly.
Yes, there're definitely can be many uses for it. I had in mind exactly the opposite example of yours - to have 16-bit mp_obj_t for small-heap 32-bit systems. And double pointer indirection movable heap. So, I assume the general scheme is ok and naming is sane too. I then proceed further (not going to convert all in once, but gradually proceed file by file will give warm feeling of progress ;-) ). |
Nice idea, it would save a lot of RAM on MCUs with <64k heap, but then how do you "point" to ROM objects? You would need some flag in the 16-bit mp_obj_t to indicate it's a ROM object, and then you'd need a lot of magic to enumarte the ROM objects...
I was thinking of using Note that we might also need separate macros for ROM versions of ptrs, eg |
No idea so far, would need to experiment ;-).
Let's sync this with what's discussed in #1161. So, we need to pairs of macros - one to work with objects, another - with "raw memory". And they will appear very often, so would be nice to have them shorter rather than longer. I thought about MP_OBJ_CAST/MP_OBJ_UNCAST & MP_RAW_CAST/MP_RAW_UNCAST. |
Coverage increased (+0.0%) to 90.07% when pulling 89221c57d0312742833a4fd88a9d663e638c4d02 on pfalcon:mp-obj-cast into 69922c6 on micropython:master. |
Ok, so this now compiles and represents complete conversion of objlist.c. Completeness was tested by defining mp_obj_t to struct foo*. A typesafety error was found during conversion: 69922c6 Note that these patches assume that memory allocation functions return mp_obj_t, which hopefully very reasonable. But then m_new_obj() and friends should have "prototype" return mp_obj_t, which cannot be done now due to need to include obj.h into misc.h, which is layering violation. So, I propose (as one of next steps) to move those funcs either to obj.h, or to a separate header like alloc.h. That step is also on critical path for #1161 too. I left macro namings per my original proposal and per argument in #1166 (comment) . @dpgeorge , feel free to insist them to be changed, but then please provide naming for "raw memory" accessor counterparts. |
@@ -644,6 +644,16 @@ typedef double mp_float_t; | |||
#define MP_PLAT_FREE_EXEC(ptr, size) m_del(byte, ptr, size) | |||
#endif | |||
|
|||
// Cast mp_obj_t to object pointer | |||
#ifndef MP_OBJ_CAST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these should be here. They should be in obj.h. They are not for a port to configure, but rather for uPy itself to configure based on what object model is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but rather for uPy itself to configure based on what object model is being used.
... which is itself might be configured by port. Of course, there's a long way to GC method being configurable per port, though that can be start.
Other reason I put it to mpconfig.h was that it's wrapped in #ifndef. Are you ok to have them in obj.h wrapped like that?
Coverage increased (+0.0%) to 90.08% when pulling 24e4a64ddd015ac2a83310b49d9e834bcfbb4ff7 on pfalcon:mp-obj-cast into 3cc17c6 on micropython:master. |
mp_obj_t internal representation doesn't have to be a pointer to object, it can be anything. There's also a support for back-conversion in the form of MP_OBJ_UNCAST. This is kind of optimization/status quo preserver to minimize patching the existing code and avoid doing potentially expensive MP_OBJ_CAST over and over. But then one may imagine implementations where MP_OBJ_UNCAST is very expensive. But such implementations are unlikely interesting in practice.
Ok, macros moves to obj.h. Hope the patch is ok for merge now. |
The problem with all this is that it has to also work with ROM objects. Given an "object" (of type mp_obj_t) it could be a small int, a qstr, a handle to RAM, or a handle to ROM. Handles to ROM will pretty much always have to be a direct pointer, due to linker limitations. That means that in general your MP_OBJ_PTR_VALUE macro must distinguish between RAM and ROM and extract the pointer accordingly. And you also will need to have separate constructors for ROM objects, eg: MP_OBJ_NEW_ROM_SMALL_INT, MP_OBJ_NEW_ROM_QSTR and MP_OBJ_NEW_ROM_PTR. To test this out I tried making mp_obj_t a union of a double and a void* on a 32-bit system. And indeed you get lots of nasty error messages, eg, comparing unions, initialising them as ROM data, etc. So at this point it's not clear that the approach we have is general enough for realistic applications, at least those beyond something as simple as (ptr + 4). |
Well, linker limitation is not CPU architecture limitation, and per the latter, "ROM handle" can be as anything as anything else. For example, index in an indirection table. Or all ROM objects could be packed to together, and with 32K heap, both of them can be referenced with 16-bit mp_obj_t. Or whatever.
You didn't explicitly requested that name to be used, so there's MP_OBJ_CAST, not MP_OBJ_PTR_VALUE.
Yes, override macro for mp_obj_t which doesn't allow to reference all address space needs to do that, yes. But there're no such macros so far.
That's out of scope of this patch (and general proposed transformation).
Yes, as was discussed, this patch is just first step on the way which was thought about a year ago, but had no progress. This patch's aim is to get away from flat mess o'pointers to a more structured way, where "object handles" are differentiated from "object pointers" on source code level. The first are stored inside objects, the second are used to actually access objects in memory. This patch allows to handle some scenarios of when handle and pointer have different representation, not all of them. More advanced scenarios will require more patches. But it's pretty clear that any advancement in the area of memory management depends on differentiating handles and pointers, and that's exactly what this patch approaches. |
Ok, hope all these arguments are convincing, merging. Macro naming can always be tweaked later. |
Right, but linker limitation is a real-world limitation. Eg some archs allow
This is probably doable, but would need something like mechanism we're using for qstrs to make it work without help from the linker.
There was never anything decided, it was still an open discussion! And I get confused by your notation: to me MP_OBJ_CAST means
But you need to think about how to deal with this: string and dict/map objects have pointer to raw data, and ROM versions of these objects are created in the code, so how is that going to work?
Agreed. But the ROM issue is just as important as the RAM issue IMHO. |
Well, good news is that now changing that is just matter of blind search and replace. Whereas adding annotation is matter of pretty careful code review and not exactly trivial transformations. A week before this, I spent a lot of time reading GC papers (with emphasis on defragmenting heap). By now I forgot most of the details. If we had such annotations before, I could instead play with some ideas. So, I forgot details of those advanced GCs, but remembered that we need those annotations. But I could forgot that too if not proceed with it (and I wanted to play with gc a year ago, and had bunch of various ideas over that time, most forgotten.
Logic is very simple: macros which deal with mp_obj_t have MP_OBJ_ prefix. Then "forward" operation is certainly converting mp_obj_t to concrete object. That's why it's called "cast". The opposite op is thus called "uncast". All that governed by requirement of being short. But well, looking from outside, you certainly see some things clearer, so if you think longer, but more descriptive names would be better, ok, let's go for them. |
Yes. but it's separate and orthogonal issues. As I said at the beginning, I don't look to provide a complete solution, I look towards making few steps in the right direction, with hundreds more required for some "complete solution". |
Ok! I'll also see what it takes to get a full solution. My aim is for 64-bit objects on 32-bit system. |
Reorder status matrix and module list in shared-bindings for increase…
mp_obj_t internal representation doesn't have to be a pointer to object,
it can be anything.
There's also a support for back-conversion in the form of MP_OBJ_UNCAST.
This is kind of optimization/status quo preserver to minimize patching the
existing code and avoid doing potentially expensive MP_OBJ_CAST over and
over. But then one may imagine implementations where MP_OBJ_UNCAST is very
expensive. But such implementations are unlikely interesting in practice.