-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
py/runtime: Avoid crash on calling members of uninitialized native type. #9997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3cd1ccb
to
3543d9d
Compare
2333780
to
119f3b7
Compare
Updated now with a variant that does not fail the tests. I've also added a test to reproduce this (relies on But this variant only checks for this crash for lookups in locals_dict. It still needs some generalization to get this right for other lookups like attr, but without repeating the same check multiple times. |
Now that the timer test has been merged, the bug in question can be reproduced as follows: from cexample import Timer
class BadTimer(Timer):
def __init__(self):
print(self.time()) # Reads from arbitrary memory. More complex funcs will segfault.
super().__init__(self)
watch = BadTimer() I'm basically looking for the right (and minimal) place to ensure we don't attempt any bad lookups in the |
This adds a separate AdvancedTimer class that demonstrates a few more advanced concepts usch as custom handlers for printing and attributes. Signed-off-by: Laurens Valk <laurens@pybricks.com>
119f3b7
to
005dcec
Compare
Code size report:
|
I think this is ready for review now. I have expanded the native class example with a few more features. This is useful for the example in itself, but also helps build the test case for this PR. This ensures we can also test that calls to A few considerations:
|
When subclassing a native type, calling native members in `__init__` before `super().__init__()` has been called could cause a crash. In this situation, `self` in `mp_convert_member_lookup` is the `native_base_init_wrapper_obj`. This check ensures that an AttributeError is raised before this happens, which is consistent with other failed lookups. Also fix a typo in a related comment. Signed-off-by: Laurens Valk <laurens@pybricks.com>
This adds a CPython diff that explains why calling super().__init__() is required in MicroPython when subclassing a native type (because __new__ and __init__ are not separate functions). Signed-off-by: David Lechner <david@pybricks.com>
005dcec
to
6566cff
Compare
Code size report:
|
#if MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG | ||
if (obj_obj == MP_OBJ_FROM_PTR(&native_base_init_wrapper_obj)) { | ||
// But we shouldn't attempt lookups on object that is not yet instantiated. | ||
mp_raise_msg(&mp_type_AttributeError, MP_ERROR_TEXT("call super().__init__() first")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be possible to just fix it and make it work like CPython (as much as that's possible without having both __new__
and __init__
in MicroPython) by doing this:
if (obj_obj == ...) {
native_base_init_wrapper(0, NULL);
obj_obj = obj->subobj[0];
}
That will call __init__
automatically and reload the newly-created native instance.
Then I wouldn't wrap it in MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG
, just do it unconditionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(if you do this, please don't remove the C example module or tests, they are good!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that segfault (null dereference)? I started digging based on your suggestion. Perhaps did you mean something like this? This seems to work, with no AttributeError required, so closer to CPython indeed.
diff --git a/py/objtype.c b/py/objtype.c
index a202b023b..f5100e239 100644
--- a/py/objtype.c
+++ b/py/objtype.c
@@ -170,12 +170,12 @@ STATIC void mp_obj_class_lookup(struct class_lookup_data *lookup, const mp_obj_t
if (obj != NULL && mp_obj_is_native_type(type) && type != &mp_type_object /* object is not a real type */) {
// If we're dealing with native base class, then it applies to native sub-object
obj_obj = obj->subobj[0];
- #if MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG
if (obj_obj == MP_OBJ_FROM_PTR(&native_base_init_wrapper_obj)) {
- // But we shouldn't attempt lookups on object that is not yet instantiated.
- mp_raise_msg(&mp_type_AttributeError, MP_ERROR_TEXT("call super().__init__() first"));
+ // Attempted lookup on uninstantiated native object, so instantiate now.
+ mp_obj_t args[] = {obj};
+ native_base_init_wrapper(1, args);
+ obj_obj = obj->subobj[0];
}
- #endif // MICROPY_BUILTIN_METHOD_CHECK_SELF_ARG
} else {
obj_obj = MP_OBJ_FROM_PTR(obj);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's right!
(We had a fairly long discussion around this today and wrapped up the conclusion a bit quickly :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the instant feedback. While updating the tests, I noticed this would cause make_new
to be called twice:
class SubTimer(AdvancedTimer):
def __init__(self):
# At this point, self does not yet represent a AdvancedTimer instance.
print(self) # <function>
print(self.time()) # <--- with proposed variant, this will trigger make_new instead of raising AttributeError
# Initialize base class.
super().__init__(self) # <---- But then this calls make_new again
watch = SubTimer()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behavior is still a bit odd (and different from CPython). Take this slightly more elaborate class, which takes a parameter to __init__
:
from pybricks.parameters import Port
from pybricks.iodevices import PUPDevice
class NewSensor(PUPDevice):
def __init__(self, port):
# This would now try to call make_new under the hood, but it expects
# an argument, so raises TypeError: 'port' argument required
print(self.info())
# Never gets here.
super().__init__(port)
sensor = NewSensor(Port.A)
I suppose I would OK with that (anything that doesn't segfault is fine), but I just wanted to make sure. We can add an updated CPython difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed exactly this scenario... It makes sense to me that this should fail, but I agree that the error message is confusing because it looks like info()
is missing the port
argument.
What's the CPython behaviour that you'd expect here? (I'm struggling to think of an example where this could come up with a built-in type. It is different to the behaviour of a non-native base of course)
Note there's another CPython diff here which is that if you don't explicitly call super().__init__
and you have a native base, then by default we pass the derived type's __init__
args to the native base. (CPython only does this when the base is tuple
and frozenset
, perhaps others).
class MyList(list):
def __init__(self, a):
pass
class MyTuple(tuple):
def __init__(self, a):
pass
print(MyList([1,2,3])) # CPython: []. MicroPython: [1,2,3]
print(MyTuple([1,2,3])) # Both: (1,2,3)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think MiroPython's make_new
is really the equivalent of __new__
without __init__
in CPython. This is why immutable types like tuple
have the different behavior as above - since they are immutable, all of the "init" has to be done in __new__
. So I don't think we should be calling make_new
more than once, otherwise it forces all implementations to have to be aware of which call it is (if first call treat as __new__
, otherwise treat as __init__
) or risk leaking resources or other unexpected behaviors (e.g. if make_new
opens a file descriptor, then is called again, that file descriptor has to be closed, otherwise it is leaked with no way to close it later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we implement something like this?
- If
self
is accessed in an__init__
overload of a native type andsuper.__init__()
was not called yet, then callmake_new
with the args of the overload. - If
super().__init__()
is called aftermake_new
was called because of the condition above, raise an exception. - Otherwise, if
super.__init__()
is called before something triggered the call tomake_new
, then callmake_new
with the args fromsuper.__init__()
. - Finally, if nothing triggered
make_new
before the__init__
overload returns, callmake_new
with the args from the overload.
The recommendation in the MicroPython docs should be to call super().__init__()
as the first line of any __init__
overload of a native type (or at least before using self
). But at least in other cases, it might work as expected or at least not segfault.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That proposal just above does seem to cover all points raised here.
But it makes things complicated: implementation, documentation, behaviour, and what the user needs to learn and understand.
I think the simplest thing is to just prevent the crash if trying to access super before its initialised, and the user can initialise the native type it early if needed. That's much easier to understand.
(Also I don't even know how we would implement the first bullet point above, the part where the native make_new was called with the args of the overload, because those args are not available at that point in the runtime.)
This is an automated heads-up that we've just merged a Pull Request See #13763 A search suggests this PR might apply the STATIC macro to some C code. If it Although this is an automated message, feel free to @-reply to me directly if |
In the interest of making progress and protecting against the crash that still exists, I will merge this PR. It's a simple fix for the problem. We can revisit it with a better solution in the future if needed. |
Rebased, fixed STATIC->static renaming, and merged in 19b1333 through d1bf0ee Thanks @laurensvalk and @dlech for your efforts on this. |
Thank you! |
…3_lcd_1_44_fix bootloop fix for spotpear esp32c3 1.44 lcd board
When subclassing a native type, calling native members in
__init__
beforesuper().__init__()
has been called could cause a crash. In this situation,self
inmp_convert_member_lookup
is thenative_base_init_wrapper_obj
. This check ensures that a TypeError is raised when this happens.Also fix a typo in a related comment.
This is an attempt at fixing the following crash in Pybricks MicroPython
EDIT: This isn't quite the right solution as seen in the tests, so I'll have to have another look.
This variant appears to work without breaking the other tests, but still needs some cleaning up: