Any concerns with using variable-size type structures? #188

Closed
pfalcon opened this issue Jan 17, 2014 · 12 comments
Labels
rfc Request for Comment

Comments

@pfalcon
Contributor

pfalcon commented Jan 17, 2014

As we progress with functionality, cases come up where you'd want to store more data/methods for particular classes. It makes no sense to try to stuff everything into mp_obj_type_t. The natural way, then, is to "subclass" mp_obj_type_t, which in C translates to embedding mp_obj_type_t into another structure and adding members to the latter.

Any concerns with such an approach?

@dpgeorge
Member

You mean add methods/entries to mp_obj_type_t?

I agree this struct should be kept minimal. We should use sub-structs for groups of methods that are related (eg file ops, arith ops).

I don't think though that we need to add that much. At some point this struct will freeze and we shouldn't need to add things to it.

Subclassing is okay, but what exactly was the use case you were thinking of? Couldn't we instead just put the relevant methods in the methods entry? I know it's a bit slower than having a dedicated member in the struct, but if it's for little-used functions, why not?

@pfalcon
Contributor Author

pfalcon commented Jan 18, 2014

what exactly was the use case you were thinking of?

I have 2 use cases in mind: collections.namedtuple and a pin protocol. For namedtuple, it's not even a method, it's an array of field names. So, I was thinking about:

struct namedtuple_obj_type_t {
    mp_obj_type_t base;
    char ** fields;
};
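
To illustrate how field names stored on the type would be shared by every instance, here is a minimal sketch, not code from the repository. The cut-down mp_obj_type_t, the n_fields member, the int items and the helper namedtuple_find_field are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the real mp_obj_type_t: only a name, for illustration. */
typedef struct {
    const char *name;
} mp_obj_type_t;

/* "Subclassed" type: the base comes first, so a pointer to the base of
 * such a struct can be converted back to a pointer to the whole struct. */
typedef struct {
    mp_obj_type_t base;
    size_t n_fields;        /* assumption: count kept next to the names */
    const char **fields;
} namedtuple_obj_type_t;

/* Instance: a type pointer plus the values, no per-object field names. */
typedef struct {
    const mp_obj_type_t *type;
    const int *items;       /* assumption: plain ints instead of mp_obj_t */
} namedtuple_obj_t;

/* Hypothetical helper: find a field index by name via the type.
 * Returns -1 if the name is unknown. */
static int namedtuple_find_field(const namedtuple_obj_t *o, const char *name) {
    assert(o != NULL && name != NULL);
    /* Valid only because o->type is known to point into a namedtuple type. */
    const namedtuple_obj_type_t *t = (const namedtuple_obj_type_t *)o->type;
    for (size_t i = 0; i < t->n_fields; i++) {
        if (strcmp(t->fields[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

Each instance then costs one type pointer plus its items, however many fields are named.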

For pin objects, it would be cute to define a C-level interface, which would allow a no-nonsense bitbang protocol implementation. Having pin methods as Python methods won't allow any (good) speed guarantees. And the corresponding methods are too ad hoc to put into the mp_obj_type_t base.

struct pin_obj_type_t {
    mp_obj_type_t base;
    void (*low)();
    void (*high)();
    bool (*get)();
};

@dpgeorge
Member

Couldn't this be done in the usual way by just creating a custom object, instead of a custom type?

For example, mp_obj_tuple_t currently exposes some of its functionality via C functions. If you know you've got a tuple, then you can call these special functions directly. Could such a simple mechanism be used for your cases?

@pfalcon
Contributor Author

pfalcon commented Jan 18, 2014

Couldn't this be done in the usual way by just creating a custom object, instead of a custom type?

If you mean storing extra data inside objects instead of their types, yep, we could waste some bytes in each object by storing the same data over and over again ;-). But the whole idea of namedtuple is that it takes the minimal possible storage (less than a dict, for example), so one can create millions of named tuples (in particular, I think of them as backing storage for class slots - yes, we can have that too! ;-) ).

For example, mp_obj_tuple_t currently exposes some of its functionality via C functions. If you know you've got a tuple, then you can call these special functions directly. Could such a simple mechanism be used for your cases?

Depends on what these functions do. If they call virtual methods like those shown above, then yes, and we need a way to store such virtual methods ;-). If they don't, then nope, they don't cover everyone's use cases, because they don't offer polymorphism. It may seem, for example, that for a particular MCU there can be only one type of pin (after all, there's only one MCU in there). But as soon as some GPIO extender is added, that idea crumbles. So, to support arbitrary object implementations, polymorphism is required, and virtual methods are the most efficient known way to implement it.
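
To make the polymorphism argument concrete, here is a minimal sketch under stated assumptions, not an actual implementation. The cut-down mp_obj_type_t, the pin_obj_base_t header, the self argument on each method, the two pin kinds (mcu_pin_t, ext_pin_t) and the helpers pin_write/pin_read are all hypothetical; the port register and the extender's shadow byte are simulated in plain memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real mp_obj_type_t: only a name, for illustration. */
typedef struct {
    const char *name;
} mp_obj_type_t;

/* Pin type with C-level virtual methods. The explicit self argument is
 * an assumption added for this sketch. */
typedef struct {
    mp_obj_type_t base;
    void (*low)(void *self);
    void (*high)(void *self);
    bool (*get)(void *self);
} pin_obj_type_t;

/* Hypothetical common header: every pin object starts with its type. */
typedef struct {
    const pin_obj_type_t *type;
} pin_obj_base_t;

/* Pin on the MCU itself: one bit in a (simulated) port register. */
typedef struct {
    pin_obj_base_t base;
    uint32_t *port;
    uint32_t mask;
} mcu_pin_t;

static void mcu_low(void *self) { mcu_pin_t *p = self; *p->port &= ~p->mask; }
static void mcu_high(void *self) { mcu_pin_t *p = self; *p->port |= p->mask; }
static bool mcu_get(void *self) { mcu_pin_t *p = self; return (*p->port & p->mask) != 0; }

static const pin_obj_type_t mcu_pin_type = {{"McuPin"}, mcu_low, mcu_high, mcu_get};

/* Pin on a GPIO extender: one bit in a shadow byte that a real driver
 * would push out over a bus. */
typedef struct {
    pin_obj_base_t base;
    uint8_t *shadow;
    uint8_t bit;
} ext_pin_t;

static void ext_low(void *self) { ext_pin_t *p = self; *p->shadow &= (uint8_t)~(1u << p->bit); }
static void ext_high(void *self) { ext_pin_t *p = self; *p->shadow |= (uint8_t)(1u << p->bit); }
static bool ext_get(void *self) { ext_pin_t *p = self; return ((*p->shadow >> p->bit) & 1u) != 0; }

static const pin_obj_type_t ext_pin_type = {{"ExtPin"}, ext_low, ext_high, ext_get};

/* Generic helpers: no knowledge of the concrete pin kind, just one
 * indirect call through the function pointers stored in the type. */
static void pin_write(void *pin, bool value) {
    assert(pin != NULL);
    const pin_obj_type_t *t = ((pin_obj_base_t *)pin)->type;
    if (value) {
        t->high(pin);
    } else {
        t->low(pin);
    }
}

static bool pin_read(void *pin) {
    assert(pin != NULL);
    return ((pin_obj_base_t *)pin)->type->get(pin);
}
```

A bitbang routine written against pin_write and pin_read would work unchanged for both pin kinds, at the cost of one indirect call per operation.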

@dpgeorge
Member

Isn't this going outside the scope of Python, let alone a micro-Python implementation?

This stuff could be useful, I see that, but there are probably a lot of subtleties involved in making it work properly. For example, telling the difference between a base type and a derived type object. It's trying to implement OO programming within C, and being compatible with the OO of the Python side. I guess if uPy used C++ this would be a non-issue?

@pfalcon
Contributor Author

pfalcon commented Jan 18, 2014

Isn't this going outside the scope of Python, let alone a micro-Python implementation?

What exactly? Namedtuples are an "extended core" part of Python (and common sense tells me they're useful to implement slots, which are a core part of Python; note that while working on uPy, I don't look into CPython source (at least so far) - both to produce a clean-room implementation and to not pick up "bad" ideas).

And "protocols" aka C-level interfaces (wrapped in a Python impl on top) are parts of both CPython and now uPy. And they are actually more important for a lean-resource implementation like uPy, which is why I'd like to think of general guidelines for how to reuse them.

For example, telling the difference between a base type and a derived type object.

Supposedly that should all work automagically (once we fix corner issues) - you indeed did great work on a unified type system, which should be rather conformant to the Python model.

It's trying to implement OO programming within C, and being compatible with the OO of the Python side.

Yes, and I don't see big difficulties with that - we already have the beginnings of that in master, and while trying to leverage it more (which comes out great I'd say!), ideas on how to extend it and use it even more appear (like this ticket).

@pfalcon
Contributor Author

pfalcon commented Jan 18, 2014

I guess if uPy used C++ this would be a non-issue?

I don't even want to start ranting on this ;-). One thing I really miss is LISP macros, which, as nobody uses LISP nowadays, got their new incarnation in C++ templates. For example, we have an almost complete set of method implementations for the list type (working with mp_obj_t elements) - but they don't help at all with array.array's. With templates, we'd refactor it to support everything quickly and cleanly.

But note, for example, that the case with namedtuple field names is a case of polymorphic data (type->name is such too). C++ doesn't support that directly (its way is "polymorphism is in behavior", i.e. there would be a method returning the needed polymorphic data).

So, all in all, some subset of C++ would definitely help us (one which offers compile-time polymorphism). But for runtime polymorphism, we have more freedom the way it is now, we just should not be afraid of this freedom! ;-).

@dpgeorge
Member

Okay, your arguments are convincing! I have no problem extending mp_obj_type_t in a dynamic way. In fact, this would help with making classes, because the entries bases_tuple and locals_dict in mp_obj_type_t are only relevant for class types (not other types).
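
As a rough sketch of what moving the class-only entries out of the base could look like (an assumption for illustration, not the actual layout): the cut-down mp_obj_type_t, the name mp_obj_class_type_t, the opaque pointer members and the helper class_bases are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real mp_obj_type_t: only a name, for illustration. */
typedef struct {
    const char *name;
} mp_obj_type_t;

/* Hypothetical class type: only types created from Python classes would
 * carry these two entries; other types would use the bare base struct. */
typedef struct {
    mp_obj_type_t base;
    const void *bases_tuple;   /* assumption: opaque pointer, not mp_obj_t */
    void *locals_dict;
} mp_obj_class_type_t;

/* The downcast below relies on base being the first member (C11 check). */
_Static_assert(offsetof(mp_obj_class_type_t, base) == 0, "base must come first");

/* Hypothetical accessor; the caller must already know t is a class type. */
static const void *class_bases(const mp_obj_type_t *t) {
    assert(t != NULL);
    return ((const mp_obj_class_type_t *)t)->bases_tuple;
}
```

Types that are not classes would then not pay for the two extra members.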

@pfalcon
Contributor Author

pfalcon commented Jan 19, 2014

Ok, so I'll be refactoring my namedtuple prototype along these lines and then we can see if we can leverage it even for existing cases (what's the sizeof mp_obj_type_t?).

@dpgeorge
Member

Cool. It might be useful to sub-class mp_obj_type_t for numbers, iterables, etc. that have specific slots, thus avoiding a proliferation of slots that not every type needs.

At the moment, mp_obj_type_t is at 18 words = 72 bytes on a 32-bit machine.

@pfalcon
Contributor Author

pfalcon commented Mar 4, 2014

namedtuple was implemented with this technique in pfalcon/pycopy@d08fd68

@pfalcon pfalcon closed this as completed Mar 4, 2014
@dpgeorge
Member

dpgeorge commented Mar 4, 2014

Yes, I now think that it's a very good idea to subclass, as per namedtuple.

tannewt added a commit to tannewt/circuitpython that referenced this issue Aug 22, 2017