Skip to content

Mpl traitlets #4694

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from
Closed

Mpl traitlets #4694

wants to merge 9 commits into from

Conversation

tacaswell
Copy link
Member

No description provided.

@tacaswell tacaswell added this to the proposed next point release milestone Jul 14, 2015
}
allow_none = True
info_text = 'float, int, tuple of float or int, or a hex string color'
default_value = (0.0,0.0,0.0,0.0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is transparent; perhaps you mean to set alpha to 1, not 0.

@OceanWolf
Copy link
Member

I have had a quick look through, mainly in the traitlets and its test. I like the direction it heads in terms of defining a colour type, but I can't see what the external traitlets package adds...

I have checked on my system and I don't seem to have it installed, so at the moment it feels like installing something extra that I don't really need and something that also (imo) adds to the difficultly level to understand the code.

Perhaps though I have a bias as I just opened an MEP to improve begin doing something similar ;).

@dopplershift
Copy link
Contributor

Traitlet's give:

  • validation of assigned values for traits
  • notification of value changes
  • easy hooking up Traits together
  • easy serialization of trait-ed properties

And for a significant portion of the user base, it's already installed since they have IPython/Jupyter.

So this paves the way for cleaning up a lot of the property code, making it easy to serialize MPL objects. This also eases the way we can interact with IPython widgets, making for improvements in the NBAgg backend.

@rmorshea
Copy link
Contributor

@AndreLobato was also involved in this effort during the sprints at SciPy.

@OceanWolf
Copy link
Member

All for cleaning up the property code, but we have other simpler ways to do that as I put forward in #4693.

  • Notification of value changes also easy, we have that already and can get simplified as I put forward in the MEP to only one line of code.
  • Validation, easy we don't need an entire library just to do that, it just lies in the validation function.
  • Serialisation, for the most part serialisation works just fine, only due to the backends have I seen us needing to write code to work around it, and we don't want to serialise the backends.
  • Not sure what you mean by "hooking up Traits together", as I have no clue about Traits.

On the plus side, I have just found out that I do have traitlets installed, but under IPython.utils.traitlets, I think I missed it before because it was commented out above. But I think that's beside the point, do we really want to force people to download ipython just to use MPL? incorporating it into such a core piece of API, it seems like overkill.

Also I have just done a web search for "python traitlets tutorial" and found nothing (at least on the first page of google) which makes me wonder about the learning curve to use it and the wider support it has.

I don't know, I just have a really uneasy feeling about sticking a non-core package all over the codebase, and from what I see in this PR it looks very messy.

@efiring
Copy link
Member

efiring commented Jul 14, 2015

@OceanWolf traitlets is now an independent package: https://pypi.python.org/pypi/traitlets/
Serialization: all of our serialization is now based on pickle, which has all sorts of problems for everything except temporary and localized use. I haven't looked at how traitlets affects serialization, but it won't involve pickles.
Apart from that, I have always shared your caution about adopting something like traitlets; many years ago we considered the heavier Enthought traits package, and after some experimentation, we decided not to use it. Traitlets looks much less dangerous; but I don't have a good measure of its cost/benefit ratio.

@dopplershift
Copy link
Contributor

My takeaway from SciPy was that getting a dictionary of all "traits" within the tree of objects in a figure, etc. would be "trivial" (i.e. almost free) It's important to look at this through the lens not of a technology that someone set out a priori to create (I share the hesitation about Enthought's Traits), but rather traitlets is the solution the IPython devs found themselves in need of in order to manage configs, event notification, etc.

It's easy enough to start from scratch and cook up solutions to event notification, validation, etc. using "just" python properties and descriptors. The problem is, 3 years later after solving the issues that crop up and extending to all the natural use cases, you'll have succeeded in making (yet another) version of traitlets.

@rmorshea
Copy link
Contributor

@OceanWolf Aside from the trait change handlers, type checking, serialization, and streamline of the API, this will allow easy integration with IPython's configurables (see this blog post) and with widgets. Whether or not the dependency is worthwhile? That seems like an open question.

However, in the 4.0 release of IPython, the project will be repackaged under the Jupyter label. This means that many of its various components (e.g. widgets and traitlets) will be divided up into independent products.

@rmorshea
Copy link
Contributor

@dopplershift

obj.traits(serialize=True)

would return all TraitType objects registered with the metadata serialize=True in dictionary form.

@OceanWolf
Copy link
Member

Okay, one more question before I go away and read this stuff up. I don't understand the difference between the different trait packages. We have Enthought's, iPython's and now a new stand alone package. I presume the standalone package exists as a repackaged version of iPython's version, so that leaves two different packages. I don't want to learn about the wrong package.

The blog post linked to above talks about using Enthought's traitlets, not IPython's. I also started reading this before I realised I read about the wrong traitlets.

@WeatherGod
Copy link
Member

everyone is migrating towards the standalone traitlets package.

On Tue, Jul 14, 2015 at 5:27 PM, OceanWolf notifications@github.com wrote:

Okay, one more question before I go away and read this stuff up. I don't
understand the difference between the different trait packages. We have
Enthought's, iPython's and now a new stand alone package. I presume the
standalone package exists as a repackaged version of iPython's version, so
that leaves two different packages. I don't want to learn about the wrong
package.

The blog post linked to above talks about using Enthought's traitlets, not
IPython's. I also started reading this
http://code.enthought.com/projects/traits/docs/html/traits_user_manual/intro.html#what-are-traits
before I realised I read about the wrong traitlets.


Reply to this email directly or view it on GitHub
#4694 (comment)
.

@rmorshea
Copy link
Contributor

@OceanWolf Enthought's traits and IPython's traitlets essentially exist to solve the same problem, however IPython's version is lightweight in comparison. I'm not too familiar with Enthought's version, but I believe that a fair portion of it is compiled, whereas IPython's is pure python that's built on __set__ and __get__.

@OceanWolf
Copy link
Member

Okay, but in terms of API they have no difference? They look exactly the same? So one can read an Enthought tutorial and then use IPython's version and not notice any difference?

@WeatherGod
Copy link
Member

@OceanWolf, no, Enthought's Traits is not the same as traitlets. They
should not be confused. Traits is very old (~15 years old, IIRC). traitlets
is very new.

On Tue, Jul 14, 2015 at 5:41 PM, OceanWolf notifications@github.com wrote:

Okay, but in terms of API they have no difference? They look exactly the
same? So one can read an Enthought tutorial and then use IPython's version
and not notice any difference?


Reply to this email directly or view it on GitHub
#4694 (comment)
.

@OceanWolf
Copy link
Member

Okay, I have a better understanding of traits and afraid to say I still don't feel convinced, at least not for MPL. For applications that process form data, yes, but not plotdata.

It just feels very unpythonic and unintuitive (same thing)...

  1. Instance properties get defined as class properties. This feels wrong. very strange compared with the rest of python.
  2. Too strict, the end of duck-typing.
  3. Easy to make typos as the following no longer throws an error (see example below)
  4. Multiple Inheritance becomes messy when combining with other classes which uses metaclass (unavoidable, especially with backends)
  5. Lots of extra classes that we need to import for basic types like Integer etcetera...
from IPython.utils import traitlets

class MyTraitClass(traitlets.HasTraits):
  a = traitlets.Integer(default_value=4)

class MyNormalClass(object):
  def __init__(self, a=4):
    self.a = a

# Lets make a typo
foo = MyTraitClass(d='hello')  # ignores bad argument and assigns foo.d = 'hello'
bar = MyNormalClass(d='hello')  # throws error allowing us to find our typo

Also @dopplershift we don't start from scratch, we already have a fully working event system for instance, and we have our validation working nicely on our rcParams. We just have some refactoring to do, which started in MEP22 with the toolbar, and MEP27 with the Windowing system, and now moving onto Artists (with MEP28?). Though of course I may well change my mind once I have gotten a bit further with MEP28, but now I firmly believe with good design we can do it the pythonic way keeping it sweet and simple, it shouldn't take too long, only an a couple of hours once I start.

@dopplershift
Copy link
Contributor

I'm not sure what your point is: you created one class with an __init__ method and one without; without traitlets you'd get the exact same behavior with regards to the d parameter. However, with the example above:

foo.a = 'baz' # Checks bad argument and throws an error
bar.a = 'baz' # Silently works

I'd rather have the behavior of foo. Yes, we have the rcparams for validation of our config files. But there is little run-time validation; instead you get figure generation failure well after the time that the parameters were set. Sometimes, this results in really hard to diagnose errors.

Now how about this:

import traitlets
class Line(traitlets.HasTraits):
    linewidth = traitlets.Float(default_value=1.5, help='width of the line')
    # New Color trait coming
    color = traitlets.Any(default_value='black', help='color of the line') 

l1 = Line()
l2 = Line()

# Link color traits together
lnk = traitlets.link((l1, 'color'), (l2, 'color'))
print(l1.color, l2.color) # "black black"
l1.color = 'blue' # This triggers update of l2's color
print(l2.color) # "blue"

I see clean declaration of properties, validation of the properties within the instance, and easy linking of traits between instances (which could be any traitlets object). I wouldn't exactly go trumpeting our (ab)use of keyword arguments as a bastion of clean, pythonic code. Yes, you can write the same code in matplotlib, without traitlets. I've done it, but it's much more verbose.

I'm no fan of Enthought's version of traits (having attended far too many SciPy tutorials on them, with nothing to show for it); traitlets doesn't feel the same, partially (I think) because it's still perfectly fine to use garden-variety attributes and properties. You can opt in for the attributes where it makes sense, and you still have the rapid development without working hard at upfront design of attribute types.

I also strongly believe, like you, in the importance of duck typing. But do we really want to ducktype our colors? Or linewidth, markertype, or any of a billion string parameters? And there does exist the Any trait, if that's what you really want; or just using plain attributes for stuff you don't want to notify on or serialize.

@OceanWolf
Copy link
Member

In the example I gave above I do get different behaviour as I wrote in the comments, normal python gives me an error when I mistype keyword arguments, Traitlets doesn't. Yes we could also add an __init__.

With your first example I prefer foo.a = 'baz' silently working because duck-typing works like that and forms part of the fundamental basis of python, it works on trust, fast and simple, all this extra checking will no doubt slow down matplotlib, especially when you have a very large dataset. We have an api, if the user finds they can do something cool by passing in an string instead of an int, then fine.

Sure I do see some advantages of using trailets, like colours, but I see too much pain for too little.

In your second example, you write linewidth and color as class variables, when they belong to the instance, so not Line.color, but l1.color, this adds for confusion imo when trying to track down problems. It also pushes everything into one big heap at the top which I find looks ugly and becomes difficult to read in the long run if you have lots of properties. As I said before, nice for record data from forms, where you have nothing more than a list of properties, but for more fluid classes like Artists with methods and complicated relationships interacting between, then ick.

Then comes the phage-esque behaviour of traitlets. super spreads like a virus if you use it (and we do), you have to use it everywhere, and if I understand correctly, we need to ensure that HasTraits appears as the last class on the MRO before object, otherwise I expect things will break. This phage like quality means we will also run into problems with metaclass conflicts as I mentioned above which adds extra ugly workaround code to deal with.

Finally with your linking example, I imagine while this looks nice at first, with large plotting operations, I see this quickly becoming a tangled mess. We already have plans for Dynamic Artist Style Sheets to function much like CSS for websites, and as someone who has worked in web-development, that feels like a by far superior solution to this allowing for much more fine grained control.

Yes, I feel we need a large refactor of the Artist code, but at the moment I think we can do better than what traitlets gives us.

@dopplershift
Copy link
Contributor

Seems we have a fundamental disagreement here. I'll let others weigh in as to their opinion.

@rmorshea
Copy link
Contributor

For now I sort of just dove into the deep end and tried to make everything into traitlets just to see if this was at all feasible. Whether or not all of these things should be traitlets is probably debatable; in certain cases where a simple property with getter/setter would suffice, traitlets might be like swatting a fly with a baseball bat. With respect to the baggage that comes along with validation, for attributes that are sufficiently hidden from the average user or that may be handling large datasets, Any doesn't do any type checking.

@rmorshea
Copy link
Contributor

@OceanWolf can you direct me to some of the metaclasses that you think might cause problems?

@rmorshea
Copy link
Contributor

Granted I'm new to the code base, but I didn't notice any in a brief skim of matplotlib.backends, though I've found some elsewhere in things like matplotlib.axes._subplots.

@tacaswell
Copy link
Member Author

Most of the gui widgets come in as meta classes (ex PyQt). This just means that we need to keep traitlets out of the canvas/renderer trees. I am not worried about that as the details of the GUI code are not something we should be serializing out anyway and most users do not ever know those classes exist.

I am not surprised there is some funny stuff in _subplots. That whole chunk of the code may need to be overhauled (and it probably needs it anyway). Pulling Axes (and maybe Figure) out of the Artist class hierarchy (maybe to be a sibling to Artist with a common ancestor) is not something I am a priori opposed to.

@OceanWolf
Copy link
Member

@rmorshea Yup, as @tacaswell said. Also ABC (Abstract Base Classes) for instance uses metaclasses, they validate that a class has all the methods it needs to work. I don't think we use them now, but @tacaswell initially suggested we use them in several places (like in the backends), but so far we have always had reasons not too.

Anyway, I really have found a strong dislike to metaclasses because they gunk up code when you want to combine them, requiring ugly workarounds, so I would prefer to stick to a zero-metaclass rule unless absolutely necessary (like in the backends), to keep our code future proof.

@dopplershift I don't think we have yet reached a complete impasse just yet. I see the way forward lies in information collating. I see this as quite a large change in direction of MPL and thus I think it deserves an MEP to ensure we discuss all the issues. In the MEP we can collate a list of advantages/disadvantages, go over alternatives (perhaps with links to code examples to illustrate points), and also begin work on the areas of commonality between the approaches (if we find them). Does that sound like a good idea?

I initially started an MEP (#4693) to cover Artist refactor, which we can use for this purpose, I need to renumber it though as MEP28 already exists as the semi-graphical-representation of subplots (it wasn't in master yet so I didn't see it when I did a skim). I have also started a branch, which may serve as a first step towards whatever we decide.

@tacaswell
Copy link
Member Author

@rmorshea re your last commit title line: Sweet!

@OceanWolf This has been an on-going discussion for about 6-9 months now (which is unfortunately smeared across a range of issue/PR/mailing list threads). The main things we need are

  • validation: This does not actually cost us duck typing in the sense that 'float like' means 'it can be cast to a float'. As we can define custom traitlets we can now specify the exact kind of duck we are looking for. There is value to deciding this as late as possible, but I think that on assignment to the artist object is the right time. To steal some verbiage from MVC, the artist objects are the models and when we call draw we are creating a view. The contract with the draw methods is that given a valid model they will give you a correct view, thus you should do the validation at the time when you would be creating the invalid model. Traitlets solve this problem in an elegant way. Once you started to implement validation by 'hand' you would end up writing the same code over and over, by using a meta-class all of that boiler plate code is centralized making the code is made more readable
  • serialization : pickle is not a valid option for serialization. It is no good for inter-language communication and is a massive security hole in python. We need to have a set of import/export commands both to a) have a way to save the figure that can be reloaded b) communicate with other plotting libraries (think bokeh, plotly, d3) bi-directionally which gives mpl an easy path to snazy web graphics and those libraries an easy path to high quality vector/raster graphics. Once you start to do this you will want to have programmatic access to the important properties of that artist which, again Traitlets has solved in an elegant way with the meta class

@tacaswell
Copy link
Member Author

@rmorshea I think it is failing on the doc builds because you need to install traitlets in that case too

@rmorshea
Copy link
Contributor

Makes sense, there's also some errors that are popping up when I run the tests locally as well though.

@OceanWolf
Copy link
Member

@tacaswell I think the fact that this has been "smeared across a range of issue/PR/mailing list threads" outlines the need for an MEP so as to collate all of this information, what works, what doesn't work, etcetera. I think I also need to see some coded examples (links to) as to what we want to see work, in the "Detailed description" section of an MEP.

@tacaswell
Copy link
Member Author

@rmorshea Actually travis is lying due to an other bug, all of the tests failed on import errors https://travis-ci.org/matplotlib/matplotlib/jobs/71156444 (the 'module not found' error like that is because we white list test, but the test_* modules fail on import (presumably because traitlets is missing) but nose swallows those exceptions. Nose then raises that very unhelpful exception telling you that one of the white listed modules does not exist.

@WeatherGod
Copy link
Member

I would also like to see a MEP that collates the knowledge we have gleaned through this PR. I see this PR as more of a prototype/mockup than something finalized. The MEP should then be created outlining the problems to be solved, and how traitlets solves them, with some useful examples.

@WeatherGod
Copy link
Member

By the way, there are a bunch of trailing spaces showing up in this PR.

@mdboom
Copy link
Member

mdboom commented Jul 16, 2015

I think @tacaswell's MVC analogy makes a lot of sense. The very delayed validation matplotlib currently is the source of many bug reports that could be more efficiently addressed through just having better and earlier error messages. We can expect this change to introduce a flood of "I used to pass this value that worked kinda by accident and now it doesn't work", but I think that's fine. And as @tacaswell points out, we may actually make some things more inclusive by converting strings to floats where we currently just assume it's a float.

I'll add a couple of "qualitative" desirements to @tacaswell's requirements of "validation" and "serialization": low performance overhead (still an open question, I guess) and improved code readability (across the code base -- I don't mind ugliness if it's all contained in one place in order to make everything else look nice). I think this PR works on all counts so far -- but I wouldn't mind considering alternatives as long as that doesn't hold up the works too much. However, it does seem from this PR that traitlets seems like a pretty good fit for what we're trying to achieve here (modulo some details that are probably better left until this is further along).

As for performance: @rmorshea, do you have any thoughts about how to measure the performance impact of these changes? Obviously, we're all comfortable with some overhead, but we should ensure it's not eggregious. Since this affects Artist (and thus pretty much everything), even just the time it takes to run the unit tests might be a reasonable proxy for the overall overhead. But it also might be useful to just see how %timeit Artist() compares with this vs. master.

@rmorshea
Copy link
Contributor

I'll leave this branch up, but I will be working from a new branch soon. I'm going take a different approach to the workflow.

@NeilGirdhar
Copy link

I'm just scanning this thread now as I'm eagerly awaiting traitlets.

@OceanWolf you may be happy to know that metaclasses have been getting a lot of positive discussion on python-ideas: https://groups.google.com/forum/#!topic/python-ideas/To6i84fK4nw I am also reluctant to use them because of how hard it is to combine them.

@OceanWolf
Copy link
Member

Yay :).
In the meantime have a possible idea for later down the refactor process, basically AFAIK, generally speaking one can use decorators instead of metaclasses, anyway, I have a rough idea of how to implement it (now I understand decorators), so a bit later down the line, if we still want to do something like this, I will knock up an example.

@NeilGirdhar
Copy link

I'm glad you're happy about that discussion. If you come up with an idea of how to tackle mixing metaclasses, you should definitely revive the thread on python-ideas.

That said, I'm not convinced by your concerns about traitlets:

  1. Instance properties get defined as class properties. This feels wrong. very strange compared with the rest of python.

I also have a traits system in my code, and do roughly the same thing. Logically, the trait types are shared between all class instances, so logically they should be class attributes.

  1. Too strict, the end of duck-typing.

You can always use traitlets.Instance or traitlets.Any.

My understanding is that the traitlets are taking common all of the promises that an object makes so that rather than having the coercing and serialization code duplicated everywhere, that code is in one place and you make the type promise once. One of the things I love about Python is that you can use the regular language constructs to abstract this away. (In C++, I think traits end up being a mess of templates.)

  1. Easy to make typos as the following no longer throws an error (see example below)

Good point. Is that the desired behaviour that other people want? Why not raise on an attribute that's not a class trait?

  1. Multiple Inheritance becomes messy when combining with other classes which uses metaclass (unavoidable, especially with backends)

Do traits inherit from metaclasses? If not, then how do traitlets cause any problems?

  1. Lots of extra classes that we need to import for basic types like Integer etcetera...

Why does this matter?

Like I said before, I'm biased, because I think traitlets would make my proposed rewrite of fancy arrows and edges easier to implement.

@rmorshea
Copy link
Contributor

link to new branch : #4762

@tacaswell tacaswell closed this Aug 1, 2015
@rmorshea rmorshea mentioned this pull request Aug 2, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants