
Document explicit speedup approaches. #1994


Closed

Conversation

akaariai

I've implemented a couple of ideas to speed up REST framework. The most important idea is to memoize situations where simple getattr calls can be used instead of the full-blown get_attribute. This patch also slightly speeds up serializer.to_representation(), and allows using non-sorted output from serializers.
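
To make this concrete, here is a minimal, self-contained sketch of the general idea (illustrative only, not the actual patch): once a plain, non-callable value has been seen for an attribute, later lookups fall back to a bare getattr and skip the callable check entirely.

def is_simple_callable(obj):
    # Stand-in for the real check: callable values are invoked to obtain the data.
    return callable(obj)

_plain_attrs = set()  # attribute names known to yield plain, non-callable values

def get_value(instance, name):
    if name in _plain_attrs:
        return getattr(instance, name)    # fast path: no callable check needed
    value = getattr(instance, name)
    if is_simple_callable(value):
        return value()                    # slow path: invoke the callable
    _plain_attrs.add(name)                # memoize for subsequent instances
    return value

The sketch also shows the drawback noted below: once an attribute is memoized as plain, a later callable value for the same attribute would be returned uninvoked.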

The test case I used was rendering 100000 simple objects with ten simple fields to a JSON list. I am using ensure_ascii when rendering, as that is the fastest option.

Current master renders the data to JSON in 24s, while version-3.0 renders it in around 8.2s. With the patched version and ordered output, the runtime is around 5.1s, and with unordered output around 3.1s. All combined, the speedup is 8x compared to master, and the patched version is 2.6x faster than version-3.0. Dumping the data directly without a serializer takes around 0.9s.

The biggest drawbacks of the used approach are:

  • added complexity
  • not guaranteed to work if users do runtime monkey patching
  • it is possible that you have an instance where an attribute is sometimes a callable, and sometimes not. Something like data = [Obj(some_attr=datetime.now if i > 1 else None) for i in range(0, 10)] would not work with the current implementation. Notably, caching is_simple_callable is impossible for such data, and most of the speedup comes from bypassing is_simple_callable checks.

While the patch passes all tests, the work is at the proof-of-concept stage.

After the changes, the time is mostly consumed in iterating the fields, calling get_attribute, and then calling field.to_representation(). There isn't any obvious extra work being done any more. The top lines of the profile are:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.190    0.190   12.452   12.452 tester.py:1(<module>)
        1    0.000    0.000    9.875    9.875 serializers.py:126(data)
        1    0.150    0.150    9.875    9.875 serializers.py:495(to_representation)
   100000    3.258    0.000    9.721    0.000 serializers.py:415(to_representation)
  1000000    2.469    0.000    3.574    0.000 fields.py:254(get_attribute)
1003256/1003228    1.109    0.000    1.109    0.000 {getattr}
   100000    0.541    0.000    0.924    0.000 tester.py:32(__init__)
   800000    0.917    0.000    0.917    0.000 fields.py:517(to_representation)
   100000    0.483    0.000    0.898    0.000 fields.py:820(to_representation)
        1    0.000    0.000    0.817    0.817 renderers.py:81(render)
        1    0.041    0.041    0.790    0.790 __init__.py:185(dumps)
        1    0.000    0.000    0.749    0.749 encoder.py:180(encode)
   100000    0.622    0.000    0.726    0.000 serializers.py:167(__init__)

The test script used was:

import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'django_test.settings'
from django import setup
setup()
from datetime import datetime
import json

from rest_framework import serializers
from rest_framework.renderers import JSONRenderer

class CommentSerializer(serializers.Serializer):
    email = serializers.EmailField()
    content1 = serializers.CharField(max_length=200)
    content2 = serializers.CharField(max_length=200)
    content3 = serializers.CharField(max_length=200)
    content4 = serializers.CharField(max_length=200)
    content5 = serializers.CharField(max_length=200)
    content6 = serializers.CharField(max_length=200)
    content7 = serializers.CharField(max_length=200)
    content8 = serializers.CharField(max_length=200)
    date = serializers.DateField()

    sorted_output = False


class Comment(object):
    def __init__(self):
        self.content1 = u'foo'
        self.content2 = u'foo'
        self.content3 = u'foo'
        self.content4 = u'foo'
        self.content5 = u'foo'
        self.content6 = u'foo'
        self.content7 = u'foo'
        self.content8 = u'foo'
        self.email = u'a@b.fi'
        self.date = datetime.now().date()

comments = [Comment() for i in range(0, 100000)]
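
# Baseline: dump plain dicts directly, without going through the serializer.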
start = datetime.now()
json.dumps([
    {
        'content1': getattr(c, 'content1'), 'email': getattr(c, 'email'),
        'date': getattr(c, 'date').isoformat(),
        'content2': getattr(c, 'content2'),
        'content3': getattr(c, 'content3'),
        'content4': getattr(c, 'content4'),
        'content5': getattr(c, 'content5'),
        'content6': getattr(c, 'content6'),
        'content7': getattr(c, 'content7'),
        'content8': getattr(c, 'content8'),
    } for c in comments])
print(datetime.now() - start)
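
# Serializer path being benchmarked, rendered with ensure_ascii as noted above.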
start = datetime.now()
serializer = CommentSerializer(comments, many=True)
class FastJSONRenderer(JSONRenderer):
    ensure_ascii = True

FastJSONRenderer().render(serializer.data)
print(datetime.now() - start)

@xordoquy
Collaborator

Thanks for the work.
I'll try to look at this during the week. The general trend on the project is to keep things as clear and straightforward as possible.
I'll try to see if we can make those optimizations easy to plug in if one needs or wants them.

@akaariai
Author

The main speedup comes from 1) the ability to use unsorted output (unfortunately Sorted/OrderedDicts are slow), and 2) skipping is_simple_callable() checks. The latter is harder to support in a clean and completely safe way.
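
For reference, a quick way to see the OrderedDict construction overhead locally (an illustrative snippet; no particular numbers are claimed):

import timeit
from collections import OrderedDict

pairs = [('field%d' % i, 'x') for i in range(10)]
print(timeit.timeit(lambda: dict(pairs), number=100000))         # plain dict
print(timeit.timeit(lambda: OrderedDict(pairs), number=100000))  # OrderedDict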

@tomchristie
Member

Echoing @xordoquy here, yup. Almost certainly not a trade-off we'd consider, but certainly worth investigating anything that can make it simpler to do this as a third party serializer base class.

To expand on this slightly...

  • In any situation where the data is backed by a data store of any kind, the performance increases certainly won't be anything like as large as they appear here.
  • If performance is critical, I would rather keep the base classes as simple as possible so that it's in turn easier for the developer to override. In this type of case I'd advocate that we instead show developers, via good documentation, how they can override the entire to_representation method explicitly in order to bypass the normal machinery.

Eg...

class CommentSerializer(serializers.Serializer):
    email = serializers.EmailField()
    content1 = serializers.CharField(max_length=200)
    content2 = serializers.CharField(max_length=200)
    content3 = serializers.CharField(max_length=200)
    content4 = serializers.CharField(max_length=200)
    content5 = serializers.CharField(max_length=200)
    content6 = serializers.CharField(max_length=200)
    content7 = serializers.CharField(max_length=200)
    content8 = serializers.CharField(max_length=200)
    date = serializers.DateField()

    def to_representation(self, obj):
        """
        Serialization speed of these objects is critical, so we're hardcoding this...
        """
        return {
            'email': obj.email,
            'content1': obj.content1,
            # content2 through content8 elided for brevity
            'date': obj.date.isoformat()
        }

We'd also want to document the advanced usage of how and when you'd want to use a sorted dict, or the ReturnDict class (which allows you to support renderers that require backlinks to extra field information; when used, it turns on HTML forms in the browsable API).
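
As a rough illustration of that trade-off (a sketch under assumptions, not documentation of DRF's internals; it reuses the serializers import from the earlier script): a plain dict is fastest, an OrderedDict preserves the declared field order, and wrapping the output in ReturnDict keeps a back-link to the serializer so form-rendering renderers can use it.

from collections import OrderedDict
from rest_framework.utils.serializer_helpers import ReturnDict

class ExampleSerializer(serializers.Serializer):
    def to_representation(self, obj):
        # Swap OrderedDict for a plain dict if field ordering doesn't matter.
        return OrderedDict(
            (field.field_name, field.to_representation(field.get_attribute(obj)))
            for field in self.fields.values()
        )

    @property
    def data(self):
        # ReturnDict remembers which serializer produced the output, which is
        # what the browsable API relies on for rendering forms.
        return ReturnDict(self.to_representation(self.instance), serializer=self)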

@tomchristie
Member

Closing. Very much appreciate the input, but it's not quite in the direction I'd like us to take things, for the reasons outlined above. Don't take that as meaning that we can't discuss this or similar ideas further, but I'm keeping scope and direction on track in the meantime.

@xordoquy
Collaborator

@tomchristie Is there a way we can keep track of this work? I'd like to keep it on our radar.

@tomchristie
Member

@xordoquy Happy for us to open a documentation ticket as detailed above, or roll it into #1976.

Anything else depends on where @akaariai wants to take this. If there are API hooks that we ought to be adding into the serializer class in order to support these sorts of modifications then we can consider those on a case-by-case basis.

@akaariai Apologies also for my directness with regard to closing tickets off; it's not at all a comment on the worthiness of the patch. I've just been maintaining the project long enough now that I think we've got a good feel for what helps and what harms the long-term maintenance and end-user usability of the project, and I'm very strongly in favor of optimising for simplest-possible wherever possible.

@akaariai
Author

My work was mainly about checking where DRF spends extra time that could easily be shaved off. So, +1 to closing this :)

While the benchmark presented wasn't too realistic, I currently have a need for a faster serializer than what DRF can offer. When just reading objects from the DB and serializing them, the Serializer class is a bottleneck even in version-3.0. Of course, we aren't talking about huge changes here.

Another idea could be adding a couple of built-in to_representation functions (or just documenting them).

# Add to Serializer:
    def obj_to_representation(self, obj):
        return dict([(f.field_name, f.to_representation(getattr(obj, f.field_name)))
                     for f in self.fields.values()])

    def dict_to_representation(self, d):
        return dict([(f.field_name, f.to_representation(d.get(f.field_name)))
                     for f in self.fields.values()])

# Now, a faster PeopleSerializer:

class PeopleSerializer(Serializer):
    to_representation = Serializer.obj_to_representation

Then, document how to write a faster to_representation() method, and point out that DRF includes these two convenience methods you can use directly. As a bonus, these methods give a runtime of 1.9s compared to the 3.1s of this PR's version (or almost 5x faster than version-3.0). See akaariai@b2336f0 for this idea.

@tomchristie
Member

Strongly in favour of documenting something like this...

def obj_to_representation(self, obj):
    return dict([(f.field_name, f.to_representation(getattr(obj, f.field_name)))
                 for f in self.fields.values()])

Not sure about pulling the source into the package though; I'd rather call it out very explicitly as an example, but force folks to deal with any of the subtleties of the limitations it introduces themselves. If we pulled it into core we'd almost certainly get 'but it doesn't work in XYZ case' etc...

This would also be a good conversation to have around where we introduce BaseSerializer...

https://github.com/tomchristie/django-rest-framework/blob/version-3.0/docs/topics/3.0-announcement.md#the-baseserializer-class

Noting that you don't always need to be using BaseSerializer in order to get big performance improvements.

Reopening this ticket, and re-naming as documentation-focused.

@tomchristie reopened this Oct 29, 2014
@tomchristie changed the title from "A couple of speedup ideas" to "Document explicit speedup approaches." Oct 29, 2014
@tomchristie
Member

Oh wait, that's not correct, given that this was a PR. I'll open a separate issue instead. Sorry for the noise.

@akaariai
Author

I quickly implemented a way to use non-ordered dicts; see akaariai@2ee4cdc. There shouldn't be any caveats with the approach, except of course that your output will be non-ordered. See the commit message for how to use it.

If this approach seems like something you'd like to support, then I am happy to open a separate pull request for that. This could also be implemented as a mixin outside DRF if the _dict_class attribute is available.
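
Purely to illustrate the mixin idea (a hypothetical sketch that assumes a serializer consulting a _dict_class attribute when building its output, as in the linked commit; the attribute is not part of DRF core):

class UnorderedOutputMixin(object):
    # Assumes the serializer consults _dict_class, as in akaariai@2ee4cdc.
    # A plain dict is faster to build, but field ordering is not preserved.
    _dict_class = dict

class FastCommentSerializer(UnorderedOutputMixin, CommentSerializer):
    # Reuses the CommentSerializer defined earlier in this thread.
    pass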

BTW, if this PR isn't the appropriate place for discussing further ideas, I am happy to continue this discussion somewhere else. Also, thanks for your fast and informative responses to this PR!

@alvinchow86

I'm also seeing a slowness issue with OrderedDict (#2454), but related to pickling. I like the customizable _dict_class in akaariai@2ee4cdc; I wonder if it might be a good idea to allow a global setting (in settings.REST_FRAMEWORK) as well.

@tomchristie
Member

> I wonder if it might be a good idea to allow a global setting (in settings.REST_FRAMEWORK) as well

I wouldn't currently be in favor of that. For the large majority of cases, DB access is going to be the limiting factor, so I'd prefer this to just be an advanced bit of API limited to the serializers.

@alvinchow86

I think this is still a good thing to do (customizable _dict_class as in akaariai@2ee4cdc).

I'll try to get an updated patch (off of DRF 3.1), submit a PR, and maybe include some performance tests.
