Skip to content

[Serializer] Reduce class discriminator overhead #28889

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 20, 2018
Merged

[Serializer] Reduce class discriminator overhead #28889

merged 1 commit into from
Oct 20, 2018

Conversation

fbourigault
Copy link
Contributor

@fbourigault fbourigault commented Oct 16, 2018

Q A
Branch? 4.1
Bug fix? yes
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets #28537
License MIT
Doc PR N/A

This try to fix the overhead added by class discriminator feature.

Here is a 4.1 vs this PR comparison:
https://blackfire.io/profiles/compare/20ead249-0e98-430f-9b8d-906f464b1854/graph

And a 3.4 vs this PR comparison:
https://blackfire.io/profiles/compare/7e402dde-4a54-4053-a12e-d3d6891afc02/graph

@javiereguiluz
Copy link
Member

@fbourigault this looks very promising. Thanks!

A quick question: why are performance improvements massive for 4.1 and minimal for 3.4?

@fbourigault
Copy link
Contributor Author

The 3.4 vs this PR comparison is here to show that we are almost back at 3.4 performance level (the class discriminator feature was introduced in 4.1).

public function __construct(ClassMetadataFactoryInterface $classMetadataFactory = null, NameConverterInterface $nameConverter = null, PropertyAccessorInterface $propertyAccessor = null, PropertyTypeExtractorInterface $propertyTypeExtractor = null, ClassDiscriminatorResolverInterface $classDiscriminatorResolver = null)
private static $isDiscriminatorAttributeCache = array();

public function __construct(ClassMetadataFactoryInterface $classMetadataFactory = null, NameConverterInterface $nameConverter = null, PropertyAccessorInterface $propertyAccessor = null, PropertyTypeExtractorInterface $propertyTypeExtractor = null, ClassDiscriminatorResolverInterface $classDiscriminatorResolver = null, callable $objectClassResolver = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should be reverted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failed rebase 😅

@@ -30,7 +30,9 @@ class ObjectNormalizer extends AbstractObjectNormalizer
{
protected $propertyAccessor;

public function __construct(ClassMetadataFactoryInterface $classMetadataFactory = null, NameConverterInterface $nameConverter = null, PropertyAccessorInterface $propertyAccessor = null, PropertyTypeExtractorInterface $propertyTypeExtractor = null, ClassDiscriminatorResolverInterface $classDiscriminatorResolver = null)
private static $isDiscriminatorAttributeCache = array();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not make this property static, because injected resolver can be different form an instance to another.

}
}

return $this->propertyAccessor->getValue($object, $attribute);
return self::$isDiscriminatorAttributeCache[$cacheKey] ? $this->classDiscriminatorResolver->getTypeForMappedObject($object) : $this->propertyAccessor->getValue($object, $attribute);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would even improve the performance more: for a given object, regardless of the property, the result of getTypeForMappedObject will always be the same.
We could use SplObjectStorage to call this method only one time for every object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we would have two caches, one to know if the current property we are getting the value is the discriminator field of the class and an other one to get the value of this discriminator mapping.

Does the later worth it? It would probably be used only when the same object is present more than once in the serialized objects graph.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getAttributeValue is called for every attribute of a given class, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

Copy link
Contributor Author

@fbourigault fbourigault Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a v2 0bd95ec and the improvement is less significant: https://blackfire.io/profiles/compare/fd66194c-e3a7-4bc1-bea4-c67d2677148f/graph

IMHO we need two levels of cache, one to known if a class has a discriminator, and one to get the property name (roughly what is done in v2).

This will improve the case we have a lots of the same class in the graph and reducing the number of calls to one per object.

WDYT?

EDIT: v2 isn't working. Tests are broken. Will give a look later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@dunglas dunglas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a tiny change to make the code simpler to read.

@fbourigault
Copy link
Contributor Author

What about storing in a cache the ClassDiscriminatorResolverInterface::getMappingForClass, ClassDiscriminatorResolverInterface::getMappingForMappedObject and ClassDiscriminatorResolverInterface::getTypeForMappedObject results using the class name as key? With a decorate we should be able to reduce the calls to ClassMetadataFactoryInterface to only one per class. This could be done in an other PR.

@dunglas
Copy link
Member

dunglas commented Oct 17, 2018

@fbourigault sounds good to me

@fbourigault
Copy link
Contributor Author

The v2.1 of this PR use much more memory than the v1.

4.1 vs #28889 v1 : https://blackfire.io/profiles/compare/6cf63a44-65ef-4fb6-90e2-7de049c846fc/graph
4.1 vs #28889 v2.1 : https://blackfire.io/profiles/compare/00c4aa38-97d3-41c3-928c-84c5e909cdde/graph

I think this is because the benchmark use a lot of the same class objects which would mimic a collection serialization.

This may require profiling with more objects of different type (more like a single object serialization).

@dunglas
Copy link
Member

dunglas commented Oct 17, 2018

Can you try if using an associative array with the result of spl_object_hash() as key, instead of an SplObjectMap reduces the memory footprint?

@fbourigault
Copy link
Contributor Author


if (null !== $mapping && $attribute == $mapping->getTypeProperty()) {
return $this->classDiscriminatorResolver->getTypeForMappedObject($object);
$cacheKey = spl_object_hash($object);
Copy link
Member

@nicolas-grekas nicolas-grekas Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works:

  • this creates a memory leak since the cache array is never cleared
  • but more importantly, object hashes can be reused so that a cached value can map to another object

Both issues could be fixed by clearing this cache in the appropriate place (when exiting any recursive normalization steps)
On PHP 7.2, we should use spl_object_id() btw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both issues could be fixed by clearing this cache in the appropriate place (when exiting any recursive normalization steps)

So in the Serializer class. A bit intrusive. (we'll have to introduce a new interface, and call a clearCache() method if the normalizer implements this interface.

Another solution (cleaner) would be to do the resolving in the method calling getAttributeValue(), and pass the result as a new argument of getAttributeValue().
Because it's a protected method, will have to deprecate not passing this new argument.

Copy link
Contributor Author

@fbourigault fbourigault Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using spl_object_id (with the polyfill for PHP 7.1) for the 2nd point?

For the first point, is it a viable solution to add some property to the $context to keep track of the nested calls and then clear this the cache when exiting the outer call?

//ObjectNormalizer.php
    public function normalize($object, $format = null, array $context = array())
    {
        try {
            $context['discriminator_cache_depth'] = $context['discriminator_cache_depth'] ?? 0 + 1;

            $result = parent::normalize($object, $format, $context);

            --$context['discriminator_cache_depth'];

            return $result;

        } finally {
            if (0 === $context['discriminator_cache_depth']) {
                $this->discriminatorCache = array();
            }
        }
    }

Profiling is not that bad (https://blackfire.io/profiles/compare/cf712867-392c-45ee-a3ff-221e0ba72a01/graph). It reduce the class discriminator overhead (3.4 vs master) from 53.9 to 18.6%.

An other solution would be to ship this only with 4.2 and implement the ResetInterface.

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResetInterface doesn't work as explained: handles can be reused between calls.
The spl_object_id polyfill adds overhead. Since we're tracking perf here, better not.
I'll review iteration v3 :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If using the polyfill, this PR would still be faster than the current 4.1 state (I'm trying to profile using php 7.1, but I got weird results because of DateTimeZone::getTransitions: https://blackfire.io/profiles/6c0105f5-bf95-49d3-a49d-25adc16cd6fc/graph)!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dunglas what about going back to v1 slightly improved to store the discriminator attribute per class and call ClassDiscriminatorResolverInterface::getMappingForMappedObject only when getAttributeValue is called when $attribute === $this->discriminatorCache[\get_class($object)]?

This would avoid using spl_object_id/spl_object_hash and the cache will have a finite size and will be reusable between requests.

This would improve perfs a lot, because, we are only computing the cache value once per class (instead of object) and ClassDiscriminatorResolverInterface::getMappingForMappedObject will be called only once per object which is involved in a class discriminator.

I will give this a try and call it v4 ;)

@fbourigault
Copy link
Contributor Author

fbourigault commented Oct 18, 2018

I just pushed v4, which use get_class($object) as cache key. This is working because we were already storing the name of the discriminator field into the cache and of course, it's always the same for a given class.

This is the best improvement we can do as it totally kills the class discriminator overhead!

@nicolas-grekas
Copy link
Member

Thank you @fbourigault.

@nicolas-grekas
Copy link
Member

Thank you @fbourigault.

@nicolas-grekas nicolas-grekas merged commit 326c267 into symfony:4.1 Oct 20, 2018
nicolas-grekas added a commit that referenced this pull request Oct 20, 2018
…ult)

This PR was merged into the 4.1 branch.

Discussion
----------

[Serializer] Reduce class discriminator overhead

| Q             | A
| ------------- | ---
| Branch?       | 4.1
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | #28537
| License       | MIT
| Doc PR        | N/A

This try to fix the overhead added by class discriminator feature.

Here is a 4.1 vs this PR comparison:
https://blackfire.io/profiles/compare/20ead249-0e98-430f-9b8d-906f464b1854/graph

And a 3.4 vs this PR comparison:
https://blackfire.io/profiles/compare/7e402dde-4a54-4053-a12e-d3d6891afc02/graph

Commits
-------

326c267 [Serializer] Reduce class discriminator overhead
@fbourigault fbourigault deleted the discriminator-perfs branch October 21, 2018 13:00
@sroze
Copy link
Contributor

sroze commented Oct 21, 2018

Thank you @fbourigault.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants