-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[DomCrawler] Fix memory leak with filter method #11221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
$crawler->filter('tr > td'); | ||
} | ||
|
||
$after = memory_get_usage(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to get an accurate estimation of the memory used, you should call gc_collect_cycles
before memory_get_usage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, but gc_collect_cycles
also solve (partialy) the circular reference memory leak. I don't use it, because I want the test prove that there is a issue when GC is not called,
@@ -61,8 +61,10 @@ public static function toXPath($cssExpr, $prefix = 'descendant-or-self::') | |||
->registerParserShortcut(new ClassParser()) | |||
->registerParserShortcut(new HashParser()) | |||
; | |||
$resp = $translator->cssToXPath($cssExpr, $prefix); | |||
$translator->destroy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An other solution to solve this meomry leak, is to replace this line by a simple.
if (gc_enabled()) {
gc_collect_cycles();
}
Wich solution do you think is the better ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the garbage collector sounds like a good option when available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relyinbg on the garbage collector would indeed be better (calling destroy
on a translator puts it in a non-working state, so I would prefer avoiding to add it in the API)
@@ -85,6 +85,21 @@ public function __construct(ParserInterface $parser = null) | |||
} | |||
|
|||
/** | |||
* Resets local properties to there initial values | |||
*/ | |||
public function destroy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reset()
sounds better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldnt this just be a destructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you mean __destructor
, the answer is no, because the __destructor is not called due to circular reference.
If you suggest the name destructor
, I'm not aggre, because this method does not destruct anything. @fabpot is right, reset
is better name.
@jeremy-derusse could you |
@apfelbox In this article http://alexatnet.com/articles/optimize-php-memory-usage-eliminate-circular-references In our case
with call to |
gc_collect_cycles() only collects circular references. So it is useless if you don't have circular references. PHP will clean variables when they become out of scope, automatically |
|
@jpauli the issue here is precisely that the object graph of the CssSelector translator is cyclic |
@jpauli here is the circular reference: https://github.com/symfony/symfony/blob/2.3/src/Symfony/Component/CssSelector/XPath/Translator.php#L79 Thank you @nicolas-grekas, I miss this point in my benchmark. Regarding those arguments, it's not a good idea to use |
@jeremy-derusse exposing reset() in the translator is not a good diea either, because it puts it in a non-working state. The right fix would be to refactor the logic to eliminate the circular reference. I have an idea for this but I don't have time to implement it just now. I will probably have to wait until tomorrow evening at least |
see #11242 for my own attempt at fixing it, without altering the external API of the Translator and without putting it in an invalid state |
…cular object graph (stof) This PR was merged into the 2.3 branch. Discussion ---------- [CssSelector] Refactored the CssSelector to remove the circular object graph | Q | A | ------------- | --- | Bug fix? | yes | New feature? | no | BC breaks? | no | Deprecations? | no | Tests pass? | yes | Fixed tickets | #10879, replaces #11221 | License | MIT | Doc PR | n/a This allows the translator and its extensions to be garbage collected based on the refcount rather than requiring the garbage collector run, making it much more likely to happen at the end of the ``CssSelector::toXPath`` call. Node translators now receive the Translator as second argument, instead of requiring to inject it in the extension to keep a reference to it. This way, the Translator is referenced nowhere inside it, only by the caller, and so will be destructed at the end of the usage (and extensions will then be destructed after it when not used anymore). Commits ------- 994f81f Refactored the CssSelector to remove the circular object graph
Add a public method
destroy
on Translator to reset properties and free memory