Skip to content

Implement PregGrepDynamicReturnTypeExtension #2560

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

staabm
Copy link
Contributor

@staabm staabm commented Aug 2, 2023

main driver is eleminating a false return type, when a constant regex pattern is valid

https://www.php.net/manual/en/function.preg-grep.php

failling tests before PR: https://phpstan.org/r/cf5bed2b-490f-4769-af9e-611f28950c50

@phpstan-bot
Copy link
Collaborator

You've opened the pull request against the latest branch 1.11.x. If your code is relevant on 1.10.x and you want it to be released sooner, please rebase your pull request and change its target to 1.10.x.

@staabm staabm changed the base branch from 1.11.x to 1.10.x August 2, 2023 13:09
@clxmstaab clxmstaab force-pushed the preg_grep branch 3 times, most recently from 31d96db to 81f2ed3 Compare August 2, 2023 13:47
@staabm staabm marked this pull request as ready for review August 2, 2023 14:02
@phpstan-bot
Copy link
Collaborator

This pull request has been marked as ready for review.

@davebarrau
Copy link

@staabm Hi! 👋

In addition to an invalid regular expression returning false, there are a couple of edge case conditions where false can be returned where the compilation failed due to not being able to allocate the required memory, or some other weird internal error.

Is this worth worrying about? (I would suspect not, but I thought I'd ask!)

@staabm
Copy link
Contributor Author

staabm commented Aug 6, 2023

Hi. Thanks for the feedback.

Do you think these are situations we can detect? Could you provide a reproducing example?

We may use a benevolent union to allow handling of false, without errors.

@davebarrau
Copy link

Hi. Thanks for the feedback.

Do you think these are situations we can detect? Could you provide a reproducing example?

We may use a benevolent union to allow handling of false, without errors.

Hi,

No, based on the cursory glance I gave php-src & pcre, I don't think these would be detectable. They're more of an edge case run time thing. 99.99% of false returns I imagine would be from bad patterns.

Probably outside the scope of this PR, but could other preg_* functions benefit from this PR too? Quite a few of them have false as the error condition.

@staabm
Copy link
Contributor Author

staabm commented Aug 7, 2023

No, based on the cursory glance I gave php-src & pcre, I don't think these would be detectable. They're more of an edge case run time thing. 99.99% of false returns I imagine would be from bad patterns.

I don't think from a userland perspective "out-of-memory" errors can be handled.

Probably outside the scope of this PR, but could other preg_* functions benefit from this PR too? Quite a few of them have false as the error condition.

if you have a example please create a separate issue. AFAIR is this already the case for some preg_* methods

@staabm
Copy link
Contributor Author

staabm commented Aug 12, 2023

Per discussion in php/php-src#11927 (comment) it seems memory errors of the underlying regex libraries are not something which can be handled in php userland in a useful way.

Therefore I think this is good to go

@staabm
Copy link
Contributor Author

staabm commented Aug 13, 2023

Probably outside the scope of this PR, but could other preg_* functions benefit from this PR too? Quite a few of them have false as the error condition.

did a bit more research. I think we can do the same for preg_match, preg_match_all, preg_split.

https://phpstan.org/r/a538ad90-be19-4c67-a5ed-fbc573d4f888

please just give me a hint, whether I should integrate it here, or whether we should do followups

@staabm staabm force-pushed the preg_grep branch 4 times, most recently from d4c52c7 to 9f42601 Compare August 15, 2023 08:02
@staabm
Copy link
Contributor Author

staabm commented Aug 15, 2023

I looked thru the failling extension tests and think the new errors are valid/need to be baselined

@ondrejmirtes
Copy link
Member

There are too many errors in the integration test suite, I kind of doubt all of them are fault of the analysed code.

I'm not a regexp expert. But I'd say the false return value that's not a fault of an invalid regex can be pretty common. So requiring the user code to handle the false too is fine.

@staabm
Copy link
Contributor Author

staabm commented Aug 15, 2023

@Seldaek since you built a preg_* wrapping lib for composer, may I ask whether you are aware of a case where the preg-functions can return false even though a valid pattern was passed in?

in other words: are you aware of a case where it makes sense to handle a false returned from e.g. preg_match in php userland when a valid pattern was passed in.


the only valid case I could think of was a out-of-memory error in PCRE, but a php-src dev told me that php-src would also likely run out of memory in this case, because they share the same allocator

@staabm
Copy link
Contributor Author

staabm commented Aug 15, 2023

hmm I think I found a case https://3v4l.org/0vAeY

@staabm staabm closed this Aug 15, 2023
@Seldaek
Copy link
Contributor

Seldaek commented Aug 15, 2023

Yeah it can also fail due to the backtracking limit (this is from a test we have in composer/pcre) https://3v4l.org/M2cZeV and that is purely dependent on the subject.

@staabm staabm deleted the preg_grep branch August 15, 2023 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants