bpo-7940: add support for negative end positions to re.finditer and re.findall #14744

Open
wants to merge 5 commits into
base: main

Conversation

anilbey
Contributor

@anilbey anilbey commented Jul 13, 2019

add support for negative end positions to re.finditer and re.findall

https://bugs.python.org/issue7940

@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

You can check yourself to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

Member

@isidentical isidentical left a comment

LGTM

Modules/_sre.c Outdated
@@ -427,11 +427,15 @@ state_init(SRE_STATE* state, PatternObject* pattern, PyObject* string,
     }

     /* adjust boundaries */
     if (start < 0)
         start += length;
     if (start < 0)
         start = 0;
Member

In re.compile('.').findall("abcd", -10, -3), the first if sets start to -6, which is not a valid index, so the second if is necessary to clamp it to 0. There are currently no tests for this scenario, so I would add a couple.
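
For reference, the boundary adjustment discussed in this review can be modelled in plain Python. This is only a sketch: `adjust_bounds` is a hypothetical name, and the handling of `end` (which the hunk above does not show) is assumed to mirror the handling of `start`.

```python
def adjust_bounds(start, end, length):
    """Normalize (start, end) against a string of the given length."""
    # A negative start counts from the end of the string ...
    if start < 0:
        start += length
    # ... and is clamped if it still falls before the beginning.
    if start < 0:
        start = 0
    elif start > length:
        start = length

    # Assumption: end is treated the same way as start.
    if end < 0:
        end += length
    if end < 0:
        end = 0
    elif end > length:
        end = length

    return start, end


# The reviewer's case: "abcd" has length 4, so -10 becomes -6 and is clamped to 0.
print(adjust_bounds(-10, -3, 4))  # (0, 1)
```

Without the second check on `start`, the intermediate value -6 would be passed on as an index, which is the case the requested tests should cover.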

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@rhettinger rhettinger self-assigned this Jul 13, 2019
@anilbey
Contributor Author

anilbey commented Jul 14, 2019

I have made the requested changes; please review again.

@bedevere-bot

Thanks for making the requested changes!

@ezio-melotti: please review the changes made to this pull request.

@rhettinger
Contributor

I've looked at this again and am still not inclined to accept it.

If someone else really believes this is a necessary API expansion and wants to move it forward, I won't object.

@csabella
Contributor

csabella commented Feb 4, 2020

@ezio-melotti should this be merged or should there be further discussion?

@anilbey anilbey force-pushed the bpo7940 branch 2 times, most recently from a749312 to a7b0605 Compare July 16, 2022 09:50
@anilbey
Contributor Author

anilbey commented Jul 16, 2022

Just rebased at another EuroPython sprint after 3 years :)

@anilbey
Contributor Author

anilbey commented Jul 16, 2022

Quick summary

  • the current implementation accepts negative indices but silently truncates them to 0, which I doubt is the expected behaviour
  • this change allows negative indices to be used the same way they work in other collections
  • it enables consistency with the regex module
  • however it introduces a behavioural change
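
Independently of this PR, slice-like bounds can be emulated in user code on any Python version by normalizing the indices before calling the pattern method. A minimal sketch, assuming a hypothetical helper named `findall_slice_like`; it relies only on `slice.indices` and on `Pattern.findall` with non-negative positions.

```python
import re


def findall_slice_like(pattern, string, pos=0, endpos=None):
    """Call pattern.findall with pos/endpos interpreted like slice bounds."""
    # slice.indices resolves negative and out-of-range values the same way
    # that string[pos:endpos] would.
    start, end, _ = slice(pos, endpos).indices(len(string))
    return pattern.findall(string, start, end)


dot = re.compile(".")
print(findall_slice_like(dot, "abcd", -3, -1))  # ['b', 'c'], like "abcd"[-3:-1]
```

Only non-negative values ever reach `findall`, so the helper does not depend on how the standard library treats negative positions.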

Member

@isidentical isidentical left a comment

Re-approving the overall PR, but we need a regex expert on this @ezio-melotti @serhiy-storchaka

@serhiy-storchaka
Member

This is a breaking change. A FutureWarning should be emitted for 2 releases prior to changing the behavior.

@anilbey
Contributor Author

anilbey commented Jul 18, 2022

> This is a breaking change. A FutureWarning should be emitted for 2 releases prior to changing the behavior.

Hi @serhiy-storchaka is that something to be done in the code of this PR? How can we emit a FutureWarning?
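
As a rough illustration of the deprecation pattern being requested, here is a Python-level sketch. It is not the code of this PR: `findall_with_warning` is a hypothetical wrapper and the message text is invented. In the C implementation the equivalent call would presumably be `PyErr_WarnEx` with `PyExc_FutureWarning`.

```python
import re
import warnings


def findall_with_warning(pattern, string, pos=0, endpos=None):
    """Hypothetical wrapper: warn about negative bounds, keep the old result."""
    if endpos is None:
        endpos = len(string)
    if pos < 0 or endpos < 0:
        warnings.warn(
            "negative pos/endpos will be interpreted relative to the end "
            "of the string in a future version",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
    # Clamp negatives to 0 explicitly, matching the truncation described
    # in the quick summary, so results do not change during the warning period.
    return pattern.findall(string, max(pos, 0), max(endpos, 0))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = findall_with_warning(re.compile("."), "abcd", 0, -1)

print(result, [w.category.__name__ for w in caught])
```

The point of the pattern is that callers see the warning for a couple of releases while results stay the same, and the new interpretation is switched on only afterwards.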

Member

@iritkatriel iritkatriel left a comment

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@anilbey anilbey force-pushed the bpo7940 branch 3 times, most recently from 73147f2 to b3dfb5a Compare March 15, 2023 10:00
@anilbey
Contributor Author

anilbey commented Mar 15, 2023

@serhiy-storchaka @iritkatriel I have made the requested changes; please review again. :)

@bedevere-bot

Thanks for making the requested changes!

@iritkatriel, @ezio-melotti, @isidentical: please review the changes made to this pull request.

@anilbey
Contributor Author

anilbey commented Mar 15, 2023

@isidentical since your last review I only added the FutureWarnings.

@anilbey
Contributor Author

anilbey commented Aug 16, 2023

The FutureWarning is emitted in #107105.

@dg-pb
Contributor

dg-pb commented Jun 27, 2024

Maybe it would be worth having a global macro for this?
There is one in unicodeobject.c (ADJUST_INDICES) and one in bytes_methods.c (ADJUST_INDICES), and the same adjustment is done here. The same macro could also be used when dealing with sequence indices.
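
The slice-like convention referred to here is observable from Python in the `start`/`end` arguments of `str` and `bytes` methods, assuming those are the arguments the mentioned macros serve. A small sketch; `same_as_slice` is an illustrative helper, not an existing API.

```python
def same_as_slice(seq, needle, start, end):
    """Check that seq.find(needle, start, end) agrees with searching seq[start:end]."""
    lo, hi, _ = slice(start, end).indices(len(seq))
    offset = seq[lo:hi].find(needle)
    expected = -1 if offset == -1 else lo + offset
    return seq.find(needle, start, end) == expected


print("abcd".find("c", -3, -1))                 # 2: the search covers "bc"
print(same_as_slice("abcd", "c", -3, -1))       # True
print(same_as_slice(b"abcd", b"c", -10, -1))    # True
```

The helper is only meant for non-empty needles; it shows that negative bounds on these methods behave like slice bounds, which is the behaviour this PR proposes for the re methods.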

@anilbey
Contributor Author

anilbey commented Jun 27, 2024

Hi @dg-pb,

Thank you for your reply. This PR has been open for about 5 years now. Are you a core developer, and would you be interested in seeing this change merged?

@dg-pb
Contributor

dg-pb commented Jun 27, 2024

> Hi @dg-pb,
>
> Thank you for your reply. This PR has been open for about 5 years now. Are you a core developer, and would you be interested in seeing this change merged?

I am not. Just decided to go through old PRs and decided to offer my thoughts for this one. Hopefully someone familiar with re can resolve whether to close or implement this.

@python python deleted a comment Apr 7, 2025