Skip to content

Optimize bytes-like (l|r)strip #4500

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 17, 2023
Merged

Conversation

dannasman
Copy link
Contributor

Hi!

Here is a solution proposal to #4497. I was not quite sure what the optimization part of #4496 was (I assumed it was the if-else part).

@DimitrisJim DimitrisJim linked an issue Feb 14, 2023 that may be closed by this pull request
Comment on lines +369 to +373
fn lstrip(
zelf: PyRef<Self>,
chars: OptionalOption<PyBytesInner>,
vm: &VirtualMachine,
) -> PyRef<Self> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found bytes.lstrip has different issue to str.lstrip. zelf.inner.lstrip already returns Vec<u8>, which means unnecessary copy happened.
I think this duplication check need to be done inside zelf.inner.lstip. otherwise zelf.inner.lstip also could return &[u8] instead of Vec<u8>, but it may not be easy due to lifetime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. If the zelf.inner.lstrip returns &[u8] is it possible to turn &[u8] to Vec<u8> without copying the values or is there a implementation for turning &[u8] directly to PyBytesInner? Because otherwise zelf.inner.elements is unnecessarily copied in the case of duplication.

Copy link
Contributor Author

@dannasman dannasman Feb 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could wrapping the return value of zelf.inner.lstrip in Option be an option? In that case it would return None on duplication and Some(stripped) would be returned in other cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyBytesInner is a wrapper around Vec<u8>, so allocation is inevitable. Fortunately, converting a Vec<u8> into a PyBytesInner only takes ownership of that Vec<u8>.

Copy link
Contributor

@fanninpm fanninpm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? I don't think we need the explicit conversion in these two places, as PartialEq<&[U]> is implemented for Vec<T, A>.

vm: &VirtualMachine,
) -> PyRef<Self> {
let stripped = zelf.inner.lstrip(chars);
if stripped == zelf.as_bytes().to_vec() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work?

Suggested change
if stripped == zelf.as_bytes().to_vec() {
if stripped == zelf.as_bytes() {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think &stripped == zelf.as_bytes() works, but still stripped is Vec<u8>.

vm: &VirtualMachine,
) -> PyRef<Self> {
let stripped = zelf.inner.rstrip(chars);
if stripped == zelf.as_bytes().to_vec() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if stripped == zelf.as_bytes().to_vec() {
if stripped == zelf.as_bytes() {

@dannasman dannasman requested review from youknowone and fanninpm and removed request for youknowone and fanninpm February 17, 2023 08:22
Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, Thank you!

@youknowone youknowone merged commit a0c34da into RustPython:main Feb 17, 2023
@dannasman dannasman deleted the bytes_lrstrip branch February 17, 2023 20:27
itsankitkp pushed a commit to itsankitkp/RustPython that referenced this pull request Feb 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize bytes-like (l|r)strip
3 participants