-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Optimize bytes-like (l|r)strip #4500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fn lstrip( | ||
zelf: PyRef<Self>, | ||
chars: OptionalOption<PyBytesInner>, | ||
vm: &VirtualMachine, | ||
) -> PyRef<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found bytes.lstrip has different issue to str.lstrip. zelf.inner.lstrip already returns Vec<u8>
, which means unnecessary copy happened.
I think this duplication check need to be done inside zelf.inner.lstip
. otherwise zelf.inner.lstip also could return &[u8]
instead of Vec<u8>
, but it may not be easy due to lifetime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. If the zelf.inner.lstrip
returns &[u8]
is it possible to turn &[u8]
to Vec<u8>
without copying the values or is there a implementation for turning &[u8]
directly to PyBytesInner
? Because otherwise zelf.inner.elements
is unnecessarily copied in the case of duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could wrapping the return value of zelf.inner.lstrip
in Option
be an option? In that case it would return None
on duplication and Some(stripped)
would be returned in other cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyBytesInner
is a wrapper around Vec<u8>
, so allocation is inevitable. Fortunately, converting a Vec<u8>
into a PyBytesInner
only takes ownership of that Vec<u8>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? I don't think we need the explicit conversion in these two places, as PartialEq<&[U]>
is implemented for Vec<T, A>
.
vm/src/builtins/bytes.rs
Outdated
vm: &VirtualMachine, | ||
) -> PyRef<Self> { | ||
let stripped = zelf.inner.lstrip(chars); | ||
if stripped == zelf.as_bytes().to_vec() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work?
if stripped == zelf.as_bytes().to_vec() { | |
if stripped == zelf.as_bytes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think &stripped == zelf.as_bytes()
works, but still stripped is Vec<u8>
.
vm: &VirtualMachine, | ||
) -> PyRef<Self> { | ||
let stripped = zelf.inner.rstrip(chars); | ||
if stripped == zelf.as_bytes().to_vec() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if stripped == zelf.as_bytes().to_vec() { | |
if stripped == zelf.as_bytes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, Thank you!
Hi!
Here is a solution proposal to #4497. I was not quite sure what the optimization part of #4496 was (I assumed it was the if-else part).