Skip to content

rewrite pybytearray with pybyteinner #916

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 6, 2019

Conversation

jgirardet
Copy link
Contributor

This is a rewrite of bytearray with pybyteinner.
Everything is the same as in #847 , except :

  • some refactoring for eq, gt, ge, le, lt and add to DRY
  • adapt clear, pop and append (which where already here)

Actual issues :

  • I'm struggling with get_value and get_value_mut. For now i'm just using a public inner and call directly borrow or borrow_mut (see changes in io.rs). I need some help on it.
  • the return type of append...

@codecov-io
Copy link

Codecov Report

Merging #916 into master will increase coverage by 0.2%.
The diff coverage is 78.64%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #916     +/-   ##
========================================
+ Coverage   64.19%   64.4%   +0.2%     
========================================
  Files          89      89             
  Lines       15229   15246     +17     
  Branches     3426    3434      +8     
========================================
+ Hits         9777    9819     +42     
+ Misses       3180    3154     -26     
- Partials     2272    2273      +1
Impacted Files Coverage Δ
vm/src/obj/objbytes.rs 77.41% <75%> (+3.94%) ⬆️
vm/src/stdlib/io.rs 51.91% <75%> (ø) ⬆️
vm/src/obj/objbytearray.rs 78.21% <78.06%> (+9.32%) ⬆️
vm/src/obj/objbyteinner.rs 69.95% <84%> (+1.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e578dcf...5864c6e. Read the comment docs.

@windelbouwman
Copy link
Contributor

windelbouwman commented May 4, 2019

I had another idea about this. We face several types:

  • str
  • bytes
  • bytearray

.. with a large set of similar methods:

  • toupper
  • swapcase

.. and many more methods.

I was thinking about some sort of a trait on the inner object. Would it be possible to de-duplicate the logic between all three python types by putting the logic in a trait like this:

pub struct ByteInner {}
pub struct StrInner {}

pub trait StrMethods {
  // This must be specifically implemented for ByteInner and StrInner:
  fn Vec<char> get_chars();
  fn set_chars(chars: Vec<char>);

  // These are generic methods:
  fn to_upper() {
    let chars = self.get_chars();
   // perform magic to upper here :)
    self.set_chars(modified_chars);
  }
  // All other similar methods like swapcase and friends go here.
}

impl StrMethods for ByteInner {
  // implement proper get_chars and set_chars
}

impl StrMethods for StrInner {
  // implement proper get_chars and set_chars
}

Would this be possible?

@adrian17
Copy link
Contributor

adrian17 commented May 4, 2019

While most bytes and bytearray methods can be merged, most bytes and str methods can't.

>>> 'ą'.upper()
'Ą'
>>> 'ą'.encode() == 'ą'.encode().upper()
True

@jgirardet
Copy link
Contributor Author

i'm not against the idea of the trait. but it adds much complexity due to many edge cases (like previous example).
But what would you think of some generic function for example like the split and split_reverse which would accept [u8] or str ?

@windelbouwman
Copy link
Contributor

I'm for generic functions which can take [u8] or str arguments. We could use this for a whole series of string functions. Maybe we could do some templating with parameters like u8 and char.

@windelbouwman
Copy link
Contributor

Something like this:

fn ljust<T>(src: &[T], width: i32, fill_char: T) -> [T] {
   // Implement ljust here...
}

@jgirardet
Copy link
Contributor Author

jgirardet commented May 5, 2019

I quickly looked into it. Even it's intuitive, it's definitely not as easy/smart that it seems.
Even they seems very similar str and [u8] have many differences. Here as the mains issues :

  • Most of the time, you want merge slice or str and you want a different return type (String or Vec).
  • due to this you won't merge by the same way : collect::<Vec>() for slice but clearly you prefer use format! for String.
  • In function that increment an initial empty result, you will use Vec.extend_from_slice where you will use push_str for String.
  • some methods already exists for str (ex : find, trim, ends_wih, matches) and are well implemented so would we switch for other one ?
  • edge cases ascii / unicode
  • you iterate strings with chars() but iter() for vec.
  • repeat is nightly for slice but stable for str

for all of this, I'm not sure we should try to do it. Even trying to reduce duplicate code, it (IMHO) will add to much complexity.

@windelbouwman windelbouwman merged commit 567fc4e into RustPython:master May 6, 2019
@windelbouwman
Copy link
Contributor

I'm for readability at the expense of duplication. Maybe we will figure a way in the future to do this.

@talhof8 talhof8 mentioned this pull request May 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants