replace array.__reduce__ to array.__reduce_ex__ #3876

maong0927 opened this issue Jul 13, 2022 · 5 comments
Labels
A-stdlib C-compat A discrepancy between RustPython and CPython z-ca-2022 Tag to track contrubution-academy 2022

Comments

@maong0927
Contributor

Feature

RustPython's array pickling uses __reduce__.
However, CPython's pickle uses __reduce_ex__ for arrays.

I think we should implement __reduce_ex__ for array and use it for pickling.

Execution Result

I could not find any related Python documentation, so I am attaching the execution result.
[Screenshot: Screen Shot 2022-07-14 at 12 24 47 AM]
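
For readers without the screenshot, a comparable check under CPython might look like the sketch below. It calls __reduce_ex__ on an array directly and then round-trips the array through pickle. The variable names are illustrative, and the exact shape of the returned tuples is a CPython implementation detail that may differ between versions.

```python
import array
import pickle

a = array.array('i', [1, 2, 3])

# Ask the array for its reduce value at an old and a newer protocol.
# In CPython each tuple starts with a callable used to rebuild the array.
reduced_old = a.__reduce_ex__(2)
reduced_new = a.__reduce_ex__(pickle.HIGHEST_PROTOCOL)
print(reduced_old)
print(reduced_new)

# Round-trip the array through pickle at every supported protocol.
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    restored = pickle.loads(pickle.dumps(a, protocol=protocol))
    assert restored == a and restored.typecode == a.typecode
```

Running the same script under RustPython and comparing the printed tuples is one way to spot the discrepancy described above.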

@maong0927 maong0927 changed the title Requires implementation of array reduce_ex Requires implementation of array reduce_ex Jul 13, 2022
@Yaminyam
Contributor

Yaminyam commented Jul 13, 2022

Although issue #3611 mentions the implementation of __reduce_ex__, its actual tasks cover only the implementation of __reduce__.
However, when pickle dumps an object, __reduce_ex__ is what gets called, as this issue shows, so implementing __reduce_ex__ seems to be needed in many places.
In particular, there are many types, such as array and list, that are pickled through __reduce_ex__ alone, without their own implementation of __reduce__.
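
The lookup order described here can be observed with a small probe. This is a minimal sketch, assuming CPython's pickle; Probe and calls are hypothetical names, and the reduce value deliberately rebuilds a plain tuple so that the example does not depend on the class being importable at unpickling time.

```python
import pickle

calls = []


class Probe:
    """Hypothetical class that records which reduce hook pickle invokes."""

    def __reduce__(self):
        calls.append(("__reduce__", None))
        return (tuple, ((1, 2),))

    def __reduce_ex__(self, protocol):
        calls.append(("__reduce_ex__", protocol))
        return (tuple, ((1, 2),))


# Pickle a Probe instance; the reduce value reconstructs tuple((1, 2)).
restored = pickle.loads(pickle.dumps(Probe(), protocol=4))
print(calls)
print(restored)
```

When both hooks are defined, only __reduce_ex__ is recorded, and it receives the protocol number that was passed to pickle.dumps.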

@youknowone youknowone added C-compat A discrepancy between RustPython and CPython A-stdlib z-ca-2022 Tag to track contrubution-academy 2022 labels Jul 13, 2022
@youknowone youknowone changed the title Requires implementation of array reduce_ex replace array.__reduce__ to array.__reduce_ex__ Jul 13, 2022
@fanninpm
Contributor

This is likely why @Snowapril mentioned arrayiterator.__reduce__ in #3611. It's important to note that arrayiterator is distinct from array: arrayiterator is the type you get when you do this [1]:

>>> import array
>>> a = array.array('i', [1, 2, 3])
>>> type(iter(a))
<class 'arrayiterator'>
What methods does arrayiterator implement?
>>> help(iter(a))
Help on arrayiterator object:

class arrayiterator(object)
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(self, /)
 |      Return state information for pickling.
 |  
 |  __setstate__(self, state, /)
 |      Set state information for unpickling.
By contrast, what methods does array implement?
>>> help(a)
Help on array object:

class array(builtins.object)
 |  array(typecode [, initializer]) -> array
 |  
 |  Return a new array whose items are restricted by typecode, and
 |  initialized from the optional initializer value, which must be a list,
 |  string or iterable over elements of the appropriate type.
 |  
 |  Arrays represent basic values and behave very much like lists, except
 |  the type of objects stored in them is constrained. The type is specified
 |  at object creation time by using a type code, which is a single character.
 |  The following type codes are defined:
 |  
 |      Type code   C Type             Minimum size in bytes
 |      'b'         signed integer     1
 |      'B'         unsigned integer   1
 |      'u'         Unicode character  2 (see note)
 |      'h'         signed integer     2
 |      'H'         unsigned integer   2
 |      'i'         signed integer     2
 |      'I'         unsigned integer   2
 |      'l'         signed integer     4
 |      'L'         unsigned integer   4
 |      'q'         signed integer     8 (see note)
 |      'Q'         unsigned integer   8 (see note)
 |      'f'         floating point     4
 |      'd'         floating point     8
 |  
 |  NOTE: The 'u' typecode corresponds to Python's unicode character. On
 |  narrow builds this is 2-bytes on wide builds this is 4-bytes.
 |  
 |  NOTE: The 'q' and 'Q' type codes are only available if the platform
 |  C compiler used to build Python supports 'long long', or, on Windows,
 |  '__int64'.
 |  
 |  Methods:
 |  
 |  append() -- append a new item to the end of the array
 |  buffer_info() -- return information giving the current memory info
 |  byteswap() -- byteswap all the items of the array
 |  count() -- return number of occurrences of an object
 |  extend() -- extend array by appending multiple elements from an iterable
 |  fromfile() -- read items from a file object
 |  fromlist() -- append items from the list
 |  frombytes() -- append items from the string
 |  index() -- return index of first occurrence of an object
 |  insert() -- insert a new item into the array at a provided position
 |  pop() -- remove and return item (default last)
 |  remove() -- remove first occurrence of an object
 |  reverse() -- reverse the order of the items in the array
 |  tofile() -- write all items to a file object
 |  tolist() -- return the array converted to an ordinary list
 |  tobytes() -- return the array converted to a string
 |  
 |  Attributes:
 |  
 |  typecode -- the typecode character used to create the array
 |  itemsize -- the length in bytes of one array item
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __copy__(self, /)
 |      Return a copy of the array.
 |  
 |  __deepcopy__(self, unused, /)
 |      Return a copy of the array.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __le__(self, value, /)
 |      Return self<=value.
 |  
 |  __len__(self, /)
 |      Return len(self).
 |  
 |  __lt__(self, value, /)
 |      Return self<value.
 |  
 |  __mul__(self, value, /)
 |      Return self*value.
 |  
 |  __ne__(self, value, /)
 |      Return self!=value.
 |  
 |  __reduce_ex__(self, value, /)
 |      Return state information for pickling.
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  __rmul__(self, value, /)
 |      Return value*self.
 |  
 |  __setitem__(self, key, value, /)
 |      Set self[key] to value.
 |  
 |  __sizeof__(self, /)
 |      Size of the array in memory, in bytes.
 |  
 |  append(self, v, /)
 |      Append new value v to the end of the array.
 |  
 |  buffer_info(self, /)
 |      Return a tuple (address, length) giving the current memory address and the length in items of the buffer used to hold array's contents.
 |      
 |      The length should be multiplied by the itemsize attribute to calculate
 |      the buffer length in bytes.
 |  
 |  byteswap(self, /)
 |      Byteswap all items of the array.
 |      
 |      If the items in the array are not 1, 2, 4, or 8 bytes in size, RuntimeError is
 |      raised.
 |  
 |  count(self, v, /)
 |      Return number of occurrences of v in the array.
 |  
 |  extend(self, bb, /)
 |      Append items to the end of the array.
 |  
 |  frombytes(self, buffer, /)
 |      Appends items from the string, interpreting it as an array of machine values, as if it had been read from a file using the fromfile() method).
 |  
 |  fromfile(self, f, n, /)
 |      Read n objects from the file object f and append them to the end of the array.
 |  
 |  fromlist(self, list, /)
 |      Append items to array from list.
 |  
 |  fromunicode(self, ustr, /)
 |      Extends this array with data from the unicode string ustr.
 |      
 |      The array must be a unicode type array; otherwise a ValueError is raised.
 |      Use array.frombytes(ustr.encode(...)) to append Unicode data to an array of
 |      some other type.
 |  
 |  index(self, v, /)
 |      Return index of first occurrence of v in the array.
 |  
 |  insert(self, i, v, /)
 |      Insert a new item v into the array before position i.
 |  
 |  pop(self, i=-1, /)
 |      Return the i-th element and delete it from the array.
 |      
 |      i defaults to -1.
 |  
 |  remove(self, v, /)
 |      Remove the first occurrence of v in the array.
 |  
 |  reverse(self, /)
 |      Reverse the order of the items in the array.
 |  
 |  tobytes(self, /)
 |      Convert the array to an array of machine values and return the bytes representation.
 |  
 |  tofile(self, f, /)
 |      Write all items (as machine values) to the file object f.
 |  
 |  tolist(self, /)
 |      Convert array to an ordinary list with the same items.
 |  
 |  tounicode(self, /)
 |      Extends this array with data from the unicode string ustr.
 |      
 |      Convert the array to a unicode string.  The array must be a unicode type array;
 |      otherwise a ValueError is raised.  Use array.tobytes().decode() to obtain a
 |      unicode string from an array of some other type.
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  itemsize
 |      the size, in bytes, of one array item
 |  
 |  typecode
 |      the typecode character used to create the array
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __hash__ = None

Does this help?
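
The difference between the two help listings can also be checked programmatically. The sketch below assumes CPython; own_reduce_hooks is a hypothetical helper that reports which reduce hooks a type defines directly rather than inheriting from object.

```python
import array
import pickle

a = array.array('i', [1, 2, 3])
it = iter(a)


def own_reduce_hooks(cls):
    """Return the reduce hooks defined directly on cls (not inherited)."""
    hooks = ("__reduce__", "__reduce_ex__")
    return sorted(name for name in hooks if name in vars(cls))


print(type(a).__name__, own_reduce_hooks(type(a)))
print(type(it).__name__, own_reduce_hooks(type(it)))

# The iterator is picklable as well and resumes where it left off.
next(it)
restored = pickle.loads(pickle.dumps(it))
remaining = list(restored)
print(remaining)
```

Under CPython, array reports only __reduce_ex__ and arrayiterator reports only __reduce__, which matches the help output above.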

Footnotes

  1. The following examples were run with CPython 3.9, but CPython 3.10 should yield similar results.

@Snowapril
Contributor

Snowapril commented Jul 15, 2022

@Yaminyam We need to separate definition of __reduce__ and __reduce_ex__
https://docs.python.org/3/library/pickle.html?highlight=reduce_ex#object.__reduce__
https://docs.python.org/3/library/pickle.html?highlight=reduce_ex#object.__reduce_ex__

__reduce_ex__ is the same as __reduce__ except that it takes a single argument (the protocol version).

The main use for this method is to provide backwards-compatible reduce values for older Python releases.

Thus, if we don't need the protocol version when implementing __reduce__, we don't have to define __reduce_ex__ on top of __reduce__.

It does seem that we don't need to define __reduce__, since array.array already has a correct definition of __reduce_ex__. Simply removing __reduce__ from array.array should be sufficient, because the Python pickling code prefers __reduce_ex__ over __reduce__ for backward compatibility.
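
The relationship between the two hooks can be sketched as follows, assuming CPython's default object.__reduce_ex__. OnlyReduce is a hypothetical class that defines __reduce__ alone; the inherited __reduce_ex__ delegates to it, which is why a type that ignores the protocol version needs only one of the two.

```python
import pickle


class OnlyReduce:
    """Hypothetical class that defines __reduce__ but not __reduce_ex__."""

    def __init__(self, items):
        self.items = list(items)

    def __reduce__(self):
        # Rebuild as a plain list to keep the example self-contained.
        return (list, (self.items,))


obj = OnlyReduce([1, 2])

# The inherited object.__reduce_ex__ falls back to the overridden __reduce__.
via_ex = obj.__reduce_ex__(4)
print(via_ex)

# pickle reaches the same result through __reduce_ex__.
restored = pickle.loads(pickle.dumps(obj))
print(restored)
```

Conversely, a type such as array.array whose reduce value depends on the protocol needs __reduce_ex__, and a separate __reduce__ on the same type is then redundant for pickling.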

@fanninpm
Contributor

fanninpm commented Jul 15, 2022

@qingshi163 implemented __reduce__ for array.array in #3064. I'm a little surprised that this discrepancy wasn't caught at the time.

@maong0927 Thanks for bringing this to our attention. As @Snowapril points out, the fix seems simple.

@youknowone
Member

Well, working is better than broken. I think we are heading in the right direction, including #3064.
