Implement Buffer Protocol #2226

qingshi163 · 2020-09-21T07:24:42Z

I added a slot to implement the buffer protocol, more or less like what CPython does. I am not sure it is the right way for RustPython. Also, I am confused: the slots do not seem to be inherited from the base class, so how do we control the behaviour of a subclass?

youknowone · 2020-09-21T11:38:24Z

Slots are a data field of the type (PyClass). For now, we typically iterate over the MRO to find the proper slot field.
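To make the MRO lookup concrete, here is a minimal sketch, not RustPython's actual code. `Class`, `Slots`, `BufferSlot`, and `find_buffer_slot` are hypothetical names invented for illustration; the only idea taken from the comment above is that a subclass without its own slot finds one by walking the MRO.

```rust
/// Hypothetical stand-in for a read-only slot: a plain function pointer.
pub type BufferSlot = fn() -> &'static str;

/// Hypothetical slot table stored on each class.
#[derive(Default)]
pub struct Slots {
    pub as_buffer: Option<BufferSlot>,
}

/// Hypothetical class object (not RustPython's PyClass).
pub struct Class {
    pub name: &'static str,
    pub slots: Slots,
}

/// Walk the MRO (the class itself first, then its bases) and return the
/// first `as_buffer` slot that is set. Function pointers are `Copy`, so the
/// slot can be returned by value.
pub fn find_buffer_slot(mro: &[&Class]) -> Option<BufferSlot> {
    mro.iter().find_map(|class| class.slots.as_buffer)
}

/// Example slot implementation for the base class.
pub fn bytes_buffer() -> &'static str {
    "buffer provided by bytes"
}
```

With a base class that sets `as_buffer` and a subclass that leaves it as `None`, `find_buffer_slot(&[&sub, &base])` returns the base's function, which is how a subclass would pick up the inherited behaviour in this sketch.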

coolreader18 · 2020-09-21T18:23:48Z

vm/src/obj/objmemory.rs

+use crate::{bytesinner::try_as_bytes, pyobject::IntoPyObject};
+use crossbeam_utils::atomic::AtomicCell;
+
+pub trait BufferProtocol: Debug + Sync + Send {


Instead of Send + Sync here, you probably want to use PyThreadingConstraint from pyobject

youknowone · 2020-09-24T13:45:11Z

related to #2125 and #2195, right?

qingshi163 · 2020-09-24T15:26:47Z

#2195: I have already done it; I will push it once I pass the memoryview unit tests.
#2125: I don't think it can be solved with the buffer; we have to do something inside objbytes.
@youknowone

youknowone · 2020-09-25T04:28:21Z

@qingshi163 I attached a commit, c0ecc35, that changes the Box to a plain fn. Because this is a read-only slot, unlike the other ones, it needs a different storage method. I missed this in my previous comments.

youknowone · 2020-09-25T22:48:20Z

vm/src/slots.rs

@@ -376,3 +381,14 @@ impl PyComparisonOp {
        }
    }
 }
+
+#[pyimpl]
+pub trait Bufferable: PyValue {


I think the original name you used, AsBuffer, was better. "Hashable object" is a term in Python (or CPython?), but "bufferable" is not.

Though "comparable object" is not a Python term either. @coolreader18 any idea about naming?

I think AsBuffer is good; "buffer" isn't a verb, so it doesn't really make sense as Verbable, and there's precedent for AsNoun as a trait name in Rust, e.g. AsRawFd, AsRef.

AsBuffer is good, but I changed the trait function name to get_buffer, the same as in CPython, also because it returns a new BufferProtocol. So should we go back to AsBuffer, or use something like GetBuffer? Which do we prefer?

I personally think we can keep it as AsBuffer

That sounds like the implementation trait name can be Buffer and the descriptor trait name can be BufferProtocol. Like,

AsBuffer(Bufferable) -> BufferProtocol

BufferProtocol -> Buffer

Also, I personally prefer to reuse CPython terms unless there is a special reason not to, because we practically read and port CPython code every day. So always +1 for following CPython naming.

That makes sense, since a trait is inherently a "protocol", so there's no need to specify that stuff that a buffer can do is a protocol; it's redundant

That sounds good!
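As a rough illustration of the split discussed in this thread, the sketch below pairs an implementation trait with a provider trait. It is an assumption-laden toy, not the code in this PR: the method set, the `Bytes` and `BytesBuffer` types, and the `exported_len` helper are all hypothetical, and the provider is called `AsBuffer` here only because that is one of the names considered above.

```rust
use std::fmt::Debug;

/// Implementation trait: what a consumer can do with an exported buffer.
/// (The method set here is hypothetical.)
pub trait Buffer: Debug {
    fn obj_bytes(&self) -> &[u8];
    fn is_readonly(&self) -> bool;
}

/// Provider trait: implemented by objects that can export a buffer.
/// `get_buffer` mirrors the CPython-style name mentioned in the thread.
pub trait AsBuffer {
    fn get_buffer(&self) -> Box<dyn Buffer + '_>;
}

/// Toy bytes-like object used only for this example.
#[derive(Debug)]
pub struct Bytes(pub Vec<u8>);

/// The buffer handed out by `Bytes`; it borrows the underlying data.
#[derive(Debug)]
pub struct BytesBuffer<'a> {
    data: &'a [u8],
}

impl Buffer for BytesBuffer<'_> {
    fn obj_bytes(&self) -> &[u8] {
        self.data
    }
    fn is_readonly(&self) -> bool {
        true
    }
}

impl AsBuffer for Bytes {
    fn get_buffer(&self) -> Box<dyn Buffer + '_> {
        Box::new(BytesBuffer { data: &self.0 })
    }
}

/// A consumer only needs the provider trait, not the concrete type.
pub fn exported_len(exporter: &dyn AsBuffer) -> usize {
    let buffer = exporter.get_buffer();
    let len = buffer.obj_bytes().len();
    len
}
```

The point of the naming is visible in the signatures: the object is asked for a buffer (`AsBuffer::get_buffer`), and everything a buffer can do lives on `Buffer`, so a separate "Protocol" suffix adds nothing.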

qingshi163 · 2020-09-26T08:22:06Z

vm/src/obj/objmemory.rs

+
+    fn as_contiguous(&self) -> Option<BorrowedValue<[u8]>> {
+        let options = self.get_options();
+        if !options.contiguous {
+            return None;
+        }
+        Some(self.obj_bytes())
+    }
+
+    fn as_contiguous_mut(&self) -> Option<BorrowedValueMut<[u8]>> {
+        let options = self.get_options();
+        if !options.contiguous {
+            return None;
+        }
+        Some(self.obj_bytes_mut())
+    }
+


The consumer should call as_contiguous if they just want a plain buffer. Should we also add a function here for consuming a non-contiguous buffer, such as a sliced memoryview? For example, an iter() that returns the unpacked objects?
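One possible shape for such an iterator is sketched below under simplifying assumptions: one dimension, one-byte items, and a positive step. `StridedView` and `StridedIter` are hypothetical names and are not part of this PR; the sketch only shows how a non-contiguous view can refuse `as_contiguous` while still being consumable item by item.

```rust
/// Hypothetical 1-D view over bytes with a positive step between items.
pub struct StridedView<'a> {
    bytes: &'a [u8],
    start: usize,
    step: usize,
    len: usize,
}

/// Iterator over the items selected by a `StridedView`.
pub struct StridedIter<'a> {
    bytes: &'a [u8],
    pos: usize,
    step: usize,
    remaining: usize,
}

impl<'a> StridedView<'a> {
    /// Returns `None` if the step is zero or the view would read out of bounds.
    pub fn new(bytes: &'a [u8], start: usize, step: usize, len: usize) -> Option<Self> {
        if step == 0 {
            return None;
        }
        // One past the last index that the view will touch.
        let end = match len {
            0 => start,
            n => start.checked_add((n - 1).checked_mul(step)?)?.checked_add(1)?,
        };
        if end > bytes.len() {
            return None;
        }
        Some(Self { bytes, start, step, len })
    }

    /// Only a step of 1 describes a plain contiguous slice.
    pub fn as_contiguous(&self) -> Option<&'a [u8]> {
        if self.step == 1 {
            Some(&self.bytes[self.start..self.start + self.len])
        } else {
            None
        }
    }

    /// Works for both contiguous and non-contiguous views.
    pub fn iter(&self) -> StridedIter<'a> {
        StridedIter { bytes: self.bytes, pos: self.start, step: self.step, remaining: self.len }
    }
}

impl<'a> Iterator for StridedIter<'a> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        if self.remaining == 0 {
            return None;
        }
        let item = self.bytes[self.pos];
        self.remaining -= 1;
        // Advance only while items remain, so `pos` never leaves the checked range.
        if self.remaining > 0 {
            self.pos += self.step;
        }
        Some(item)
    }
}
```

A view equivalent to `memoryview(b"abcdef")[::2]` would be `StridedView::new(b"abcdef", 0, 2, 3)`: `as_contiguous()` returns `None` for it, while `iter()` still visits every selected byte.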

qingshi163 · 2020-09-26T08:22:15Z

vm/src/obj/objmemory.rs

+pub trait BufferProtocol: Debug + PyThreadingConstraint {
+    // TODO: return reference to avoid copy
+    fn get_options(&self) -> BufferOptions;
+    fn obj_bytes(&self) -> BorrowedValue<[u8]>;
+    fn obj_bytes_mut(&self) -> BorrowedValueMut<[u8]>;
+    fn release(&self);
+    fn is_resizable(&self) -> bool;


I renamed as_bytes to obj_bytes because it should always return the full memory range of the original object. The name as_bytes may be easier to misuse.

qingshi163 · 2020-09-26T08:29:18Z

vm/src/stdlib/pystruct.rs

    impl FormatCode {
        fn unit_size(&self) -> usize {
+            // XXX: size of l L q Q is platform depended?
            match self.code {
                'x' | 'c' | 'b' | 'B' | '?' | 's' | 'p' => 1,
                'h' | 'H' => 2,
-                'i' | 'l' | 'I' | 'L' | 'f' => 4,
-                'q' | 'Q' | 'd' => 8,
+                'i' | 'I' | 'f' => 4,
+                'l' | 'L' | 'q' | 'Q' | 'd' => 8,
                'n' | 'N' | 'P' => std::mem::size_of::<usize>(),
                c => {


I had to change the unit size for 'l' and 'L' to 8, because that is what array.array does. I don't know whether this causes any compatibility problem, but on my machine CPython's 'l' and 'L' are 8 bytes.

Isn't it an architecture-dependent value?

    /* Integers */
    case 'h': intsize = sizeof(short); is_signed = 1; break;
    case 'H': intsize = sizeof(short); is_signed = 0; break;
    case 'i': intsize = sizeof(int); is_signed = 1; break;
    case 'I': intsize = sizeof(int); is_signed = 0; break;
    case 'l': intsize = sizeof(long); is_signed = 1; break;
    case 'L': intsize = sizeof(long); is_signed = 0; break;
    case 'q': intsize = sizeof(long long); is_signed = 1; break;
    case 'Q': intsize = sizeof(long long); is_signed = 0; break;

This is from CPython. Should we put it somewhere shared? We have to make sure array.array and struct use the same size for each type.

For common platforms, using std::mem::size_of::<isize> or cfg target_pointer_width works. It fits common platforms but is not perfect. Using libc will always fit, since that is how it is defined in CPython. Try std::mem::size_of::<libc::c_int>.

I think we should do it in another PR and try to unify the mapping from format code to size and type.
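A shared helper along the lines discussed in this thread could look roughly like the sketch below. It is an illustration under assumptions, not the follow-up PR: `native_unit_size` is a hypothetical name, it uses the C type aliases from `std::os::raw` in place of the `libc` crate so the example has no dependencies, and it covers native sizes only (the sizes CPython derives from `sizeof(long)` and friends).

```rust
use std::mem::size_of;
use std::os::raw::{c_int, c_long, c_longlong, c_short};

/// Hypothetical helper: native size in bytes of one item for a format code,
/// or `None` for a code this sketch does not know.
/// Signed and unsigned variants of a C integer type have the same size.
pub fn native_unit_size(code: char) -> Option<usize> {
    let size = match code {
        'x' | 'c' | 'b' | 'B' | '?' | 's' | 'p' => 1,
        'h' | 'H' => size_of::<c_short>(),
        'i' | 'I' => size_of::<c_int>(),
        // Platform dependent: 8 on typical 64-bit Linux and macOS, 4 on 64-bit Windows.
        'l' | 'L' => size_of::<c_long>(),
        'q' | 'Q' => size_of::<c_longlong>(),
        'f' => size_of::<f32>(),
        'd' => size_of::<f64>(),
        'n' | 'N' | 'P' => size_of::<usize>(),
        _ => return None,
    };
    Some(size)
}
```

If both array.array and struct called one function like this, the two modules could not disagree about 'l' and 'L', which is the consistency concern raised above.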

coolreader18 · 2020-09-26T17:57:36Z

common/src/cell.rs

+                drop(s.1);
+                read_rwlock(s.0)


Suggested change

-                drop(s.1);
-                read_rwlock(s.0)
+                s.1

coolreader18 · 2020-09-26T18:02:12Z

common/src/borrow.rs

+            Self::Ref(r) => BorrowedValue::Ref(f(r)),
+            Self::MuLock(m) => BorrowedValue::MappedMuLock(PyMutexGuard::map(m, |x| unsafe {
+                #[allow(mutable_transmutes, clippy::transmute_ptr_to_ptr)]
+                std::mem::transmute(f(x))


This isn't sound -- somebody could match to get the mutex guard out of the enum, and then treat it as mutable. I ran into the same problem, and I'm not sure exactly what the solution is. Maybe try something in cell? PyImmutableMappedMutexGuard or something?

I could try making that later today if you don't feel confident about it; I think it would be like { data: *const T, raw: &'a parking_lot::RawMutex, _marker: PhantomData<(&'a T, RawMutex::GuardMarker)> }

If you can fix that, it would be perfect; I have no clue how to make it sound.

@coolreader18 will you merge #2239 before or after this PR?
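The idea of a guard that can only ever hand out shared references can be sketched with the standard library alone. The version below is an assumption-heavy illustration: it wraps `std::sync::MutexGuard` instead of the parking_lot raw mutex mentioned above, and `ImmutableMappedGuard` is a hypothetical name, not the type that was eventually added.

```rust
use std::ops::Deref;
use std::sync::{Mutex, MutexGuard};

/// Hypothetical guard that keeps the mutex locked but exposes only `&U`.
/// It implements `Deref` and deliberately not `DerefMut`, so pulling it out
/// of an enum by pattern matching cannot yield mutable access.
pub struct ImmutableMappedGuard<'a, T, U: ?Sized> {
    data: *const U,
    _guard: MutexGuard<'a, T>,
}

impl<'a, T, U: ?Sized> ImmutableMappedGuard<'a, T, U> {
    /// Project the locked value to a part of it, keeping the lock held.
    pub fn map(guard: MutexGuard<'a, T>, f: impl FnOnce(&T) -> &U) -> Self {
        // The projected reference points into data owned by the mutex (or to
        // something that lives even longer), so it stays valid while the
        // guard is alive; it is stored as a raw pointer so the guard can be
        // moved into the struct alongside it.
        let data = f(&*guard) as *const U;
        Self { data, _guard: guard }
    }
}

impl<'a, T, U: ?Sized> Deref for ImmutableMappedGuard<'a, T, U> {
    type Target = U;

    fn deref(&self) -> &U {
        // SAFETY: `data` was derived from the locked value in `map`, and the
        // lock is held for as long as `self` exists.
        unsafe { &*self.data }
    }
}

/// Helper showing typical use: lock, then view only the tail of the vector.
pub fn lock_tail(m: &Mutex<Vec<u8>>) -> ImmutableMappedGuard<'_, Vec<u8>, [u8]> {
    ImmutableMappedGuard::map(m.lock().unwrap(), |v| &v[1..])
}
```

Because the only access path is `Deref`, no `transmute` from a shared to a mutable reference is needed, which is the property the review comment asks for.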

coolreader18 · 2020-10-01T15:56:40Z

Oooh, here's an idea: Buffer could require PyObjectPayload as a supertrait, and then get_buffer() could return a PyObjectRc<dyn Buffer>, and then the Buffer and the obj in the memoryview could be the same thing.

qingshi163 · 2020-10-01T18:44:29Z

But the obj and the buffer in the memoryview can be different: the obj should always point to the original data source, while the buffer is the controller that describes how we can access the data. That is how I implemented it: when a memoryview is built from another memoryview, the obj is cloned from the original object.

qingshi163 · 2020-10-01T18:47:41Z

I want to know if it is OK to merge; it is not fully complete, but it works.

coolreader18 · 2020-10-01T18:51:39Z

Oh, I suppose that's true. Nevermind

youknowone · 2020-10-01T21:06:08Z

Lib/test/test_memoryview.py

@@ -114,11 +112,13 @@ def setitem(key, value):
        self.assertRaises(TypeError, setitem, "a", b"a")
        # Not implemented: multidimensional slices
        slices = (slice(0,1,1), slice(0,1,2))
-        self.assertRaises(NotImplementedError, setitem, slices, b"a")
+        # TODO: RUSTPYTHON


I know this PR enabled a lot of parts of this test, but unless it is fully resolved, I prefer to keep the whole test as expectedFailure before merging. A commented-out test is only trackable in the code, whereas expectedFailure is trackable through the test results. Once the feature is fully done, it will be detected if the test is marked expectedFailure, but not if it is commented out.

youknowone · 2020-10-01T21:07:48Z

extra_tests/snippets/memoryview.py

@@ -6,7 +6,7 @@
 a = memoryview(obj)
 assert a.obj == obj

-assert a[2:3] == b"c"
+# assert a[2:3] == b"c"


is this intended?

youknowone · 2020-10-01T21:08:47Z

vm/src/bytesinner.rs

            l @ PyList => l.to_byte_inner(vm),
+            // TODO: PyTyple


qingshi163 · 2020-10-02T13:33:39Z

vm/src/bytesinner.rs

        op: PyComparisonOp,
        vm: &VirtualMachine,
    ) -> PyComparisonValue {
+        // TODO: bytes can compare with any object implemented buffer protocol
+        // but not memoryview, and not equal if compare with unicode str(PyStr)
        PyComparisonValue::from_option(
            try_bytes_like(vm, other, |other| {
                op.eval_ord(self.elements.as_slice().cmp(other))


Now we have different behaviour between (memoryview op bytes) and (bytes op memoryview). Is there any better solution than checking the type here?
@coolreader18

Does memoryview not implement the buffer protocol? I think that would maybe fix the issue(?)

But memoryview does implement the buffer protocol; we would have many more issues if it didn't.

Huh, that's strange, I think that's probably fine for now.

qingshi163 · 2020-10-02T13:41:14Z

vm/src/bytesinner.rs

+            // TODO: generic way from &[PyObjectRef]
            l @ PyList => l.to_byte_inner(vm),
+            t @ PyTuple => t.to_bytes_inner(vm),
            obj => {
                let iter = vm.get_method_or_type_error(obj.clone(), "__iter__", || {
                    format!("a bytes-like object is required, not {}", obj.class())


I think we could have a trait for the objects that can be borrowed as &[PyObjectRef], so we have a generic and efficient way to iterate.
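A minimal sketch of that trait idea follows. Everything in it is hypothetical: `Obj` stands in for PyObjectRef, `List` and `Tuple` are toy containers, and `AsObjectSlice` and `to_bytes` are invented names. It only shows how one generic function could replace the separate list and tuple arms.

```rust
/// Toy stand-in for PyObjectRef.
#[derive(Debug, Clone, PartialEq)]
pub enum Obj {
    Int(i64),
    Str(String),
}

/// Hypothetical trait for containers that can lend their items as a slice.
pub trait AsObjectSlice {
    fn as_object_slice(&self) -> &[Obj];
}

/// Toy list and tuple types with different internal storage.
pub struct List(pub Vec<Obj>);
pub struct Tuple(pub Box<[Obj]>);

impl AsObjectSlice for List {
    fn as_object_slice(&self) -> &[Obj] {
        &self.0
    }
}

impl AsObjectSlice for Tuple {
    fn as_object_slice(&self) -> &[Obj] {
        &self.0
    }
}

/// One generic conversion for every container that implements the trait.
/// Iterating the borrowed slice avoids going through `__iter__` per item.
pub fn to_bytes<S: AsObjectSlice + ?Sized>(seq: &S) -> Result<Vec<u8>, String> {
    seq.as_object_slice()
        .iter()
        .map(|obj| match obj {
            Obj::Int(i) if (0..=255).contains(i) => Ok(*i as u8),
            Obj::Int(_) => Err("bytes must be in range(0, 256)".to_string()),
            other => Err(format!("{:?} cannot be interpreted as an integer", other)),
        })
        .collect()
}
```

Other containers would fall back to the `__iter__` path shown in the diff; only types that can cheaply expose a slice need to implement the trait.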

qingshi163 · 2020-10-06T08:52:28Z

Where does that random CI failure come from?

coolreader18 · 2020-10-08T05:02:53Z

@qingshi163 I rebased this to master with the new mutex/BorrowedValue::map implementations; it put me as a committer for all the commits but hopefully you can just rebase it yourself and it would get rid of that.

qingshi163 · 2020-10-08T10:25:12Z

@coolreader18 Thanks, I think this needs a review before merging now.

coolreader18

I think this overall looks really good!! I'm fine with merging it and fixing leftover issues later, since this is so big already. @youknowone what do you think?

coolreader18 · 2020-10-08T23:39:34Z

And don't worry about that CI failure; I can fix that when I merge.

youknowone · 2020-10-10T06:43:20Z

Oops, after a few days of rest, I totally forgot to merge this first before other PRs. I agree to merge this and fix the other things later. Let me fix the conflicts.

youknowone · 2020-10-10T08:06:26Z

I fixed a few commit messages because the first two 'fix test' commits are not fixing tests but disabling them.
I also reordered the commits a bit to merge the "update memoryview eq" commits into a single commit, and likewise "Implement buffer protocol" and "Rename to AsBuffer". Their contents are unchanged except for conflict resolution.

coolreader18 mentioned this pull request Sep 21, 2020

Move BorrowValue to rustpython-common, add BorrowedValue enum #2228

Merged

coolreader18 reviewed Sep 21, 2020

qingshi163 force-pushed the buffer_protocol branch from 7a3ae26 to df78720 Compare September 24, 2020 19:23

youknowone reviewed Sep 25, 2020

qingshi163 force-pushed the buffer_protocol branch from c0ecc35 to e769804 Compare September 26, 2020 08:04

qingshi163 commented Sep 26, 2020

qingshi163 force-pushed the buffer_protocol branch from e769804 to 258ffcb Compare September 26, 2020 15:23

coolreader18 reviewed Sep 26, 2020

coolreader18 mentioned this pull request Sep 27, 2020

Unified lock types for rustpython_common #2239

Merged

qingshi163 force-pushed the buffer_protocol branch 2 times, most recently from 3403be7 to 9e5032f Compare October 1, 2020 09:39

qingshi163 marked this pull request as ready for review October 1, 2020 18:45

qingshi163 changed the title ~~[WIP] Implement Buffer Protocol~~ Implement Buffer Protocol Oct 1, 2020

youknowone reviewed Oct 1, 2020

youknowone reviewed Oct 2, 2020

vm/src/bytesinner.rs Outdated

l @ PyList => l.to_byte_inner(vm),

// TODO: PyTyple

youknowone Oct 1, 2020

PyTuple?

qingshi163 force-pushed the buffer_protocol branch 2 times, most recently from 6c72e69 to 9c3d97b Compare October 2, 2020 13:27

qingshi163 commented Oct 2, 2020

qingshi163 force-pushed the buffer_protocol branch from 9c3d97b to 604bd05 Compare October 6, 2020 07:48

qingshi163 force-pushed the buffer_protocol branch 2 times, most recently from dcca305 to 4e08961 Compare October 7, 2020 15:36

coolreader18 force-pushed the buffer_protocol branch from 4e08961 to 2ed49bc Compare October 8, 2020 00:52

coolreader18 force-pushed the buffer_protocol branch from 07959c2 to 896590c Compare October 8, 2020 17:18

coolreader18 approved these changes Oct 8, 2020

youknowone force-pushed the buffer_protocol branch from 896590c to 8a0b2d9 Compare October 10, 2020 07:39

qingshi163 added 11 commits October 10, 2020 16:57

Implement Buffer Protocol

499d997

disable few test stdlib_struct

2a4fbaa

disable few test in test_xdrlib

4489ba1

fix unittest test_array.test_buffer

e1ddbda

Fix more to pass unittest

cf35723

memoryview __setitem__

275b727

bytesinner from pytuple

3fe8a30

BytesIO readinto()

49b2c19

BytesIO getbuffer; no more hack for pickle.py

b48c5a2

memoryview hex

6058d65

update memoryview eq

88f5466

youknowone force-pushed the buffer_protocol branch from 8a0b2d9 to 88f5466 Compare October 10, 2020 08:04

youknowone merged commit 54cfdf2 into RustPython:master Oct 10, 2020

qingshi163 mentioned this pull request Nov 1, 2020

[RFC] Iterator for memoryview #2195

Closed

qingshi163 deleted the buffer_protocol branch December 7, 2020 06:20

youknowone mentioned this pull request Mar 7, 2023

Implement protocols (Abstract obejcts layer) #3244

Open
