-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Reimpl Buffer Protocol and memoryview support ndarray with shape, stride and suboffset #3340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
vm/src/stdlib/io.rs
Outdated
#[derive(Debug)] | ||
#[pyclass(noattr, module = false, name = "_buffered_raw_buffer")] | ||
#[derive(Debug, PyValue)] | ||
struct BufferedRawBuffer { | ||
data: PyMutex<Vec<u8>>, | ||
range: Range<usize>, | ||
} | ||
impl BufferInternal for BufferedRawBuffer { | ||
fn obj_bytes(&self) -> BorrowedValue<[u8]> { | ||
BorrowedValue::map(self.data.lock().into(), |data| &data[self.range.clone()]) | ||
} | ||
#[pyimpl(with(AsBuffer))] | ||
impl BufferedRawBuffer {} | ||
|
||
fn obj_bytes_mut(&self) -> BorrowedValueMut<[u8]> { | ||
BorrowedValueMut::map(self.data.lock().into(), |data| { | ||
&mut data[self.range.clone()] | ||
}) | ||
} | ||
static BUFFERED_RAW_BUFFER_METHODS: BufferMethods = BufferMethods { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to implement BufferedRawBuffer as an internal class so we can downcast from pyobjectref when the buffer methods is called. but I am getting panic for 'static type has not been initialized', anything missing here?
@youknowone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am sorry, I didn't catch I was mentioned here. You added pyclass
to BufferredRawBuffer
but with noattr
. I think a class is initialized only when it is a pyattr
. In this case, you can manually initailize it in make_modue
. Look for noattr
in vm/src/stdlib/sys.rs
. There are a few examples about it.
Probably we'd better to fix pymodule
macro to initialize them regardless of noattr
vm/src/protocol/buffer.rs
Outdated
// GUARANTEE: called only if the buffer option is contiguous | ||
pub contiguous: Option<fn(&PyObjectRef) -> BorrowedValue<[u8]>>, | ||
// GUARANTEE: called only if the buffer option is contiguous | ||
pub contiguous_mut: Option<fn(&PyObjectRef) -> BorrowedValueMut<[u8]>>, | ||
// collect bytes to buf when buffer option is not contiguous | ||
pub collect_bytes: Option<fn(&PyObjectRef, buf: &mut Vec<u8>)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three function should return the bytes that available in the current class. compare to obj_bytes and obj_bytes_mut that should return the whole bytes for the top class.
88b32ec
to
0041a81
Compare
vm/src/types/zoo.rs
Outdated
@@ -167,6 +168,7 @@ impl TypeZoo { | |||
none_type: singletons::PyNone::init_bare_type().clone(), | |||
not_implemented_type: singletons::PyNotImplemented::init_bare_type().clone(), | |||
generic_alias_type: genericalias::PyGenericAlias::init_bare_type().clone(), | |||
vec_buffer_type: VecBuffer::init_bare_type().clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had put the type here, it works, but I don't know does it has better solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't need to use it through vm.ctx.types.vec_buffer_type
, you don't need to add a member here. static types will be saved for each of their static_class.
(In other word, if you need to use it like that, this is the right place.)
How CPython handle it? Are those python classes there too? |
0041a81
to
aec309c
Compare
CPython is using void* with buffer options for the buffer data, we cannot do like that because we may have the lock before the data, so we have to visit data via function that base on each class. Secondly, CPython manage the exports by manually call PyBuffer_Release that will call the bf_releasebuffer slots, we are manage it more safer way by wrap it with PyBuffer. As we are using static functions for what buffer need, our appoarch shouldn't have a lot of overhead. For contiguous, contiguous_mut and collect_bytes, I think there is a way to rid out it by move some logic from memoryview to PyBuffer, I will work on it push to here. For VecBuffer, CPython don't have it because it use void pointer, we need type to retrive the data from the python object so I use VecBuffer to wrap a Vec to support as a buffer. |
For the polymorphism using |
The original design for void* case is to use dyn T(dyn BufferInternal).
so I decide to use static functions and downcast the obj to visit the class. the only issue for it is when we need buffer a rust vec, we need a wrap object that implement PyValue so we can able to downcast it. but I think that is only const cost when initialize and zero cost on the runtime. |
aec309c
to
111cd3b
Compare
vm/src/types/zoo.rs
Outdated
@@ -81,6 +81,7 @@ pub struct TypeZoo { | |||
pub none_type: PyTypeRef, | |||
pub not_implemented_type: PyTypeRef, | |||
pub generic_alias_type: PyTypeRef, | |||
pub vec_buffer_type: PyTypeRef, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@youknowone can you help me to move it away?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I attached a commit for this
vm/src/builtins/slice.rs
Outdated
} | ||
} | ||
|
||
pub struct SaturatedSliceIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust iterators are typically named Iter
(mostly because Iterator
is a trait name)
pub struct SaturatedSliceIterator { | |
pub struct SaturatedSliceIter { |
vm/src/builtins/slice.rs
Outdated
pub index: isize, | ||
pub step: isize, | ||
pub len: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fields of iterator, but is it pub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it should not be pub.
vm/src/protocol/buffer.rs
Outdated
self.internal.obj_bytes().to_vec() | ||
/// # Safety | ||
/// assume the buffer is contiguous | ||
pub unsafe fn contiguous(&self) -> BorrowedValue<[u8]> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are going to name this under rust convention,
pub unsafe fn contiguous(&self) -> BorrowedValue<[u8]> { | |
pub unsafe fn contiguous_unchecked(&self) -> BorrowedValue<[u8]> { |
vm/src/protocol/buffer.rs
Outdated
self.obj_bytes_mut() | ||
} | ||
|
||
pub fn collect(&self, buf: &mut Vec<u8>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
collect
typically returns the collection itself, but this is extending existing vector.
pub fn collect(&self, buf: &mut Vec<u8>) { | |
pub fn append_to(&self, buf: &mut Vec<u8>) { |
vm/src/protocol/buffer.rs
Outdated
|
||
/// this function do not check the bound | ||
/// panic if indices.len() != ndim | ||
pub fn get_position_fast(&self, indices: &[usize]) -> isize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the name get_
is not coming from CPython, I want to suggest to remove it because this function is not perform getter or getitem. Just fast_position
like fast_getitem
?
6c6007e
to
ffeb638
Compare
vm/src/builtins/memory.rs
Outdated
let format_spec = Self::parse_format(&buffer.desc.format, vm)?; | ||
let desc = buffer.desc.clone(); | ||
|
||
let mut zelf = PyMemoryView { | ||
buffer: ManuallyDrop::new(buffer), | ||
released: AtomicCell::new(false), | ||
start: range.start * itemsize, | ||
stop: range.end * itemsize, | ||
step: 1, | ||
start: 0, | ||
format_spec, | ||
desc, | ||
hash: OnceCell::new(), | ||
} | ||
.validate()) | ||
} | ||
|
||
fn as_contiguous(&self) -> Option<BorrowedValue<[u8]>> { | ||
if !self.buffer.options.contiguous { | ||
return None; | ||
} | ||
Some(self.obj_bytes()) | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these lines looks same as from_buffer
. then will let mut zelf = Self::from_buffer(buffer, vm)
works?
vm/src/builtins/memory.rs
Outdated
|
||
fn retain(&self) {} | ||
} | ||
fn collect(&self, buf: &mut Vec<u8>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is another collect not returning buffer
vm/src/protocol/buffer.rs
Outdated
@@ -123,17 +147,293 @@ impl TryFromBorrowedObject for PyBuffer { | |||
// but it is not supported by Rust |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment looks outdated
vm/src/protocol/buffer.rs
Outdated
|
||
#[pyimpl(flags(BASETYPE), with(Constructor))] | ||
impl VecBuffer { | ||
pub fn new(data: Vec<u8>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From is typically implemented for conversion
We should have the _test_buffer for multi-dimention buffer test. but I think this pr need to be merge now to fix the existing issue. |
There was a misunderstanding about the code for buffer protocol I wrote at beginning, so the related pr #3029 and #3043 finally broke the memoryview(the test did not show). When I review there pr I also just broken the mind, really sorry about that.
But in the pr I reimplement the protocol with more efficient way(thanks for #3121), this should works better.