-
Notifications
You must be signed in to change notification settings - Fork 748
1838 Use Python buffer protocol when converting Python objects to .NET arrays #2372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,7 @@ | |
using System.Collections; | ||
using System.Collections.Generic; | ||
using System.Diagnostics; | ||
using System.Linq; | ||
using System.Reflection; | ||
using System.Runtime.InteropServices; | ||
using System.Security; | ||
|
@@ -880,12 +881,49 @@ private static void SetConversionError(BorrowedReference value, Type target) | |
/// Convert a Python value to a correctly typed managed array instance. | ||
/// The Python value must support the Python iterator protocol or and the | ||
/// items in the sequence must be convertible to the target array type. | ||
/// If the Python value supports the buffer protocol the underlying buffer | ||
/// will be copied directly. | ||
/// </summary> | ||
private static bool ToArray(BorrowedReference value, Type obType, out object? result, bool setError) | ||
{ | ||
Type elementType = obType.GetElementType(); | ||
result = null; | ||
|
||
// structs can also be blittable but for the standard usecases this suffices | ||
var isBlittable = elementType.IsPrimitive && elementType != typeof(char) && elementType != typeof(bool); | ||
if (isBlittable && Runtime.PyObject_GetBuffer(value, out var view, (int)PyBUF.CONTIG_RO) == 0) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure it is safe to assume getting buffer has the same semantics as using iterator for an arbitrary type that supports GetBuffer. Imho, this could be made the default behavior for converting a fixed set of standard types, but for the others it might be better to expose some kind of explicit call instead of conversion. E.g. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't this caught by the primitive blitable check? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could move the whole thing to after the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This issue is not about element types, but about container types. Does Python guarantee, that GetBuffer()[0], GetBuffer()[1], etc is the same sequence as .iter[0], .iter[1], etc for every type that implements GetBuffer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought that was the point of the buffer protocol :). I'll read up more on it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @chickenservice the concern is about breaking changes, especially within Python.NET 3.x, more especially within Python.NET 3.0.x. If there's a difference in result, we should delay the change of behavior to 4.x, and in 3.x only add new functions. IMHO, we can only immediately apply the speedups for the types that we know have the same behavior. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lostmsu that makes sense. The tests cover bytearray, array.array and ndarray. From my experience these behave according to what you described and are the canonical implementations anyway. Any concerns? Are there other types you have in mind? Should we limit ndim == 1 for now? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the tests do not cover multidim or the behavior in multidim is different, we should limit ndim to 1. But again, I strongly suggest to add an explicit new method to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I misunderstood your suggestion, to clarify: For upcoming minor/patch release expose this as a dedicated method and remove it from For the next major release reevaluate whether this can/should be done implicitly in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @chickenservice yep, that's exactly right. |
||
{ | ||
GCHandle ptr; | ||
try | ||
{ | ||
var shape = new IntPtr[view.ndim]; | ||
Marshal.Copy(view.shape, shape, 0, view.ndim); | ||
Array arr = Array.CreateInstance(elementType, shape.Select(x => (long)x).ToArray()); | ||
ptr = GCHandle.Alloc(arr, GCHandleType.Pinned); | ||
var addr = ptr.AddrOfPinnedObject(); | ||
unsafe | ||
{ | ||
Buffer.MemoryCopy((void*)view.buf, (void*)addr, arr.Length * Marshal.SizeOf(elementType), view.len); | ||
} | ||
|
||
result = arr; | ||
return true; | ||
} | ||
finally | ||
{ | ||
if (ptr.IsAllocated) | ||
{ | ||
ptr.Free(); | ||
} | ||
|
||
Runtime.PyBuffer_Release(ref view); | ||
} | ||
} | ||
else | ||
{ | ||
Exceptions.Clear(); | ||
} | ||
|
||
using var IterObject = Runtime.PyObject_GetIter(value); | ||
if (IterObject.IsNull()) | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both directions need a check that source and target element types match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be ok to match the typecode of the .NET type to the corresponding format string of the buffer, i.e. roughly like so:
This would require #2371 to be fixed so I can actually retrieve the format field.