-
Notifications
You must be signed in to change notification settings - Fork 748
failure to convert numpy.float64 into System.Double #1936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
By design. NumPy array is not a C# array and vice versa. |
But this worked in 2.5. So what is the "correct" way of doing this now then? |
One fast method is listed here: #514 (comment) A faster and perhaps more convenient way would be to write a .NET function using |
Thanks! |
I'm going to push back on this. So I can't call: I have to call: I get the desire to make the conversion explicit rather than implicit. But blocking conversion of the elements is not the way to do that. You should also note that you can pass Python lists as a double[] parameter, and it works fine. If you want to make conversion explicit, then why not lists too? Your claim that this is by design is therefore suspect. Also, your idea of making a codec does not work. There does not appear to be a way to make a codec to convert to a primitive type. The decoder is not getting called, because Converter.DecodableByUser() is returning false for primitive types. I guess that's a performance choice? The simple change I suggested will make it so np.float64 will get converted to C# double. Presumably, a similar change can be made for np.int32. |
Please, don't act like we have never given this any thought. There is #1908 that I still want to get in (in some form) for 3.0 which tries to make things a bit better. I have to see whether I can clear up @lostmsu's concerns, it should definitely still be possible to hijack the conversion (though I think we do this part, checking against codecs, as the first step). |
@rzindler as I said before, make a codec. A codec would allow you to pass |
I did try that. As I said above, codecs are not being called when converting TO a primitive type, because Converter.DecodableByUser() returns false. This is the current code:
@filmor sorry if I gave the impression, that I thought you hadn't thought about stuff. |
@rzindler you are right, we did explicitly forbid codecs for the primitive types, and performance is indeed the reason. The fact, that lists are still converted is quite dubious too, but one questionable design decision should not be used as an excuse to add more. |
In fact, any iteratable object will be implicitly converted to a System.Array. numpy arrays are the only ones that fail, and only because their elements are not convertible. Not only are all iteratables implicitly converted, This bit of code below happens before the codecs are called, so you cannot make a codec for array conversions. Meaning no codecs are possible for either numpy arrays, or their elements.
If you really want to prevent all implicit conversion to arrays, all you have to do is delete this code (and all other references to ToArray(...)). Also, if you deleted it, then we could make a codec, which we can't right now. Perhaps you'd want to make a public version of ToArray() that is accessible from Python. This would enable explicit conversion, which AFAIK doesn't exist right now (except in extra Python code). Of course, this would fail for numpy arrays, unless the elements are convertible. So in any case, it is definitely desirable to have implicit conversions of the numpy element types. My original suggestion works for np.float64, but maybe you want to use the .item() member that all numpy element types have to get at their core value. Right now, even if I take a PyObject as a parameter for a numpy array, I'd have to bypass your entire conversion infrastructure and convert it manually. All calls to ToManaged(...) will fail, and there's no place to jump in without rewriting it. |
One more comment: |
@rzindler you're overreacting. It is mentioned in the release notes |
Firstly, I appreciate you guys taking the time to answer here. However, I am a bit frustrated that you seem to not be actually reading what I wrote when you reply. Our use case is we are converting chunks of a very large Python code base to C#, mostly for performance gains. There's a common situation where a callback function is given to a Python function. These callback functions take numpy arrays as parameters, and can be either Python or C# functions. Therefore, it's not practical to use explicit conversion on the Python side, because we don't know if the function is C# or Python. As the (3.0) code is now, codecs are not getting called, so there's basically no way to get this code working without modifying your code base. You have chosen to move in a direction that requires type conversions to be explicit. While this is in line with modern computer science thought, I'm going to push you to reconsider this direction for your situation. Especially the support for converting numpy arrays. I think you're making a big mistake.
Anyways, at this point, if you won't add implicit conversion of numpy element types, or at least support a way to add codecs for primitive types, I am either going to make my own bastardized version of your code (not a good option), or abandon its use. I have been pushing my team to use Python.NET. Please don't make me regret that. Thanks for your time. |
Come on. I already gave you a clear hint that this tone is not appreciated here. Leave the attitude out and let's stick to the technical discussion (where you are making valid points, it's just not an easy topic and lostmsu has valid points for the "other" side).
To improve performance of this approach /massively/ (as long as there is no implementation for #1838, though there is already a
This should be fixable as well, I'd consider the
We didn't choose this direction for some vague CS-academia reason. The problem with implicit conversions is that it gets /very/ difficult to adjust them, like the case that is mentioned here: #1908 (comment). Also mentioned there is the fact that we don't have a proper overload resolution. Overload resolution currently works by iterating through all signatures, checking if all given parameters can be converted one way or another, and then using the first one where that works. Adding a scoring functionality could help, but the code for this is already quite complicated and brittle, simply because Python does not have any notion of overloading.
I am not sure how to answer this. We are not providing you a product that we advertise for. We are simply trying to maintain a 15 year old library in a manner that allows us to go further (note that codecs are a relatively new feature, before all conversions had to be explicitly added directly in the codebase, which led to massive grievances e.g. with regard to automated conversions between lists and arrays) while keeping breakage to a minimum. Version 2.4 wouldn't have worked for you either.
Again we are just maintaining, we are not trying to sell anything to you. You are more than welcome to help out improving the situation. Your suggestion in the initial issue wouldn't work for the case @lostmsu lined out in the given comment. If you (or we together) can find a good way to fix this situation, we will be happy to merge it, and I consider a solution to this a blocker for 3.0 (which has been dragging on for a while, though, so I'd like to get this fixed soon).
Then read our answers as well. A PR for
We don't choose anything here. We don't want to introduce special-casing for Numpy types as that introduces a (weak) dependency on Numpy and we had quite bad experiences with depending on non-builtin types in the past. It would also be very confusing to have this work for Numpy and not for other "floaty" types. Either we fix it for everything with
I don't think this particular case works for
Well, if there are impedance mismatches, it's less seamless. That's how it works for all "inter-language" libraries, and we don't have a magical recipe for that either. Our current direction is that (by default, configuration should be done through codecs) we only want to provide mostly exact matches, by allowing both sides to use facade types. I'll have to read through our code again to see why this is applied "half-way" here, my expectation would be that this doesn't work without a codec. @lostmsu Do we have default codecs set up right now?
This doesn't add anything to the discussion at hand. Python's type system is strong in the sense that an object will never act as if it was of a different type depending on the context. Again, if that hasn't been clear before: I agree with you that this should be solvable via codecs. I disagree that some vague notion of what the library is supposed to be for should dictate the defaults. There is no way how we'd be able to emulate full duck-typing in C#.
I'm not commenting further on these two points, but I have read them. |
At the very least we can allow codec calls before As for having NumPy arrays autoconvert to C# arrays it has the same issue as the list to array conversion: you can mutate the object in C# but changes will not be reflected in Python. Generally, I recommend to wrap NumPy types on the C# side (which is what I do and SciSharp does). |
OK, thanks guys for the thoughtful replies. Written word does not accurately convey tone. I'm really not trying to give you snark. I appreciate your time. I was mostly frustrated when I said codecs weren't working and the reply was that I should use codecs. Now moving on in the spirit of collaboration... Your replies definitely paint a clearer picture to me. In particular the ambiguous overloading is not something I thought of. As an offhand suggestion, I'd say that rather than dealing with conversion priorities of the codecs, it should just fail if there is more than one possible overload (and none are exact matches of the types). This is what happens in compilers with ambiguous calls to overloaded functions. It is the perfect time to require explicit conversions. This would simplify and clean up a lot. Your priority and grouping scheme could all go away. Anyways, I'm really mostly interested in implicit conversion of the numpy primitive types (not just float). As for how I might do it, with the requirement of not depending on numpy being installed, and keeping performance fast. Here are some suggestions:
This would add no performance overhead for values that don't need codecs.
Maybe there's a better way, but this would work, would not really slow anything down, and I think would meet your criteria as well. |
I think allowing codecs to run on primitives after Why can't you make wrappers for NumPy types with custom codecs? Or maybe even just use SciSharp's NumPy wrapper and go from it? |
I really think you should throw an exception if there's ambiguity between overloads. Then the whole problem goes away. Anyone making both Also, I was thinking that there could be a separate list of codecs just for primitives. That would speed things up a bit, since there'd be shorter lists to parse, and you could potentially have different interfaces (i.e. simpler for the primitives). |
About using SciSharp's numpy wrapper, has that been done? Numpy.NET appears to embed Python within it. That's very different than being called from Python. It's not clear to me that it would work. Is there a way for it to use the Python instance it was called from? I couldn't find anything. |
If your C# library depends on Numpy.NET and is correctly deployed, Numpy.NET will be automatically loaded along with it. Mind you though that Numpy.NET currently does not target Python.NET 3.0.0 yet, so some work is required to bring it up to date. |
BTW, we already have this issue: #1276 |
I'll reopen this, there were plenty of issues with this already. I actually like the suggestion to never try to solve ambiguous situations. I know that this may be a hassle at the beginning, but it has the advantage that one can progress much much easier from that stance than from the current one. My suggestion would be:
We could progressively add more rules to resolve ambiguous situations (e.g. count the number of exact matching types, etc.) which should not break any existing code if done carefully. Our main issue right now (I think) is that we just use the first matching overload and stop trying at that point. |
Since #1908 is merged and released as rc6, I'll close this one again. |
Environment
Details
Passing in a numpy array to a C# function that takes a double[] parameter fails.
python code:
test_array = np.array([123, 345], np.float)
CSharpClass.Test(test_array)
C# code
public static void Test(double[] TestData) {...}
I already have a fix for this issue.
In Runtime.cs, change PyFloat_Check(...) to
This will convert any type that is derived from PyFloatType, which includes numpy.float64
The text was updated successfully, but these errors were encountered: