Skip to content

Improve: binary ops with Number Protocol #4615

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Mar 9, 2023

Conversation

xiaozhiyan
Copy link
Contributor

@xiaozhiyan xiaozhiyan commented Mar 2, 2023

Closes #4614
Based on #4139
Related to #3244

This PR will not pass some tests at this moment, since Number Protocol is missing for some types.
Let me create separate PRs to implement (part of) missing Number Protocols, and then rebase this PR once those are merged.

closes #4638
closes #4639
closes #4672

@fanninpm fanninpm requested a review from qingshi163 March 2, 2023 14:55
@youknowone
Copy link
Member

when rustpython crashes on initialization step, debugger is very useful

https://github.com/RustPython/RustPython/wiki/Debugger-with-VSCode

@youknowone youknowone added the z-ls-2023 Tag to track Line OSS Sprint 2023 label Mar 2, 2023
@xiaozhiyan xiaozhiyan force-pushed the binaryops-with-number-protocol branch from 7c7f4de to 9bf7481 Compare March 2, 2023 17:43
@youknowone
Copy link
Member

youknowone commented Mar 2, 2023

The next step is update_slot in slot.rs.
When operator is overriden from Python side, we have to set a function to the python function from rust side.

In current version, this is not working:

>>>>> class A:
.....  def __add__(self, a):
.....   print(a)
..... 
>>>>> a = A()
>>>>> a + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '+' not supported between instances of 'A' and 'int'
>>>>> 

Adding update_slot! will enable it.

@youknowone
Copy link
Member

To support simple binary operators, looking for index_wrapper or len_wrapper will be useful. One wrapper matches to one slot.
To support self-mutating unary ops like __iadd__ and __add__, looking for richcompare_wrapper will be useful. When user defined __add__ on python side, both __add__ and __iadd__ slots have to be set.(1 python function to n slots) richcompare_wrapper is doing similar work - but the opposite way. (n python funciton to 1 slot)

Comment on lines 234 to 263
pub enum PyNumberBinaryOpSlot {
Add,
Subtract,
Multiply,
Remainder,
Divmod,
Power,
Lshift,
Rshift,
And,
Xor,
Or,
InplaceAdd,
InplaceSubtract,
InplaceMultiply,
InplaceRemainder,
InplacePower,
InplaceLshift,
InplaceRshift,
InplaceAnd,
InplaceXor,
InplaceOr,
FloorDivide,
TrueDivide,
InplaceFloorDivide,
InplaceTrueDivide,
MatrixMultiply,
InplaceMatrixMultiply,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ok but let the op function to take a lambda should also work

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How the lambda is used? I thought python allows only whitelisted list of operators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I mean closure in rust. Instead having a matching table from enum to methods(get_binary_op), we could make binary_ops generic that would take a FnOnce to specifiy which op will be use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like function pointer vs pointer offset design decision. I think either design works well enough to run the features.
Probably we can discuss better once this is fully done and glance at the design overview.

@xiaozhiyan xiaozhiyan force-pushed the binaryops-with-number-protocol branch from 9bf7481 to 4ae4521 Compare March 3, 2023 00:42
@youknowone

This comment was marked as resolved.

@youknowone youknowone force-pushed the binaryops-with-number-protocol branch 2 times, most recently from 81d51f4 to efa0535 Compare March 3, 2023 03:26
@xiaozhiyan xiaozhiyan force-pushed the binaryops-with-number-protocol branch 3 times, most recently from 0bef17e to 7f23449 Compare March 5, 2023 11:01
@xiaozhiyan xiaozhiyan force-pushed the binaryops-with-number-protocol branch from 7f23449 to 7f767ac Compare March 5, 2023 15:19
@xiaozhiyan
Copy link
Contributor Author

xiaozhiyan commented Mar 5, 2023

The following is the behavior in CPython

>>> class C:
...     def __add__(self, x):
...             return x + 100
...     def __radd__(self, x):
...             return x + 200
... 
>>> c = C()
>>> c + 5
105
>>> 5 + c
205
>>> class D:
...     def __add__(self, x):
...             return x + 1000
...     def __radd__(self, x):
...             return x + 2000
... 
>>> d = D()
>>> d + 5
1005
>>> 5 + d
2005
>>> c + d
1100
>>> d + c
1100
>>> class E:
...     def __add__(self, x):
...             return 10000 + x
...     def __radd__(self, x):
...             return 20000 + x
... 
>>> e = E()
>>> e + 5
10005
>>> 5 + e
20005
>>> c + e
10100
>>> e + c
10200

@xiaozhiyan xiaozhiyan force-pushed the binaryops-with-number-protocol branch from 7f767ac to e2b82e0 Compare March 5, 2023 20:03
@xiaozhiyan xiaozhiyan changed the title Improve: binaryops with Number Protocol Improve: binary ops with Number Protocol Mar 5, 2023
@youknowone
Copy link
Member

I finally understand the trick. instead of number_binary_op_wrapper, adding number_binary_op_left_wrapper and number_binary_op_right_wrapper allows to share single PyNumber operation. And they can be separately overriden from python side. We need another PyNumberSlots struct.

@xiaozhiyan xiaozhiyan force-pushed the binaryops-with-number-protocol branch from 8fe4b42 to 42e6107 Compare March 6, 2023 10:24
@xiaozhiyan
Copy link
Contributor Author

xiaozhiyan commented Mar 6, 2023

@youknowone In commit 1b9a5f6 , I extended PyNumberMethods to store right binary ops. The logic seems reasonable to me. Please help to have a look.

I also changed the comment of binary_op1 from b.op(a,b)[*], a.op(a,b), b.op(a,b) to the following.

    /// Calling scheme used for binary operations:
    ///
    /// Order operations are tried until either a valid result or error:
    ///   b.rop(b,a)[*], a.op(a,b), b.rop(b,a)
    ///
    /// [*] only when Py_TYPE(a) != Py_TYPE(b) && Py_TYPE(b) is a subclass of Py_TYPE(a)

@youknowone
Copy link
Member

youknowone commented Mar 6, 2023

@xiaozhiyan I made a rough sketch of the cpython way
42c417a

@youknowone
Copy link
Member

Ok, I made its first version working. f01901c using #4645

by using this structure, PyNumberSlots take responsibility of order of operands and PyNumberMethods take responsibility of operating itself.
Because PyNumberMethods now becomes fully immutable, we also can remove AtomicCell from all of its fields. It will make easier to inherit parent's PyNumberMethods for bool.
Not a big part but by sharing slot functions, cache hit rate may become higher.

@youknowone
Copy link
Member

youknowone commented Mar 6, 2023

one of the difference between my version and cpython is slot implementation.
I used different slot function for wrapper and number protocol implementation. But cpython use a single function for both cases
https://github.com/python/cpython/blob/6716254e71eeb4666fd6d1a13857832caad7b19f/Objects/typeobject.c#L7792-L7833

@youknowone

This comment was marked as outdated.

@@ -238,12 +306,12 @@ impl PyNumber<'_> {

// PyIndex_Check
pub fn is_index(&self) -> bool {
self.methods().index.load().is_some()
self.obj.class().slots.number.index.load().is_some()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

binary ops operands can be swapped, but is this also need to access obj.class() again?

Copy link
Contributor Author

@xiaozhiyan xiaozhiyan Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we

  • removed AtomicCell from PyNumberMethods (for both BinaryFunc and UnaryFunc)
  • changed update_slot to save user defined binary ops from ext_func number_methods.op to subslot number.op

=> user defined unary ops in update_slot have to be changed to number.op as well

=> then user defined unary ops will be saved in PyNumberSlots in the same way as binary ops

Since self.methods() only returns the static methods predefined in AsNumber implementations, we need to change to self.obj.class().slots.number to load the latest methods from PyNumberSlots.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, got it. we didn't handle unary operator yet.

Comment on lines -150 to -153
let lop = lhs.get_class_attr(reflection);
let rop = rhs.get_class_attr(reflection);
if let Some((lop, rop)) = lop.zip(rop) {
if !lop.is(&rop) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we may lost this check. it seems to related to test_collections failure

Copy link
Contributor Author

@xiaozhiyan xiaozhiyan Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the purpose of the old code is to follow the CPython logic to check whether lop and rop are referring to the same op, in which case rop should not be used.

The new implementation uses the following part when loading rop, i.e. slot_b.

        let mut slot_b = if b.class().is(a.class()) {
            None
        } else {
            match b.class().slots.number.get_right_binary_op(op_slot)? {
                Some(slot_b)
                    if slot_b as usize == slot_a.map(|s| s as usize).unwrap_or_default() =>
                {
                    None
                }
                slot_b => slot_b,
            }
        };

a: &PyObject,
b: &PyObject,
op_slot: &PyNumberBinaryOpSlot,
) -> PyResult {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which format config should we use please? When I run cargo fmt, it will just change back to one line as before.

@fanninpm
Copy link
Contributor

fanninpm commented Mar 9, 2023

The latest CI failed with Clippy error, but that part of source has not been changed for a long time. Why suddenly CI treats it as an error and makes the CI fail?
https://github.com/RustPython/RustPython/actions/runs/4375751891/jobs/7656913583

Perhaps it has to do with Rust 1.68.0 being released.

@youknowone youknowone force-pushed the binaryops-with-number-protocol branch from 885eba9 to 3c4ac0e Compare March 9, 2023 16:37
Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me handle 1.68 things

@@ -214,6 +336,18 @@ fn float_wrapper(num: PyNumber, vm: &VirtualMachine) -> PyResult<PyRef<PyFloat>>
})
}

macro_rules! number_binary_op_wrapper {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this macro can generate function rather than lambda

@xiaozhiyan
Copy link
Contributor Author

xiaozhiyan commented Mar 9, 2023

@youknowone

let me handle 1.68 things

Please help to check my comments in #4673 in case they're useful.

@youknowone
Copy link
Member

Thanks!

@youknowone youknowone added C-enhancement New feature or request C-compat A discrepancy between RustPython and CPython A-design About RustPython's own implementation A-vm Area: virtual machine labels Mar 9, 2023
@youknowone youknowone merged commit ea665cb into RustPython:main Mar 9, 2023
@youknowone
Copy link
Member

Though we have remaining issues, let's move forward.

@xiaozhiyan Thank you so much for the huge effort!

@qingshi163
Copy link
Contributor

Sorry for late review, as long as the ci passed, we can accept this pr.

But there is something I want to clarify:

  1. Heap type is the type which created dynamicly like class in python etc.
  2. Heap type will have HeapTypeExt witch is the extra space only allocate for heap type.
  3. PyTypeSlots::as_number, as_sequence, as_mapping etc is a pointer to tell the type where to look for the function and it has only two possible place to pointing witch is either a static function from AsNumber::as_number or a self-referenced HeapTypeExt::number_methods etc.
  4. The type slots for number, sequence, mapping is not necessary but for speedup, witch means we can always fallback to a wrapper to call in python.
  5. For right_* functions I think we can have a generic way to do it, to put it into the type slots is not ideal, we do not need to store the r* functions, we just need to fallback the type.
  6. The PyNumber type it self is loosy, when convert from PyObject to PyNumber it will just give a empty implement if the object is not supporting number protocol.
  7. Cpython is inherit the slots but this makes the complex while you updateing the parent class's slot than need to recusively fallback all the subclasses's slots, the way we doing to search via mro maybe will slow down a little bit(I don't think will be big as most internal type should reach the protocol in the first loop) but much easier for now.

@youknowone
Copy link
Member

@qingshi163 All other parts sounds good, but for No. 5, for each binary operator, having 2 slots and 1 number method is the CPython design.

@qingshi163
Copy link
Contributor

@qingshi163 All other parts sounds good, but for No. 5, for each binary operator, having 2 slots and 1 number method is the CPython design.

https://github.com/python/cpython/blob/main/Include/cpython/object.h
https://github.com/python/cpython/blob/ca066bdbed85094a9c4d9930823ce3587807db48/Objects/typeobject.c#L7221-L7242

cpython do not store the right_* functions in the type slots, it fallback the slots and than call it opposite direction

@youknowone
Copy link
Member

we have a little bit different design now, so the wrapper looks different. but isn't wrap_binaryfunc_r the wrapper used by RBINSLOT as one of the 2 binaryfunc entries?

@youknowone youknowone mentioned this pull request Mar 11, 2023
7 tasks
@xiaozhiyan
Copy link
Contributor Author

xiaozhiyan commented Mar 13, 2023

Measure performance enhancement of this PR

% git log -n1
commit 1fceeab0fcc77c11dfe94092b27b2164b94d62e4 (HEAD)
Merge: 51020cd63 4d05077ec
Author: Jeong YunWon <69878+youknowone@users.noreply.github.com>
Date:   Thu Mar 9 21:19:47 2023 +0900

    Merge pull request #4674 from carlosmiei/copy-update
    
    Update copy.py from CPython 3.11

% cargo run --release benches/benchmarks/bm_nbody.py -o base.json
.....................
nbody: Mean +- std dev: 4.01 sec +- 0.17 sec

% cargo run --release benches/benchmarks/bm_nbody.py -o base2.json
.....................
nbody: Mean +- std dev: 3.91 sec +- 0.14 sec
% git log -n1
commit ea665cb743ef1f2a1a2b6030e8384d7ab814b305 (HEAD)
Merge: 1fceeab0f 228772020
Author: Jeong YunWon <69878+youknowone@users.noreply.github.com>
Date:   Fri Mar 10 02:47:01 2023 +0900

    Merge pull request #4615 from xiaozhiyan/binaryops-with-number-protocol
    
    Improve: binary ops with Number Protocol

% cargo run --release benches/benchmarks/bm_nbody.py -o patch.json
.....................
nbody: Mean +- std dev: 1.24 sec +- 0.08 sec

% cargo run --release benches/benchmarks/bm_nbody.py -o patch2.json
.....................
nbody: Mean +- std dev: 1.28 sec +- 0.12 sec
  • CPython 3.11
% python -V
Python 3.11.0

% python benches/benchmarks/bm_nbody.py -o cpython.json
.....................
nbody: Mean +- std dev: 101 ms +- 2 ms

% python benches/benchmarks/bm_nbody.py -o cpython2.json
.....................
nbody: Mean +- std dev: 104 ms +- 7 ms
  • Performance comparison
% pyperf compare_to cpython.json patch.json base.json --table 
+-----------+---------+-------------------------+-------------------------+
| Benchmark | cpython | patch                   | base                    |
+===========+=========+=========================+=========================+
| nbody     | 101 ms  | 1.24 sec: 12.28x slower | 4.01 sec: 39.79x slower |
+-----------+---------+-------------------------+-------------------------+

% pyperf compare_to cpython2.json patch2.json base2.json --table
+-----------+----------+-------------------------+-------------------------+
| Benchmark | cpython2 | patch2                  | base2                   |
+===========+==========+=========================+=========================+
| nbody     | 104 ms   | 1.28 sec: 12.34x slower | 3.91 sec: 37.54x slower |
+-----------+----------+-------------------------+-------------------------+

@@ -85,6 +84,7 @@ pub struct PyTypeSlots {

// The count of tp_members.
pub member_count: usize,
pub number: PyNumberSlots,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaozhiyan Sorry for late comment 😂Could I ask you the reason why not to use Option for PyNumberSlots? If we want to know which type implements the Number protocol, shouldn't we verify all the fields in the Number methods?

Copy link
Contributor Author

@xiaozhiyan xiaozhiyan Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a type to implement Number Protocol, basically it means the type implements a subset of Number Protocol methods, (while leaving the remained as None).
I'm not sure what the scenario is to check whether a type has implemented "at least one Number Protocol method", instead of checking whether a type has implemented "certain specified method".
If it's really needed in such scenarios, I think we may be able to change this implementation to use Option without side effect. Otherwise, it may be better to keep it as is to avoid an additional layer of Option wrapping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

due to difference of AsNumber/PyNumberSlots implementation strategy, i guess PyNumberSlots always exists while AsNumber is not.

Copy link
Contributor Author

@xiaozhiyan xiaozhiyan Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to current implementation,

  • for embedded types, if they don't implement AsNumber trait, extend_slots will not be executed, and their PyNumberSlots fields will contain default method implementation, i.e. None

  • for user defined types, if they implement any Number Protocol method, say, __radd__, the function will be saved in corresponding field of PyNumberSlots for the type

  • when trying to call certain Number Protocol method for a type, fields of PyNumberSlots will be looked up, without considering whether the type has implemented AsNumber trait or not

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Thanks for detail explanation 😊

@youknowone youknowone mentioned this pull request Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-design About RustPython's own implementation A-vm Area: virtual machine C-compat A discrepancy between RustPython and CPython C-enhancement New feature or request z-ls-2023 Tag to track Line OSS Sprint 2023
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Number Protocol for PyBool Improve: binary ops with Number Protocol
6 participants