Skip to content

Use LinkedList in scope locals #1329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 4, 2019

Conversation

palaviv
Copy link
Contributor

@palaviv palaviv commented Aug 31, 2019

I hoped this change will increase the speed a little by accessing the last element without iterating but
from checking the benchmarks this does not have an effect on the speed.
Before:

--------------------------------------------------------------------------------------------- benchmark: 4 tests ---------------------------------------------------------------------------------------------
Name (time in ms)                             Min                 Max                Mean             StdDev              Median                IQR            Outliers      OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_bench[nbody.py-cpython]              27.4479 (1.0)       29.5264 (1.0)       27.7710 (1.0)       0.3353 (1.0)       27.6960 (1.0)       0.2146 (1.0)           1;1  36.0088 (1.0)          36           1
test_bench[mandelbrot.py-cpython]         44.6213 (1.63)      49.8505 (1.69)      46.3056 (1.67)      1.1451 (3.42)      46.1056 (1.66)      1.5482 (7.22)          4;1  21.5957 (0.60)         22           1
test_bench[nbody.py-rustpython]          253.9762 (9.25)     275.2160 (9.32)     259.4745 (9.34)      8.9302 (26.64)    255.2859 (9.22)      7.7041 (35.91)         1;1   3.8539 (0.11)          5           1
test_bench[mandelbrot.py-rustpython]     775.8832 (28.27)    809.1849 (27.41)    786.2342 (28.31)    13.9292 (41.54)    779.6782 (28.15)    17.8695 (83.29)         1;0   1.2719 (0.04)          5           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

After:

--------------------------------------------------------------------------------------------- benchmark: 4 tests --------------------------------------------------------------------------------------------
Name (time in ms)                             Min                 Max                Mean            StdDev              Median                IQR            Outliers      OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_bench[nbody.py-cpython]              27.5833 (1.0)       28.7476 (1.0)       27.8824 (1.0)      0.2265 (1.0)       27.8289 (1.0)       0.3121 (1.0)           8;1  35.8649 (1.0)          36           1
test_bench[mandelbrot.py-cpython]         44.8888 (1.63)      47.8632 (1.66)      46.2177 (1.66)     0.8791 (3.88)      45.9631 (1.65)      1.0685 (3.42)          7;0  21.6367 (0.60)         22           1
test_bench[nbody.py-rustpython]          254.4413 (9.22)     257.1632 (8.95)     255.8780 (9.18)     1.1675 (5.16)     256.1233 (9.20)      2.0518 (6.57)          2;0   3.9081 (0.11)          5           1
test_bench[mandelbrot.py-rustpython]     771.7654 (27.98)    795.8604 (27.68)    779.6076 (27.96)    9.8172 (43.35)    777.5865 (27.94)    12.3309 (39.51)         1;0   1.2827 (0.04)          5           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

@cthulahoops
Copy link
Collaborator

If it's doubly-linked doesn't this break tail sharing? (Ie. the clone requires a copy of the list?)

This code is still the wrong structure, so I don't think we should be making this kind of small optimisation until we've fixed the big problems.

@cthulahoops
Copy link
Collaborator

To be more precise load_global should be able to skip over the list of scopes and just access the globals. The call to last should go away.

@palaviv
Copy link
Contributor Author

palaviv commented Aug 31, 2019

I think I should have stated that one of the main goals of this change is to use use std::collections::LinkedList instead of implementing one our self. I think this simplifies the code and make it more easy to understand.

If it's doubly-linked doesn't this break tail sharing? (Ie. the clone requires a copy of the list?)

You are correct but as can be seen from the performance benchmarks this does not have a negative effect. We have a list of referenced counted object creating a copy of the list is very cheap.

To be more precise load_global should be able to skip over the list of scopes and just access the globals. The call to last should go away.

Even without the fast access to last I think the simplification to the code is worth the change.

@cthulahoops
Copy link
Collaborator

I think I should have stated that one of the main goals of this change is to use use std::collections::LinkedList instead of implementing one our self. I think this simplifies the code and make it more easy to understand.

It's nice to see the code reduction, but I'm not sure it's worth using the wrong data structure for. I'm guessing there's no standard library implementation of RcList?

You are correct but as can be seen from the performance benchmarks this does not have a negative effect. We have a list of referenced counted object creating a copy of the list is very cheap.

The benchmarks don't use nested scopes at all. mandelbrot doesn't even have a local scope. That said, very deeply nesting scopes isn't common in python, so the list should usually be short. (The slow down would be on calling closures, I think.)

@windelbouwman
Copy link
Contributor

I propose to merge this after #1333 gets merged. I like the usage of a standard linked list implementation, to reduce code.

Copy link
Contributor

@windelbouwman windelbouwman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@windelbouwman
Copy link
Contributor

Too bad we do no longer share the tail of the list, but I think this is okay for now, and can be added back later on, if required.

@cthulahoops
Copy link
Collaborator

Trying out a more relevant benchmark suggests that this branch isn't significantly slower. (Confirming the suggestion above that the short linked list copy is cheap.)

Master:
test bench_rustpy_closure ... bench: 38,520 ns/iter (+/- 2,478)

This branch:
test bench_rustpy_closure ... bench: 38,823 ns/iter (+/- 713)

I still worry that the copy here will get forgotten / need undoing, especially as the list elements may not be simple reference counted objects once local scopes are implemented properly.

@coolreader18
Copy link
Member

What about just a plain Vec? If we're not doing tail-sharing, I don't think a LinkedList has all that many benefits compared to a Vec.

@palaviv
Copy link
Contributor Author

palaviv commented Sep 4, 2019

What about just a plain Vec? If we're not doing tail-sharing, I don't think a LinkedList has all that many benefits compared to a Vec.

Now that we do not access the last value LinkedList is not needed. I changed it to Vec

vm/src/scope.rs Outdated
}

pub fn new_child_scope_with_locals(&self, locals: PyDictRef) -> Scope {
let mut new_locals = Vec::with_capacity(self.locals.len() + 1);
new_locals.push(locals);
new_locals.append(&mut self.locals.clone());
Copy link
Member

@coolreader18 coolreader18 Sep 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
new_locals.append(&mut self.locals.clone());
new_locals.extend_from_slice(&self.locals);
new_locals.shrink_to_fit();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed for extend_from_slice but I don't think I need to call shrink_to_fit as I create the vector with the exact needed capacity.

@windelbouwman windelbouwman merged commit c973ed8 into RustPython:master Sep 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants