py/map: Add board option MICROPY_PY_MAP_ORDERED to make all dicts/maps ordered. #5323
Conversation
Hmm, interesting. I guess this really goes with the CPython 3.7 change that made dict insertion ordering part of the language. Although there is a subtlety that maps are not 1:1 with dicts: map is the underlying structure used to implement dict. Not sure if that really matters though.
Would be interesting to test it. It would also mean that most of the hash-based code in `py/map.c` would no longer be used.
Yes, I know I kind of glossed over the terminology of maps vs dicts. I used the dict label in the title as ordering them is what I wanted to achieve, but that's done via the map code.
@dpgeorge I actually think it might be slightly faster with this new ordered dict setting enabled. I grabbed the list of dict-related tests in the micropython repo and imported (ran) them all from a top-level file:

```python
from basics import dict1
from basics import dict2
from basics import dict_clear
from basics import dict_construct
from basics import dict_copy
from basics import dict_del
from basics import dict_fixed
from basics import dict_from_iter
from basics import dict_fromkeys
from basics import dict_fromkeys2
from basics import dict_get
from basics import dict_intern
from basics import dict_iterator
from basics import dict_pop
from basics import dict_popitem
from basics import dict_setdefault
from basics import dict_specialmeth
from basics import dict_update
from basics import dict_views
from basics import namedtuple_asdict
from basics import object_dict
from basics import ordereddict1
from basics import ordereddict_eq
from micropython import heapalloc_fail_dict
from stress import dict_copy
from stress import dict_create
from stress import dict_create_max
```

Then made a quick runner to execute this in a loop:

```python
import time
import subprocess

start = time.time()
for _ in range(100):
    subprocess.run(['micropython', 'dict_test.py'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
end = time.time()
print(end - start)
```

I found that printing to the console slowed it down by a few orders of magnitude, hence capturing stdout/stderr to a pipe.

My normal build of unix micropython with this new setting disabled runs in ~4.3 seconds (4.23 s to 4.6 s over 10 runs). Recompiled with the ordered setting enabled: ~4.1 seconds (4.03 s to 4.17 s) over the same number of runs. I switched the setting on and off a couple of times and re-ran to ensure the change is consistent, and yeah, it kept coming up ~200 ms faster with ordered turned on.
Interesting that it's consistent. Still, it would be better to do the profiling inside the MicroPython process to rule out whatever happens when launching the process, and also to exclude the import time so that purely the dict operations are measured. Even then, I'm not sure how close the tests are to real-life software using dicts, so I'd also try a couple of the benchmark suites available on the internet and run those. Would be nice if that gives the same results!
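One way to do the in-process timing suggested above (a sketch with hypothetical names, not code from this PR) is to time the dict operations directly inside the interpreter, so process launch and import overhead don't skew the numbers:

```python
# Minimal in-process benchmark sketch (illustrative, not from the PR).
# Times repeated dict insert/lookup/delete cycles directly, avoiding the
# subprocess-launch and import overhead of the earlier measurement.
import time

def bench_dict_ops(n=1000, reps=100):
    start = time.time()
    for _ in range(reps):
        d = {}
        for i in range(n):            # insertions
            d[i] = i * 2
        for i in range(n):            # lookups
            assert d[i] == i * 2
        for i in range(n):            # deletions
            del d[i]
    return time.time() - start

print("elapsed: %.4f s" % bench_dict_ops())
```

On MicroPython, `time.ticks_ms()`/`time.ticks_diff()` would be the more precise choice, but `time.time()` keeps the sketch portable to CPython.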
Interesting that the performance looks better... but I'm not sure this would be true for all uses of dict, since large dicts definitely benefit from a proper O(1) hashed lookup, rather than a brute force O(N) search. Running the performance benchmark suite on PYBv1.0 I get:
So almost all benchmarks improve by a few percent. But pystone decreases, and that is one which uses maps quite a bit (in user classes, for their instance member table). I think the reason for the improvements in performance is the faster qstr-keyed lookup.
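The scaling difference described above can be illustrated in plain Python (a simplified sketch; the real implementation is the C code in `py/map.c`): an ordered map does a brute-force linear scan of the entry table, while a hash table jumps straight to a bucket:

```python
# Sketch contrasting the two lookup strategies (illustration only).

def linear_lookup(entries, key):
    # Ordered map: brute-force O(N) scan over (key, value) pairs.
    for k, v in entries:
        if k == key:
            return v
    raise KeyError(key)

def hashed_lookup(buckets, key):
    # Hash map: O(1) average case - the hash picks the bucket directly.
    for k, v in buckets[hash(key) % len(buckets)]:
        if k == key:
            return v
    raise KeyError(key)

entries = [(i, i * 2) for i in range(1000)]
buckets = [[] for _ in range(1024)]
for k, v in entries:
    buckets[hash(k) % len(buckets)].append((k, v))

# Worst case for the linear scan is the most recently inserted key:
assert linear_lookup(entries, 999) == 1998
assert hashed_lookup(buckets, 999) == 1998
```

For small maps the linear scan can win in practice (better locality, no hashing overhead), which matches the benchmark numbers above; the crossover hits once N grows.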
Ah yes, the qstr lookup time does make sense. Certainly the algorithmic scaling blows out the difference for larger dicts. I re-did my test by concatenating most of the dict unit tests together and running them in a x100 loop directly inside micropython. With most of the tests in place:

However, if I enable

```python
# copying a large dictionary
a = {i: 2 * i for i in range(1000)}
b = a.copy()
for i in range(1000):
    dprint(i, b[i])
dprint(len(b))

# create a large dictionary
d = {}
x = 1
while x < 1000:
    d[x] = x
    x += 1
dprint(d[500])
```

in the x100 loop then the difference jumps to

And then the big killer, running just once (on unix):

```python
# The aim with this test is to hit the maximum resize/rehash size of a dict,
# where there are no more primes in the table of growing allocation sizes.
# This value is 54907 items.
d = {}
try:
    for i in range(54908):
        d[i] = i
except MemoryError:
    pass
script_end = utime.time()
print("Elapsed %0.4f ms" % (script_end - script_start))
```

master: 0.0086 s

So while for smaller applications the simpler brute-force ordered implementation increases efficiency and could be used to chop out some code size, for many other larger applications the ordered approach would likely be unusable.
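The blow-up on the large-dict test follows directly from the scaling: with a brute-force scan, inserting N distinct keys costs O(N^2) comparisons in total, because each insert first scans all existing entries to check whether the key is already present. A back-of-envelope model (illustrative arithmetic, not measured data):

```python
# Total comparisons to build an N-item map when each insert does a
# brute-force scan of the existing entries (ordered mode), versus the
# roughly N probes of a hashed table. Model only, not a measurement.

def linear_comparisons(n):
    # The i-th insert scans i existing entries: 0 + 1 + ... + (n-1).
    return n * (n - 1) // 2

N = 54908  # the resize-limit test size from the comment above
print(linear_comparisons(N))   # ~1.5e9 comparisons for the ordered scan
print(N)                       # vs ~5.5e4 probes for the hashed table
```

That four-orders-of-magnitude gap in work is consistent with the timing difference reported for the 54908-item test.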
I've just expanded the #defines to chop out the hash-based code that's no longer used when ordered mode is enabled.
Would there be any merit to implementing the compact/sparse array/map that CPython and PyPy adopted for Python 3.6 back in 2015? When considering how we might keep our dict API working similarly to CPython's, perhaps keeping the implementation similar makes that easier.
Maybe. I would suggest thinking about a compile-time option that selects between the current dict implementation (efficient in RAM usage and O(1) lookup, but not ordered) and one that uses more RAM but has O(1) lookup and is ordered.
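For reference, the compact layout mentioned above keeps a small sparse index table pointing into a dense, insertion-ordered entry array, giving both hashed lookup and ordering. A rough Python sketch of the idea (heavily simplified: no resizing or deletion handling, so it only works for a handful of entries):

```python
# Simplified sketch of the CPython 3.6 "compact dict" layout: a sparse
# index table plus a dense, insertion-ordered entry array. Real
# implementations also handle resizing, tombstones and perturbed
# probing; this is illustration only.

class CompactDict:
    def __init__(self, size=8):
        self.indices = [None] * size  # sparse: probe slot -> entry index
        self.entries = []             # dense: (hash, key, value) in order

    def _slot(self, key):
        # Linear probing over the sparse index table.
        i = hash(key) % len(self.indices)
        while self.indices[i] is not None:
            idx = self.indices[i]
            if self.entries[idx][1] == key:
                return i, idx         # existing key found
            i = (i + 1) % len(self.indices)
        return i, None                # empty probe slot

    def __setitem__(self, key, value):
        i, idx = self._slot(key)
        if idx is None:
            self.indices[i] = len(self.entries)
            self.entries.append((hash(key), key, value))
        else:
            self.entries[idx] = (hash(key), key, value)

    def __getitem__(self, key):
        _, idx = self._slot(key)
        if idx is None:
            raise KeyError(key)
        return self.entries[idx][2]

    def keys(self):
        # Iteration order is insertion order, for free.
        return [k for _, k, _ in self.entries]

d = CompactDict()
d["b"] = 1
d["a"] = 2
assert d["a"] == 2 and d.keys() == ["b", "a"]
```

The extra RAM cost relative to MicroPython's current table is the sparse index array, which is the trade-off the suggested compile-time option would expose.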
Personally I feel improvements to the ordered dict implementation are outside the scope of this particular PR; I was just going for ordered by default, irrespective of performance. If a new hashmap implementation is written to replace the current ordered implementation, I would think this PR is still equally relevant as-is to select between the two. Choosing a new ordered dict implementation is a larger job as there are multiple factors to weigh. I personally would lean towards the current PyPy hashmap: https://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html
Basically I trust their benchmarks and test results more than my own ability to implement something more efficient/suitable. But again, I think this is a larger discussion than this PR's intention to just force ordered by default.
```c
if (map->is_ordered) {
    for (mp_map_elem_t *elem = &map->table[0], *top = &map->table[map->used]; elem < top; elem++) {
```
Is this uncrustify which wants to dedent all the code? It's not great. Would it fix it to add an extra set of { } around this code, outside the #if?
Yep, these were both from uncrustify breaking the build. I'll try the { }.
Good call, yes the brackets appear to have fixed the formatting.
py/map.c (outdated)

```c
if (map->alloc == 0) {
```
This code shouldn't be at column 1.
No change to normal builds, but if MICROPY_PY_MAP_ORDERED is set in the port/board config then the ordered flag is set for all new maps (dicts). While I presume this has a performance impact, it is useful behavior for some applications, e.g. #5322.
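Enabling the option described above would then be a one-line addition to a port or board configuration header (the exact header file varies by port; `mpconfigboard.h` here is just an example):

```c
// In the board/port config header (e.g. mpconfigboard.h):
// make every new map/dict insertion-ordered.
#define MICROPY_PY_MAP_ORDERED (1)
```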