In Python, using b = a does not copy the content of a; both names refer to the same array.
>>> a = ndtest(3)
>>> b = a
>>> b['a1'] = 0
>>> a['a1']
0
The solution is to use b = a.copy(), but if users call it everywhere, even when it is not strictly necessary (and determining this is not always obvious), it consumes memory and CPU needlessly.
To eliminate this problem, we could tell users to always use .copy(), and have .copy() only flag the resulting array as "must_copy_on_write" without actually copying the data right away. This would not be too hard to implement. Then if (and only if) the user later modifies the copy (using setitem), an actual copy is made and the array is flagged as must_copy_on_write=False before the setitem is done.
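A minimal sketch of the idea, not the actual larray implementation. The class LazyArray, its list-based storage, and the helper shares_data_with are hypothetical names used only for illustration. The sketch also flags the source array in .copy(), which goes beyond the proposal as worded: without that, a later write to the original would show through in the lazy copy.

```python
class LazyArray:
    """Hypothetical stand-in for an array class (not the real larray API)."""

    def __init__(self, data):
        self._data = list(data)
        self._must_copy_on_write = False

    def copy(self):
        # Share the underlying buffer instead of duplicating it right away.
        new = LazyArray.__new__(LazyArray)
        new._data = self._data
        new._must_copy_on_write = True
        # Assumption beyond the proposal: flag the source as well, so that
        # writing to the original does not leak into the lazy copy.
        self._must_copy_on_write = True
        return new

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if self._must_copy_on_write:
            # First write after a lazy copy: detach from the shared buffer.
            self._data = list(self._data)
            self._must_copy_on_write = False
        self._data[key] = value

    def shares_data_with(self, other):
        # True while both objects still point at the same buffer.
        return self._data is other._data


a = LazyArray([0, 1, 2])
b = a.copy()                          # no data copied yet
shared_before_write = b.shares_data_with(a)
b[1] = 0                              # the real copy happens here
shared_after_write = b.shares_data_with(a)
```

With a simple flag like this, the original also makes one defensive copy on its first write even if the lazy copy has already detached. Avoiding that extra copy would need some form of reference tracking, which this sketch leaves out.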
So this issue has an impact on performance but it is first and foremost about correctness. We cannot do much about the b = a case above, but we can fix the subset issue.
PS: I am unsure whether pandas plans to make an explicit .copy() call copy-on-write too, and it is probably a good idea to align our behaviour with what pandas does.