gh-82300: Disable resource tracking & auto-destruction of SharedMemory blocks on POSIX #21516
Conversation
@tiran, this is a duplicate of
Correction, I mentioned the wrong pull request earlier.
I would like to ask about this issue to get some more context, because I may be missing something. I have noticed this exact same problem as in issue38119. Is removing the resource tracking entirely from SharedMemory the preferred way forward? I think there is some benefit to this resource tracking. As a heads-up, I'm currently working on a patch that resolves this issue and retains the resource tracker. I'll submit it as a PR when it is done.
We don't actually know what went on in @applio's head when he wrote this patch, and he doesn't seem to respond to email. But, @DamianBarabonkovQC, before you do more work, can you explain what you are planning to do? (Honestly, I personally know nothing about the resource tracker; I am just hoping to help find a resolution of one type or another.) Probably the better place to do your write-up would be in bpo-38119.
@DamianBarabonkovQC, I would say removing the resource tracker is not the best way forward, but it is the easiest way to fix some of the major issues plaguing shared_memory. You can see my comment detailing why Unix's shared memory will still be inconsistent with Windows. Windows uses a reference counting mechanism to clean up shared memory segments: it only destroys/unlinks a segment once all the processes using it are no longer running. I have mentioned some methods for how we could implement a reference counting mechanism to make Unix's implementation consistent with Windows here. Having said that, there are still major issues in shared_memory which this PR will fix. Consider the scenario mentioned here.
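To make the ownership contract being discussed concrete, here is a minimal sketch (the segment name and sizes are illustrative, and the two roles would normally live in different processes): only the creating process is supposed to unlink the segment, while other users merely close their handle; the pre-patch resource tracker breaks this by also unlinking when a non-owning process exits.

```python
from multiprocessing.shared_memory import SharedMemory

# Owner process: creates the segment and is the only one meant to unlink it.
owner = SharedMemory(create=True, size=64, name="demo_block")
owner.buf[:4] = b"data"

# User process: attaches by name, reads, and only detaches when done.
user = SharedMemory(name="demo_block")
assert bytes(user.buf[:4]) == b"data"
user.close()    # detach only; the segment itself should stay alive
# ...but before this patch, the user's resource tracker also unlinks
# "demo_block" when that process exits, pulling it out from under the owner.

owner.close()
owner.unlink()  # explicit destruction, intended to be the owner's job
```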
@vinay0410, on this matter I agree that this PR is a good fix at present. I'm just thinking forward. I like the idea of a reference count at the start of the shared memory segment, for its simplicity and clarity. My solution involves the /tmp directory, but as was mentioned in your post, that might not always work depending on the user's settings for /tmp.
@DamianBarabonkovQC, completely agree. I think for now this should be merged to fix some of the critical issues, and then we should have a discussion with core developers about a better and more robust solution to tackle this problem.
Thanks for your input on this matter, @gvanrossum. Both Vinay and I agree that, for the time being, a temporary patch for this issue would be to simply remove the resource tracking from SharedMemory, since it is a source of problems in the current implementation. How could we get this patch accepted? Moving forward, I have begun a discussion detailing a potential solution here. It would not be a very invasive proposal and should implement resource tracking correctly. I would really appreciate your thoughts on it, since maybe my proposal is more invasive than I think, or there is a better approach.
You need to get someone to review it who actually understands shared memory. Since @tiran hasn't responded to your review request and @applio has not responded to repeated pleas, I'm not sure who to try next (I don't qualify; I don't understand this code at all). Maybe you can email @ericsnowcurrently and get him interested?
Hi @gvanrossum, thanks for suggesting possible reviewers. I dropped an email to @ericsnowcurrently 2 weeks ago, but he hasn't opened it yet. Do you have any other suggestions to get this PR reviewed, since this is a relatively critical issue in multiprocessing.shared_memory?
Reopened to re-trigger CI/CD.
Next step is a post on python-dev.
(force-pushed from a061cd5 to 6d09ce7)
Hi @gvanrossum, I posted on the python-dev mailing list around a month ago (https://mail.python.org/archives/list/python-dev@python.org/message/O67CR7QWEOJ7WDAJEBKSY74NQ2C4W3AI/) but haven't received any response. Are there any other next steps?
Hi Vinay,
In a few weeks (Oct 19-23) we'll have the (virtual) core dev sprint. Unlike other years, Davin Potts hasn't signed up, but I will try to get some other core dev to help decide and review this. Could you send me a reminder around the 19th?
--Guido
Sure, I will send you a reminder on October 19th.
Hello Team,
Hi @gvanrossum, as suggested, I am reminding you to take up review of this PR in the Python core dev sprint.
@vinay0410 Thanks for the ping. I've heard that @pitrou should be able to help out here.
For the ref counting at the beginning of the file, the issue is that I don't see how to ensure that we detect that the resource should be cleaned up if a process holding a reference on it dies.
@tomMoral, for reference counting at the beginning of the file, one would need the resource tracker to ensure that the shared memory segment is destroyed as soon as the reference count becomes 0, and the resource tracker would be responsible for decrementing/incrementing the reference count.
@vinay0410 The issue is with the child processes of the main process. Only the main process is tracked by the resource tracker. Maybe it would be possible by having an internal ref count in the resource tracker, allowing it to decrement the ref count by the number of processes connected to it. Just thinking out loud, maybe another solution would be to have a ref count of the number of resource trackers attached to the segment.
@tomMoral I think I concur. If the shared memory segment is passed to a child process, the reference count shouldn't be increased, and it shouldn't be decreased when the child process dies. So basically, the reference count of a shared memory segment can only be increased or decreased by a process which attaches to it directly or creates it. Is that what you were also thinking? Also, I think child processes will have their own separate resource trackers, right?
Unless something weird happens, the resource tracker is shared across all descendants of a main process (see here for instance). If a child process creates a new segment, it is also tracked in the same resource tracker.
Actually, the issue is that passing the shared memory segment to a child process is the same as passing it to another process. The name of the segment is sent and we create a new SharedMemory object attached to it.
@tomMoral, should I then start implementing the refcount using the resource tracker approach? Are you okay with that?
I am okay with the approach of ref counting with the resource tracker. Just to summarize and make sure we are on the same page: this seems to provide a simple enough mechanism to ensure that shared memory segments are not left unattended, while providing the flexibility to have multiple unrelated processes using them. A potential backward compatibility issue is with cross-language applications (for instance, people using a C program to access this shared memory); I don't know if there is a way to ensure that the refcount is not used as part of the shared memory segment.
@tomMoral, I was thinking of using the following primitives to count in an atomic way; these primitives ensure cross-process atomic updates (see Include/internal/pycore_atomic.h, line 94 at 74b4eda).
@vinay0410 As these are low-level C primitives, the refcount seems complicated to implement and maintain, but maybe I just don't see how you would use them. Note that it might be worth considering using a named semaphore (which is typically designed for such atomic counting): add at the end of the file the name of the semaphore and the length of the name, so it can be read back. This would allow implementing everything in Python.
It would be easy enough to expose Python wrappers for the atomic intrinsics. Note that you need atomic increments and decrements, not mere loads and stores. These are provided in a portable way by the remarkable portable-snippets library: https://github.com/nemequ/portable-snippets/tree/master/atomic
That sounds like the best solution, since it would maintain alignment (though I'm not sure it matters a lot). cc @aeros, by the way.
@pitrou, that's correct, I am planning to use them: in Lib/multiprocessing/shared_memory.py (around line 180 at db6434c), and then replacing the code around line 204 (db6434c). Finally, I will pass self._ref_count to Python functions which will atomically increment or decrement the refcount, implemented using C builtins.
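For illustration only, a rough sketch of the segment layout this plan implies. REFCOUNT_SIZE and the slicing are hypothetical names/choices of mine, and the struct-based update below is not atomic across processes; the actual proposal delegates the increments/decrements to C builtins.

```python
import struct
from multiprocessing.shared_memory import SharedMemory

REFCOUNT_SIZE = 8            # hypothetical: one 64-bit counter stored in the segment
payload_size = 1024

shm = SharedMemory(create=True, size=payload_size + REFCOUNT_SIZE)

# The user-visible buffer would exclude the counter bytes.
user_buf = shm.buf[:payload_size]
ref_count = shm.buf[payload_size:payload_size + REFCOUNT_SIZE]

# Initialise the counter to 1 for the creating process.  NOTE: this
# read-modify-write is NOT atomic; the real implementation would use
# atomic intrinsics exposed from C.
ref_count[:] = struct.pack("q", 1)
assert struct.unpack("q", ref_count)[0] == 1

# Release the exported views before closing, otherwise mmap.close() fails.
user_buf.release()
ref_count.release()
shm.close()
shm.unlink()
```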
Hi @pitrou and @tomMoral, I have raised PR #23174, implementing the above solution. I have not updated tests and documentation yet, because I first wanted to get some initial feedback on the approach I am using. I have added functions to atomically increment and decrement the reference count.
I'm not strongly familiar enough with
Hi @pitrou and @tomMoral, a preliminary review of PR #23174, which implements the reference counting mechanism discussed above, would be very helpful. Once everyone is okay with the current approach, I will get on to writing tests. Thanks.
Related: "resource tracker destroys shared memory segments when other processes should still have valid access" (https://bugs.python.org/issue38119) and python/cpython#21516. The monkey-patch solution is due to Álvaro Justen (@turicas) (https://bugs.python.org/msg388287).
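For anyone who needs a workaround on already-released versions, the monkey-patch referenced above (bpo msg388287) is along these lines. This is a paraphrased sketch rather than a verbatim copy, and it must run in every process before any SharedMemory object is created; the trade-off is that genuinely leaked segments are then never cleaned up automatically.

```python
from multiprocessing import resource_tracker

def remove_shm_from_resource_tracker():
    """Stop the resource tracker from tracking (and later unlinking)
    SharedMemory segments created or attached in this process."""

    def fix_register(name, rtype):
        if rtype == "shared_memory":
            return
        return resource_tracker._resource_tracker.register(name, rtype)

    def fix_unregister(name, rtype):
        if rtype == "shared_memory":
            return
        return resource_tracker._resource_tracker.unregister(name, rtype)

    # shared_memory looks these up on the module at call time, so patching
    # the module attributes is sufficient.
    resource_tracker.register = fix_register
    resource_tracker.unregister = fix_unregister
    resource_tracker._CLEANUP_FUNCS.pop("shared_memory", None)

remove_shm_from_resource_tracker()
```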
Usual multiprocessing tests:
Ran 255 tests in 149.271s
OK (skipped=26)
Ran 39 tests in 24.086s
OK
Ran 255 tests in 66.529s
OK (skipped=27)
Ran 255 tests in 79.264s
OK (skipped=23)
But they don't include the test added here, so running _test_multiprocessing.py:
Ran 35 tests in 21.616s
OK (skipped=2)
Looks ok.
I thought if you wanted management for shared memory you'd use a SharedMemoryManager. Does that not already solve the problem of memory cleanup?
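For context, a minimal SharedMemoryManager sketch: segments created through the manager are unlinked when the manager shuts down, which handles cleanup for lifetimes the manager controls but not the independent, unrelated-process scenario this PR is about.

```python
from multiprocessing.managers import SharedMemoryManager

with SharedMemoryManager() as smm:
    shm = smm.SharedMemory(size=128)   # segment owned by the manager
    shm.buf[:5] = b"hello"
    # shm.name can be handed to worker processes while the manager is alive
# leaving the block shuts the manager down and unlinks every segment it created
```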
Any updates on when this PR will be merged?
Closing - we went with adding a track parameter to SharedMemory instead.
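For readers landing here later: the resolution that eventually shipped (Python 3.13) lets an attaching process opt out of tracking, roughly as below (the segment name is illustrative).

```python
from multiprocessing.shared_memory import SharedMemory

# Python 3.13+: attach without registering with this process's resource
# tracker, so our exit will not unlink the owner's segment.
shm = SharedMemory(name="example_segment", create=False, track=False)
try:
    data = bytes(shm.buf[:16])
finally:
    shm.close()  # detach only; the owner remains responsible for unlink()
```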
Eliminates the use of the resource tracker on shared memory segments but preserves its use on semaphores, etc. where it is appropriately applied.
Adds a test that fails with the current code (an independent Python process started via subprocess accesses shared memory created by a different Python process, then exits normally, expecting the original creator process to continue to be able to access that shared memory) but passes with this patch to defend against future regressions.
https://bugs.python.org/issue38119
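A hedged sketch of the kind of regression test described above (the exact test in the patch may differ): a creator process spawns an unrelated interpreter that attaches to the segment and exits cleanly, after which the segment must still be usable.

```python
import subprocess
import sys
from multiprocessing.shared_memory import SharedMemory

shm = SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# An independent Python process attaches by name, reads, and exits normally.
child = (
    "from multiprocessing.shared_memory import SharedMemory; "
    f"s = SharedMemory(name={shm.name!r}); "
    "assert bytes(s.buf[:5]) == b'hello'; "
    "s.close()"
)
subprocess.run([sys.executable, "-c", child], check=True)

# Without the patch, the child's resource tracker unlinks the segment on exit,
# so re-attaching by name (and the creator's own unlink) fails with
# FileNotFoundError; with the patch, both still work.
reattached = SharedMemory(name=shm.name)
reattached.close()
shm.close()
shm.unlink()
```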