
gh-82300: Disable resource tracking & auto-destruction of SharedMemory blocks on POSIX #21516


Closed
wants to merge 3 commits

Conversation

vinay0410
Contributor

@vinay0410 vinay0410 commented Jul 17, 2020

Eliminates the use of the resource tracker on shared memory segments but preserves its use on semaphores, etc. where it is appropriately applied.

Adds a test that fails with the current code (an independent Python process started via subprocess accesses shared memory created by a different Python process, then exits normally, expecting the original creator process to continue to be able to access that shared memory) but passes with this patch to defend against future regressions.

https://bugs.python.org/issue38119
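A minimal sketch of that scenario (illustrative only, not the PR's actual test): a creator process makes a segment, an independent interpreter attaches by name and exits normally, and the creator then tries to re-attach. On affected versions the independent process's resource tracker unlinks the segment at exit, so the re-attach fails.

```python
import subprocess
import sys
from multiprocessing import shared_memory

# Creator process: make a segment and write to it.
creator = shared_memory.SharedMemory(create=True, size=16)
creator.buf[:4] = b"data"

# Independent Python process: attach by name, read, detach, exit normally.
child_code = (
    "from multiprocessing import shared_memory; "
    f"shm = shared_memory.SharedMemory(name={creator.name!r}); "
    "assert bytes(shm.buf[:4]) == b'data'; "
    "shm.close()"
)
subprocess.run([sys.executable, "-c", child_code], check=True)

# On affected versions, the child's resource tracker has unlinked the
# segment by now, so re-attaching raises FileNotFoundError.
try:
    again = shared_memory.SharedMemory(name=creator.name)
    again.close()
    survived = True
except FileNotFoundError:
    survived = False

creator.close()
if survived:
    creator.unlink()  # normal cleanup when the segment still exists
```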

@vinay0410
Contributor Author

vinay0410 commented Jul 17, 2020

@tiran, this is a duplicate of #16139 #15989, which had conflicts, so I fixed them and opened this.
These changes will also fix the following issues:
https://bugs.python.org/issue39959

@gvanrossum
Member

Davin (@applio), can you review and approve this? What was the hold-up with your own fix? (This is a copy that fixes the merge conflicts.) Or, alternatively, fix the conflicts in #16139 so that can be merged?

@vinay0410
Contributor Author

Correction, I mentioned the wrong pull request earlier.
This is a duplicate of #15989

@DamianBarabonkovQC

I would like to ask about this issue to get some more context, because I may be missing something. I have noticed this exact same problem described in bpo-38119. Is removing the resource tracking entirely from SharedMemory the preferred way forward?

I think that there is some benefit to this resource tracking. I'm giving you a heads up that I'm currently working on a patch that resolves this issue and retains the resource tracker. I'll submit it as a PR when it is done.

@gvanrossum
Member

gvanrossum commented Aug 3, 2020

We don't actually know what went on in @applio's head when he wrote this patch, and he doesn't seem to respond to email. But, @DamianBarabonkovQC, before you do more work, can you explain what you are planning to do? (Honestly, I personally know nothing about the resource tracker; I am just hoping to help find a resolution of one type or another.) Probably the better place for your write-up would be in bpo-38119.

@vinay0410
Contributor Author

@DamianBarabonkovQC, I would say removing the resource tracker is not the best way forward, but it is the easiest way to fix some of the major issues plaguing shared_memory. You can see my comment detailing why Unix's shared memory will still be inconsistent with Windows.

Windows uses a reference counting mechanism to clean up shared memory segments: it only destroys/unlinks a segment once all processes using it are no longer running.

I have mentioned some methods for implementing a reference counting mechanism to make Unix's implementation consistent with Windows here.

Having said that, there are still major issues in shared_memory which this PR will fix. Consider the scenario mentioned here.
A similar scenario has also been mentioned here.

@DamianBarabonkovQC

@vinay0410, on this matter I agree that this PR is a good fix at present. I'm just thinking forward.

I like the idea of a reference count at the start of the shared memory segment, for its simplicity and clarity. My solution involves the /tmp directory, but as was mentioned in your post, that might not always work depending on the user's settings for /tmp.

@vinay0410
Contributor Author

@DamianBarabonkovQC, completely agree. I think for now this should be merged to fix some of the critical issues, and then we should have a discussion with the core developers about a better, more robust solution to this problem.

@DamianBarabonkovQC

Thanks for your input on this matter, @gvanrossum. Both Vinay and I agree that, for the time being, a temporary patch would be to simply remove the resource tracking from SharedMemory, since it is a source of issues in the current implementation. How could we get this patch accepted?

Moving forward, I have begun a discussion detailing a potential solution here. It is not a very invasive proposal, and it should implement resource tracking correctly.

I would really appreciate your thoughts on this matter, since maybe my proposal is more invasive than I think, or there is a better approach.

@gvanrossum
Member

You need to get someone to review it who actually understands shared memory. Since @tiran hasn't responded to your review request and @applio has not responded to repeated pleas, I'm not sure who to try next (I don't qualify; I don't understand this code at all). Maybe you can email @ericsnowcurrently and get him interested?

@vinay0410
Contributor Author

Hi @gvanrossum, thanks for suggesting possible reviewers. I dropped an email to @ericsnowcurrently 2 weeks ago, but he hasn't opened it yet. Do you have any other suggestions for getting this PR reviewed, since this is a relatively critical issue in multiprocessing.shared_memory?

@vinay0410 vinay0410 closed this Aug 28, 2020
@vinay0410
Contributor Author

Reopened to re-trigger CI/CD.

@vinay0410 vinay0410 reopened this Aug 28, 2020
@gvanrossum
Member

gvanrossum commented Aug 28, 2020 via email

@vinay0410 vinay0410 force-pushed the fix_shmem_resource_tracking branch from a061cd5 to 6d09ce7 on August 28, 2020 06:22
@vinay0410
Contributor Author

Hi @gvanrossum, I posted to the python-dev mailing list around a month ago (https://mail.python.org/archives/list/python-dev@python.org/message/O67CR7QWEOJ7WDAJEBKSY74NQ2C4W3AI/) but haven't received any response.
Are there any other next steps?

@gvanrossum
Member

gvanrossum commented Oct 3, 2020 via email

@vinay0410
Contributor Author

Sure, I will send you a reminder on October 19th.

@lthakur007

Hello team,
May I know when this PR will get merged?
Thank you!

@aeros aeros requested a review from pitrou October 18, 2020 22:20
@vinay0410
Contributor Author

Hi @gvanrossum, as suggested, I am reminding you to take up review of this PR in the Python core dev sprint.

@gvanrossum
Member

@vinay0410 Thanks for the ping. I've heard that @pitrou should be able to help out here.

@tomMoral
Contributor

tomMoral commented Oct 20, 2020

The idea of the resource_tracker is to add a safety net for shared_memory. Correct clean-up cannot always happen, either because processes can die abruptly or because users don't realize they need to clean up. On long-running machines -- such as servers or computing clusters -- these leaked resources can pile up and become a real issue. Moreover, this can be hard to debug, for instance when someone hits a "shared memory full" error that is actually caused by other users' leaked resources.

The resource_tracker avoids this by perhaps being overly conservative, but I don't think removing it completely is the right option. IMO, @pitrou's suggestion of making this configurable is probably the only way to easily allow both behaviours while explicitly leaving the duty of cleaning up the resource to the user -- adding an option that bypasses the resource_tracker for users who want to manage the shared memory themselves, while still being conservative in the default case.

As for the ref count at the beginning of the file, the issue is that I don't see how to ensure we detect that the resource should be cleaned up if a process holding a reference to it dies.

@vinay0410
Contributor Author

@tomMoral, for reference counting at the beginning of the file, one would still need the resource tracker to ensure that the shared memory segment is destroyed as soon as the reference count reaches 0, and the resource tracker would be responsible for incrementing/decrementing the reference count.

@tomMoral
Contributor

@vinay0410 The issue is with the child processes of the main process.

Only the main process is tracked by the resource_tracker. So if any of the child processes dies without decrementing the ref count, how does the resource tracker know how many child processes have incremented it?

Maybe it would be possible by having an internal ref count in the resource tracker -- allowing it to decrement the ref count by the number of processes connected to the resource_tracker that died -- but this is by no means easy and can easily lead to resource leaks or early destruction. So I am not sure it is a solution that can work.

Just thinking out loud: maybe another solution would be to have a ref count of the number of resource_trackers that hold a reference to the shared_memory. This would group the processes per block (main process and descendants), without changing the behaviour when there is only one block, and with a clean-up mechanism aware of the multiple blocks, tracked at the block level -- clean up (decref/unlink) once the main process dies.

@vinay0410
Contributor Author

vinay0410 commented Oct 20, 2020

Just thinking out loud: maybe another solution would be to have a ref count of the number of resource_trackers that hold a reference to the shared_memory. This would group the processes per block (main process and descendants), without changing the behaviour when there is only one block, and with a clean-up mechanism aware of the multiple blocks, tracked at the block level -- clean up (decref/unlink) once the main process dies.

@tomMoral I think I concur. If the shared memory segment is merely passed to a child process, the reference count shouldn't be increased, and it shouldn't be decreased when that child process dies. So basically, the reference count of a shared memory segment can only be changed by a process which attaches to it directly or creates it.

Is that what you were also thinking?

Also, I think child processes will have their own separate resource trackers, right?
Because shared memory segments created by child processes are still being tracked (I think).

@tomMoral
Contributor

tomMoral commented Oct 20, 2020

Also, I think child processes will have their own separate resource trackers, right?
Because shared memory segments created by child processes are still being tracked (I think).

Unless something weird happens, the resource tracker is shared across all descendants of a main process (see here for instance). If a child process creates a new segment, it is also tracked in the same resource_tracker.

@tomMoral I think I concur. If the shared memory segment is merely passed to a child process, the reference count shouldn't be increased, and it shouldn't be decreased when that child process dies. So basically, the reference count of a shared memory segment can only be changed by a process which attaches to it directly or creates it.

Is that what you were also thinking?

Actually, the issue is that passing the shared memory segment to a child process is the same as passing it to any other process: the name of the segment is sent, and a new SharedMemory object is created on the other side. We cannot have different behavior between these two cases. IMO, incrementing the refcount should be done by the resource_tracker. This ensures that we have only one reference count per "group of processes", as the resource_tracker is shared by all these processes. And if a register event is sent again for the same segment name, the refcount won't be increased.

@vinay0410
Contributor Author

@tomMoral, should I then start implementing the refcount using the resource tracker approach? Are you okay with that?

@tomMoral
Contributor

tomMoral commented Nov 3, 2020

I am okay with the approach of ref counting with the resource_tracker.

Just to summarize and make sure we are on the same page:

  • When creating a SharedMemory segment with create=True, reserve space for an integer that will be the ref count.
  • Whenever a SharedMemory is instantiated, call register in the resource_tracker.
  • The resource_tracker handles incrementing the refcount, and it only increments it when the name of the shared memory segment is unknown to it (i.e. once per group of processes). I am not sure which primitive can be used to count in an atomic way here. An option is to use a semaphore file (storing its name at the beginning of the shm segment), but that feels wasteful.
  • When a group of processes dies, if the segment still exists (it may have been unlinked by a direct call in another process), the refcount is decremented, and if it reaches 0, the resource_tracker calls unlink on the segment.

This seems to provide a simple enough mechanism to ensure that shared memory segments are not left unattended, while providing the flexibility to have multiple unrelated processes using them.

A potential backward-compatibility issue is with cross-language applications (for instance, people using a C program to access this shared memory). I don't know if there is a way to ensure that the refcount is not accidentally treated as part of the shared memory payload.
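The steps above can be sketched in terms of segment layout (a sketch only, not the PR's code; plain struct reads/writes stand in for the atomic operations a real implementation would need):

```python
import struct
from multiprocessing import shared_memory

REFCOUNT_SIZE = 4          # one 32-bit counter
user_size = 32             # payload size the user asked for

# Reserve extra space for the counter when creating the segment.
shm = shared_memory.SharedMemory(create=True, size=user_size + REFCOUNT_SIZE)

# Putting the counter at the end keeps user data at offset 0.
refcount_offset = shm.size - REFCOUNT_SIZE
struct.pack_into("=i", shm.buf, refcount_offset, 1)  # creator holds one ref

# A second group of processes attaching would increment the counter;
# the last group to die would see it reach 0 and call unlink().
count, = struct.unpack_from("=i", shm.buf, refcount_offset)

shm.close()
shm.unlink()
```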

@vinay0410
Contributor Author

vinay0410 commented Nov 3, 2020

I am not sure which primitive can be used to count in an atomic way here. An option is to use a semaphore file (storing its name at the beginning of the shm segment), but that feels wasteful.

@tomMoral, I was thinking of using the following primitives to count in an atomic way:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

These primitives ensure cross-process atomic updates, and some of them (namely __atomic_store_n and __atomic_load_n) are already used in the CPython source code to implement the GIL:

__atomic_store_n(&((ATOMIC_VAL)->_value), NEW_VAL, ORDER))

@tomMoral
Contributor

tomMoral commented Nov 3, 2020

@vinay0410 As these are low-level C primitives, the refcount seems complicated to implement and maintain -- but maybe I just don't see how you would use them, because the resource_tracker is implemented in pure Python. What is your idea for calling these primitives?

Also, do you think it is possible to shift the mmap or the memoryview by a certain number of bytes, to avoid writing over the refcount by mistake if it is at the beginning? Or maybe the refcount should be put at the end of the segment, as the mmap and memoryview can be restricted to a smaller size. Then the only danger comes from the fact that a C program could very well overwrite the refcount, but that is low-level stuff, so probably okay.

Note that it might be worth considering a named semaphore (which is typically designed for such atomic counting), adding at the end of the file the name of the semaphore and the length of that name so it can be read back. This would allow implementing everything in Python (Semaphore and pickle). I can try to do a basic implementation if you want.

@pitrou
Member

pitrou commented Nov 3, 2020

As these are low-level C primitives, the refcount seems complicated to implement and maintain

It would be easy enough to expose Python wrappers for the atomic intrinsics in the _multiprocessing module, IMHO.

Note that you need atomic increments and decrements, not mere loads and stores. This is provided in a portable way by the remarkable portable-snippets library: https://github.com/nemequ/portable-snippets/tree/master/atomic

Or maybe the refcount should be put at the end of the segment, as the mmap and memoryview can be restricted to a smaller size.

That sounds like the best solution, since it would maintain alignment (though I'm not sure it matters a lot).

cc @aeros, by the way.

@vinay0410
Contributor Author

@pitrou ,

Note that you need atomic increments and decrements, not mere loads and stores.

That's correct; I am planning to use __atomic_add_fetch and __atomic_sub_fetch.

Or maybe the refcount should be put at the end of the segment, as the mmap and memoryview can be restricted to a smaller size.

That sounds like the best solution, since it would maintain alignment (though I'm not sure it matters a lot).

In shared_memory.py, I was hoping to replace

self._buf = memoryview(self._mmap)

with:

self._buf = memoryview(self._mmap)
self._ref_count = self._buf[0:4]
self._public_buf = self._buf[4:]

And then replace self._buf with self._public_buf here.

Finally, I will pass self._ref_count to Python functions which atomically increment or decrement the refcount, implemented using C builtins.
cc: @tomMoral
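The slicing idea can be illustrated standalone with a plain bytearray standing in for the mmap'd segment (a hypothetical 4-byte counter at offset 0):

```python
# A plain bytearray stands in for the mmap'd shared memory segment.
raw = bytearray(16)
buf = memoryview(raw)

ref_count = buf[0:4]    # private view over the counter bytes
public_buf = buf[4:]    # the view handed out to users

public_buf[:5] = b"hello"

# Writes through the public view never touch the counter bytes.
assert bytes(ref_count) == b"\x00\x00\x00\x00"
assert raw[4:9] == b"hello"
```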

@vinay0410
Contributor Author

vinay0410 commented Nov 6, 2020

Hi @pitrou and @tomMoral, I have raised PR #23174, implementing the above solution. I have not updated tests and documentation yet, because I first wanted some initial feedback on the approach I am using.

I have added functions _posixshmem.shm_inc_refcount(memoryview) and _posixshmem.shm_dec_refcount(memoryview) to atomically increment and decrement the integer referenced by the passed memoryview object.

@aeros
Contributor

aeros commented Nov 8, 2020

@tomMoral

The issue is with the child processes of the main process.

Only the main process is tracked by the resource_tracker. So if any of the child processes dies without decrementing the ref count, how does the resource tracker know how many child processes have incremented it?

Maybe it would be possible by having an internal ref count in the resource tracker -- allowing it to decrement the ref count by the number of processes connected to the resource_tracker that died -- but this is by no means easy and can easily lead to resource leaks or early destruction. So I am not sure it is a solution that can work.

I'm not familiar enough with resource_tracker to know for certain whether it would work, but one possible way to handle this could be a SIGCHLD handler that decrements the internal refcount whenever a child process dies or is stopped. Some precaution might have to be taken, though, to ensure that if a handler is already set by an external program, it is restored after all of the child processes have been accounted for.
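A rough sketch of that idea (POSIX only; the handler name and the refcount variable are illustrative, not resource_tracker internals):

```python
import os
import signal
import time

internal_refcount = 2  # pretend two processes currently hold references

def on_child_exit(signum, frame):
    """Reap exited children and decrement the refcount for each one."""
    global internal_refcount
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left to reap
        if pid == 0:
            break  # children exist but none have exited yet
        internal_refcount -= 1

# Save whatever handler an external program may have installed.
previous = signal.signal(signal.SIGCHLD, on_child_exit)
try:
    if os.fork() == 0:
        os._exit(0)  # child: exit immediately
    # Parent: wait (with a timeout) for the handler to account for the child.
    deadline = time.time() + 5
    while internal_refcount == 2 and time.time() < deadline:
        time.sleep(0.01)
finally:
    # Restore the previous handler once children are accounted for.
    signal.signal(signal.SIGCHLD,
                  previous if previous is not None else signal.SIG_DFL)
```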

@vinay0410
Contributor Author


Hi @pitrou and @tomMoral, a preliminary review of PR #23174, which implements the reference counting mechanism discussed above, would be very helpful. Once everyone is okay with the current implementation approach, I will get on to writing tests.

Thanks.

devmessias added a commit to devmessias/fury that referenced this pull request Jun 16, 2021
Related with
 - "resource tracker destroys shared memory segments when other processes should still have valid access"
   https://bugs.python.org/issue38119
 - python pr Fix shmem resource tracking
   python/cpython#21516

The monkey-patch solution is due to Álvaro Justen (@turicas)
(https://bugs.python.org/msg388287)
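The monkey-patch referenced in that commit message amounts to making the resource tracker ignore the "shared_memory" resource type. Roughly (a sketch of the approach from msg388287, for Python 3.8+; cleanup then becomes the caller's responsibility):

```python
from multiprocessing import resource_tracker

def remove_shm_from_resource_tracker():
    """Stop the resource tracker from registering (and later auto-unlinking)
    SharedMemory segments."""
    def fix_register(name, rtype):
        if rtype == "shared_memory":
            return  # skip tracking for shared memory segments
        return resource_tracker._resource_tracker.register(name, rtype)
    resource_tracker.register = fix_register

    def fix_unregister(name, rtype):
        if rtype == "shared_memory":
            return
        return resource_tracker._resource_tracker.unregister(name, rtype)
    resource_tracker.unregister = fix_unregister

    # Also drop the cleanup function so nothing is unlinked at exit.
    resource_tracker._CLEANUP_FUNCS.pop("shared_memory", None)

# Call this before creating or attaching to any SharedMemory objects.
remove_shm_from_resource_tracker()
```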
devmessias added a commit to devmessias/fury that referenced this pull request Jul 30, 2021 (same message as above)
devmessias added a commit to devmessias/fury that referenced this pull request Apr 5, 2022 (same message as above)
@MaxwellDupre
Contributor

Usual multiprocessing tests:
Ran 255 tests in 149.271s
OK (skipped=26)
Ran 39 tests in 24.086s
OK
Ran 255 tests in 66.529s
OK (skipped=27)
Ran 255 tests in 79.264s
OK (skipped=23)

But they don't include the test added here, so running _test_multiprocessing.py:
Ran 35 tests in 21.616s
OK (skipped=2)
Looks ok.

@Sxderp

Sxderp commented Oct 24, 2022

I thought if you wanted management for shared memory you'd use a SharedMemoryManager. Does that not already solve the problem of memory cleanup?

@satishskamath

Any updates on when this PR will be merged?

@gpshead gpshead marked this pull request as draft November 30, 2023 08:28
@gpshead gpshead changed the title bpo-38119: Fix shmem resource tracking bpo-38119: Disable resource tracking & auto-destruction of SharedMemory blocks on POSIX Nov 30, 2023
@gpshead gpshead changed the title bpo-38119: Disable resource tracking & auto-destruction of SharedMemory blocks on POSIX gh-82300: Disable resource tracking & auto-destruction of SharedMemory blocks on POSIX Nov 30, 2023
@gpshead
Member

gpshead commented Dec 5, 2023

Closing - we went with adding a track=True parameter to SharedMemory to at least allow users to disable the tracking when necessary. If you believe there is a case to be made for changing the default after a deprecation period, discuss that on the issue.
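With that change (the track parameter landed in Python 3.13), opting out looks roughly like this; the version guard below is only so the sketch also runs on older interpreters:

```python
import sys
from multiprocessing import shared_memory

if sys.version_info >= (3, 13):
    # track=False: the resource tracker never registers this segment, so it
    # will not be auto-unlinked; cleanup is entirely the caller's job.
    shm = shared_memory.SharedMemory(create=True, size=64, track=False)
    shm.buf[:5] = b"hello"
    data = bytes(shm.buf[:5])
    shm.close()
    shm.unlink()  # must be called explicitly by exactly one process
else:
    data = b"hello"  # placeholder on interpreters without track=
```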

@gpshead gpshead closed this Dec 5, 2023