Deprecate Process Child Watchers #82772
Proposal:

Motivation: Although Andrew suggested the idea of deprecating the process watchers subsystem entirely, I think it would be adequate to start with just deprecating MultiLoopChildWatcher and proceeding from there. This is because the other three watcher implementations are significantly more stable in comparison (based on the number of issues), have less complex implementations, and have far more widespread usage across public repositories. I would not be opposed to deprecating the other alternative watchers as well, but MultiLoopChildWatcher likely has the most justification for removal.

Details: Current unresolved MultiLoopChildWatcher issues: (3 out of the 5 open process watcher issues are caused by MultiLoopChildWatcher)

GitHub code usage comparison: Note that the above results also include matches in the CPython repository and in repositories that contain exact copies of Lib/asyncio/unix_events.py. Also, ThreadedChildWatcher likely has significantly fewer results since it's already the default watcher returned by asyncio.get_child_watcher(), so there's less need to explicitly declare it compared to the others. There are of course private repositories and non-GitHub repositories this doesn't include, but it should provide a general idea of overall usage.

My experience in interacting with asyncio users is similar to the results. The process watchers subsystem seems to be minimally used compared to the rest of asyncio, with ThreadedChildWatcher likely being the most used as the default returned from asyncio.get_child_watcher().

I would be interested in working on this issue if it is approved, as I think it would provide a significant long-term reduction in maintenance for asyncio, allowing us to focus on the improvement and development of other features that benefit a far larger audience.
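As context for the "explicitly declare" point above, here is a minimal sketch (my own illustration, not taken from the issue; Unix-only, using the 3.8-era API) of the kind of call the GitHub code search would match on: opting in to a non-default watcher before running subprocesses.

import asyncio

async def main():
    proc = await asyncio.create_subprocess_exec("echo", "hello")
    await proc.wait()

# Explicit opt-in; without this call, asyncio.get_child_watcher() lazily
# creates the default ThreadedChildWatcher the first time it is needed.
asyncio.set_child_watcher(asyncio.FastChildWatcher())
asyncio.run(main())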
Didn't we just add MultiLoopChildWatcher in 3.8?
Yep, that's correct: https://docs.python.org/3/library/asyncio-policy.html#asyncio.MultiLoopChildWatcher I wasn't aware of that, but it would explain the current lack of usage. I'm generally in favor of Andrew's idea to deprecate the watchers subsystem, but perhaps if we go through with that we should remove MultiLoopChildWatcher from 3.8 (if it's not too late to do so) instead of deprecating it.
I'll leave that up to Andrew to decide, but I'd be +1 to drop it asap, especially if we want to eventually deprecate watchers. Speaking of watchers -- big +1 from me to drop them all at some point. I would start as early as 3.9. Linux has pidfd now, freebsd/macos has kqueue, windows has its own apis for watching processes. Threads can be the backup method for OSes that lack proper APIs for watching multiple processes (without using SIGCHLD etc).
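For reference, this is roughly what the kqueue approach mentioned above looks like in its simplest blocking form (a sketch of my own, BSD/macOS only, using the select module's kqueue bindings; it is not tied to any asyncio watcher implementation):

import os
import select

def wait_for_child(pid):
    # Ask the kernel for a one-shot NOTE_EXIT event on this pid instead of
    # relying on a SIGCHLD handler.
    kq = select.kqueue()
    # kevent fields: ident, filter, flags, fflags
    event = select.kevent(pid,
                          select.KQ_FILTER_PROC,
                          select.KQ_EV_ADD | select.KQ_EV_ONESHOT,
                          select.KQ_NOTE_EXIT)
    try:
        kq.control([event], 1, None)  # block until the child exits
    finally:
        kq.close()
    _, status = os.waitpid(pid, 0)    # reap the zombie and get its status
    return status

pid = os.posix_spawnp("sleep", ["sleep", "1"], os.environ)
print("wait status:", wait_for_child(pid))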
Yeah that was my initial plan, to start the deprecation in 3.9 and finalize the removal in 3.11. We might be able to make an exception for MultiLoopChildWatcher and just remove it from 3.8 though. I think it will be necessary to fix the issues in the near future if we want to keep MultiLoopChildWatcher in 3.8 (as they apply to 3.8 and 3.9). I would certainly not be ecstatic about the idea of removing something that was just recently added (similar to how we had to revert the new streaming changes due to API design), but I'm even less in favor of keeping something around that's not stable in a final release. Based on my investigation of https://bugs.python.org/issue38323, it seems like it will be a rather complex issue to solve.
Kyle, why are you resetting the status to "Pending"?
Your opinion on this is duly noted. It would be great to hear what Andrew and Victor think about this.
That was an accident, I think I had that set already on the page and submitted my comment just after you did yours. I'm going to change the title of the issue to "Deprecate Process Child Watchers". I started the proposal a bit more conservatively because I thought that it might be easier to just deprecate MultiLoopChildWatcher at first. But seeing as it was just added in 3.8, I don't think it would make as much sense to deprecate that one by itself.
ThreadedChildWatcher starts a thread per process but has O(1) complexity.

MultiLoopChildWatcher doesn't spawn threads and can be used safely with asyncio loops spawned in multiple threads. The complexity is O(N), plus no other code should contend for the SIGCHLD subscription.

FastChildWatcher has O(1) complexity, which is the good news. Everything else about it is bad: the watcher conflicts even with blocking subprocess calls made elsewhere in the process.

SafeChildWatcher is safer than FastChildWatcher, but working with the asyncio subprocess API is still super complicated if asyncio code is running from multiple threads. SafeChildWatcher works well only if asyncio is run from the main thread only. Complexity is O(N).

I think FastChildWatcher and SafeChildWatcher should go, ThreadedChildWatcher should be kept as the default, and MultiLoopChildWatcher is an option where ThreadedChildWatcher is not satisfactory. MultiLoopChildWatcher's problems can and should be fixed; there is nothing bad in the idea, just a slightly imperfect implementation.

Regarding pidfd and kqueue -- I'd love to see pull requests with proposals. Right now nothing exists.
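As a rough illustration of where that conflict comes from, here is a sketch of the two SIGCHLD reaping strategies being compared (my own simplification, not the actual asyncio code; callbacks is a hypothetical pid-to-callback mapping):

import os

def reap_any_child(callbacks):
    # FastChildWatcher-style: O(1) per exited child, but waitpid(-1) reaps
    # *any* child of the process, including children that other code is
    # waiting on with its own wait()/waitpid() call.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children at all
        if pid == 0:
            return  # no child has exited yet
        callback = callbacks.pop(pid, None)
        if callback is not None:
            callback(pid, status)
        # else: we just consumed an exit status that belonged to someone else

def reap_known_children(callbacks):
    # SafeChildWatcher-style: O(N) over the tracked pids, but it only ever
    # waits on children that were registered with the watcher.
    for pid in list(callbacks):
        try:
            wpid, status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            wpid, status = pid, 255  # already reaped elsewhere
        if wpid == 0:
            continue  # this child is still running
        callbacks.pop(pid)(pid, status)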
Okay, I think I can understand the reasoning here. Do you think that FastChildWatcher and SafeChildWatcher could be deprecated starting in 3.9 and removed in 3.11? If so, I'd be glad to start working on adding the deprecation warnings and the 3.9 What's New entries.
Yeah, my largest concern was just that the current issues seem especially complex to fix, and I interpreted your previous comment as "I'd like to deprecate all the process watchers in the near future". Thus, it seemed to make sense to me that we could start removing them rather than sinking time into fixing something that might be removed soon. By "slightly imperfect implementation", do you have any ideas for particularly imperfect parts of it that could use improvement? I feel that I've developed a decent understanding of the implementation for ThreadedChildWatcher, but after looking at the race conditions for MultiLoopChildWatcher in https://bugs.python.org/issue38323, I'll admit that I felt a bit lost as to where to find a solution, primarily because my understanding of the signal module is quite limited in comparison to other areas; it's not one that I've strongly focused on.
But it spawns a new Python thread per process, which can be a blocker issue if server memory is limited. What if you want to spawn 100 processes? Or 1000 processes? What is the memory usage? I like FastChildWatcher!
Well, I like the ability to choose the child watcher implementation depending on my use case. If asyncio is only run from the main thread, FastChildWatcher is safe, fast, and has a low memory footprint, no?
I hope that it will take less time to expose pidfd_open() in Python! It is *already* available in the Linux kernel 5.3. My laptop is already running Linux 5.3! (Thanks Fedora 30.) I would prefer to keep an API to choose the child watcher since it seems like even in 2019, there are still new APIs (pidfd) to wait for process completion. I expect that each implementation will have advantages and drawbacks.
My non-LTS Ubuntu also has the 5.3 kernel, but I'm talking about the oldest supported RHEL/CentOS. That's why pidfd_open() cannot be the single implementation. It's so new that my local man-pages installation doesn't have an entry for the API yet (but the web does: http://man7.org/linux/man-pages/man2/pidfd_open.2.html).
Unfortunately, no. FastChildWatcher is safe if you can guarantee that no code executed in asyncio main thread AND thread pools spawn subprocesses. Otherwise, the whole Python process becomes broken by the race condition between FastChildWatcher and any other wait()/waitpid()/waitid() call.
Am I misunderstanding something here or is this supposed to be "FastChildWatcher is safe if you can guarantee that no code executed *outside of* the asyncio main thread AND ..."? Alternatively, "FastChildWatcher is safe if you can guarantee that code *only* executed in the asyncio main thread". Both of the above have the same functional meaning. I think it was a typo, but I just wanted to make sure because the distinction from the original makes a functional difference in this case. Also, regarding the second part "thread pools spawn subprocesses", is that to say that subprocesses can only spawn within the thread pools? As in, FastChildWatcher becomes unsafe if subprocesses are spawned from anywhere else? These answers may be fairly obvious to someone familiar with the internals of FastChildWatcher, but it may not be overly clear to someone such as myself who has mostly just read through the documentation and looked over the implementation briefly. I'm only familiar with the internals of ThreadedChildWatcher.
I understand that there's *some* overhead associated with spawning a new thread, but from my impression it's not substantial enough to make a significant impact in most cases. Each individual instance of threading.Thread is only 64 bytes. Have you seen any recent cases where the server memory is limited enough that the memory cost of spawning an additional thread per subprocess becomes the limiting factor?
FWIW, I started implementing a pidfd-based child process watcher over on bpo-38692.
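For readers unfamiliar with the pidfd approach, here is a minimal sketch of the underlying idea (my own illustration, not the bpo-38692 implementation; it assumes Linux >= 5.3 and Python >= 3.9 for os.pidfd_open() and os.waitstatus_to_exitcode()):

import asyncio
import os

async def wait_for_child(pid):
    # Wait for pid to exit using a pidfd instead of SIGCHLD or an extra thread.
    loop = asyncio.get_running_loop()
    pidfd = os.pidfd_open(pid)
    exited = loop.create_future()

    def on_exit():
        loop.remove_reader(pidfd)   # the fd stays readable; unregister once
        if not exited.done():
            exited.set_result(None)

    loop.add_reader(pidfd, on_exit)  # a pidfd becomes readable when the child exits
    try:
        await exited
    finally:
        loop.remove_reader(pidfd)    # no-op if on_exit already removed it
        os.close(pidfd)
    _, status = os.waitpid(pid, 0)   # reap the child
    return os.waitstatus_to_exitcode(status)

async def main():
    pid = os.posix_spawnp("sleep", ["sleep", "1"], os.environ)
    print("exit code:", await wait_for_child(pid))

asyncio.run(main())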
Although I think this still stands to some degree, I will have to rescind the claim that each instance of threading.Thread is only 64 bytes. The 64 bytes was measured by sys.getsizeof(), which doesn't account for referenced objects. In order to get a better estimate, I implemented a custom get_size() function that recursively adds the size of the object and all unique objects from gc.get_referents() (ignoring several redundant and/or unnecessary types). For more details, see https://gist.github.com/aeros/632bd035b6f95e89cdf4bb29df970a2a (a rough sketch of the approach follows this comment). Feel free to critique it if there are any apparent issues (for the purpose of measuring the size of threads). Then, I used this function on three different threads, to figure out how much memory was needed for each one:

Python 3.8.0+ (heads/3.8:1d2862a323, Nov 4 2019, 06:59:53)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import threading
>>> from get_size import get_size
>>> a = threading.Thread()
>>> b = threading.Thread()
>>> c = threading.Thread()
>>> get_size(a)
3995
>>> get_size(b)
1469
>>> get_size(c)
1469

1469 bytes seems to be roughly the amount of additional memory required for each new thread, at least on Linux kernel 5.3.8 and Python 3.8. I don't know if this is 100% accurate, but it at least provides an improved estimate over sys.getsizeof().
From my understanding, ~1.5KB/thread seems to be quite negligible for most modern equipment. The server's memory would have to be very limited for spawning an additional 1000 threads to be a bottleneck/blocker issue:

Python 3.8.0+ (heads/3.8:1d2862a323, Nov 4 2019, 06:59:53)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import threading
>>> from get_size import get_size
>>> threads = []
>>> for _ in range(1000):
... th = threading.Thread()
... threads.append(th)
...
>>> get_size(threads)
1482435 (~1.5MB)

Victor (or anyone else), in your experience, would the additional ~1.5KB per process be an issue for 99% of production servers? If not, it seems to me like the additional maintenance cost of keeping SafeChildWatcher and FastChildWatcher in asyncio's API wouldn't be worthwhile.
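For reference, here is a rough sketch of the kind of get_size() helper described above (an assumption based on the comment's description, not the exact code from the linked gist; the ignored types are illustrative):

import gc
import sys
import threading
import types

def get_size(obj):
    # Sum sys.getsizeof() over the object and everything reachable through
    # gc.get_referents(), skipping types that would pull in unrelated parts
    # of the interpreter (classes, modules, functions).
    ignore = (type, types.ModuleType, types.FunctionType)
    seen = set()
    total = 0
    stack = [obj]
    while stack:
        current = stack.pop()
        if isinstance(current, ignore) or id(current) in seen:
            continue
        seen.add(id(current))
        total += sys.getsizeof(current)
        stack.extend(gc.get_referents(current))
    return total

print(get_size(threading.Thread()))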
You have to account also for the thread stack size. I suggest looking at RSS memory instead.
I measured the RSS memory per thread: it's around 13.2 kB/thread. Oh, that's way lower than what I expected. Script:

# 10: 9636 kB => 9756 kB: +12 kB/thread
import os
import threading

def display_rss():
    os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")

def wait(event):
    event.wait()

class Thread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.stop_event = threading.Event()
        self.started_event = threading.Event()

    def run(self):
        self.started_event.set()
        self.stop_event.wait()

    def stop(self):
        self.stop_event.set()
        self.join()

def main():
    nthread = 1000
    display_rss()
    threads = [Thread() for i in range(nthread)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.started_event.wait()
    display_rss()
    for thread in threads:
        thread.stop()

main()
To be clear: I mean that FastChildWatcher is safe only if all of the process's code spawns subprocesses through FastChildWatcher. I just think that this particular watcher is too dangerous.
Ah, good point. I believe get_size() only accounts for the memory usage of the thread object, not the amount allocated in physical memory from the thread stack. Thanks for the clarification.
On Python 3.8 and Linux kernel 5.3.8, I received a similar result: roughly an additional 13224 kB of RSS for 1k threads, which reflects the ~13.2 kB/thread estimate. Also, as a sidenote, I think you could remove the "for thread in threads: thread.started_event.wait()" portion for our purposes. IIUC, waiting on the threading.Event objects wouldn't affect memory usage.
So are we at least in agreement on starting with deprecating FastChildWatcher? If a server is incredibly tight on memory and can't spare ~13.2kB/thread, SafeChildWatcher would be an alternative to ThreadedChildWatcher. Personally, I still think this amount is negligible for most production servers, and that we can reasonably deprecate SafeChildWatcher as well. But I can start with FastChildWatcher.
I think so. It will take a long time before we remove it though.
In that case, it could be a long-term deprecation notice, where we start the deprecation process without having a definitive removal version. This will at least encourage users to look towards using the other watchers instead of FastChildWatcher. I can start working on a PR.
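A rough sketch of what such a PR could add (an illustration only, using a hypothetical stand-in class rather than the real one in Lib/asyncio/unix_events.py):

import warnings

class FastChildWatcher:  # stand-in for asyncio's real FastChildWatcher
    def __init__(self):
        # Warn at construction time so existing code keeps working while
        # users are nudged toward ThreadedChildWatcher.
        warnings.warn(
            "FastChildWatcher is deprecated; use ThreadedChildWatcher instead",
            DeprecationWarning,
            stacklevel=2,
        )

FastChildWatcher()  # emits a DeprecationWarning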
This should probably be labeled as 3.12