Skip to content

bpo-43142: Do not add duplicate FDs to list in duplicate_for_child() #24461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

imaginary-person
Copy link

@imaginary-person imaginary-person commented Feb 6, 2021

bpo-43142: On Linux, while spawning another process, if a process that has subclassed Python's multiprocessing module calls multiprocessing.reduction.DupFd() (in /Lib/multiprocessing/reduction.py) multiple times on the same File descriptor by mistake, then it ends up invoking duplicate_for_child(fd)) of the class Popen, which, in turn, adds a duplicate FD to a list.
This list is then used in spawnv_passfds() in /Lib/multiprocessing/util.py, which uses that list as an argument in a call of _posixsubprocess.fork_exec().

In Modules/_posixsubprocess.c, checks exist to ensure that all the FDs in the list are unique, positive, and in a sorted order.
If duplicate entries are found, a ValueError: bad value(s) in fds_to_keep exception is raised, and the execution is terminated.
Even if this exception were somehow handled, a developer can't change the aforementioned FD list in Python without breaking modularity. Moreover, the multiprocessing library should be robust enough to not pass duplicate FDs while calling _posixsubprocess.fork_exec().

It's better to let developers get away with calling multiprocessing.reduction.DupFd() (in /Lib/multiprocessing/reduction.py) multiple times on some file descriptor by not adding duplicate entries to the list that's passed as a parameter in the call to _posixsubprocess.fork_exec(), so that spawning another process would be successful.
Developers can then handle any other situations related to using FDs at their end, because the reason they would call multiprocessing.reduction.DupFd() multiple times on the same FD would be related to their particular use-case, and they can use a workaround for it.

https://bugs.python.org/issue43142

While spawning, if the code built on top of python's multiprocessing module calls DupFd (in  /Lib/multiprocessing/reduction.py) multiple times on the same File descriptor by mistake, then it ends up invoking duplicate_for_child(fd)) of the class Popen, which, in turn, adds a duplicate FD to a list.
This list is then used in spawnv_passfds() in multiprocessing/util.py, which uses that list as an argument in a call of _posixsubprocess.fork_exec().

In Modules/_posixsubprocess.c, checks exist to ensure that all the FDs in the list are unique, positive, and in a sorted order.
If duplicate entries are found, a 'ValueError: bad value(s) in fds_to_keep' exception is raised, and the execution is terminated because this exception is not handled by Python.
Even if it were, a developer can't change the aforementioned FD list in Python without breaking modularity.

It's better to let developers get away with calling DupFd (in  /Lib/multiprocessing/reduction.py) by not accepting duplicate entries in the first place,
so that spawning a process can be successful.
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@imaginary-person

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

@imaginary-person imaginary-person changed the title Do not add duplicate FDs in duplicate_for_child() bpo-43142: Do not add duplicate FDs to list in duplicate_for_child() Feb 6, 2021
blurb-it bot and others added 2 commits February 6, 2021 06:43
All tests except the news item addition had passed the first time around, so need to re-run CI.
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Mar 12, 2021
Copy link
Contributor

@MaxwellDupre MaxwellDupre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

405 tests OK.
1 test failed:
test_ssl
Looks ok.

@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Aug 3, 2022
@NightMachinery
Copy link

Any updates on this? I am getting an error that this PR seems to fix:

ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 for model_mode in tqdm(metrics_model_modes):

File ~/mambaforge/lib/python3.10/site-packages/tqdm/std.py:665, in tqdm.__new__(cls, *_, **__)
    663 def __new__(cls, *_, **__):
    664     instance = object.__new__(cls)
--> 665     with cls.get_lock():  # also constructs lock if non-existent
    666         cls._instances.add(instance)
    667         # create monitoring thread

File ~/mambaforge/lib/python3.10/site-packages/tqdm/std.py:764, in tqdm.get_lock(cls)
    762 """Get the global lock. Construct it if it does not exist."""
    763 if not hasattr(cls, '_lock'):
--> 764     cls._lock = TqdmDefaultWriteLock()
    765 return cls._lock

File ~/mambaforge/lib/python3.10/site-packages/tqdm/std.py:97, in TqdmDefaultWriteLock.__init__(self)
     95 if root_lock is not None:
     96     root_lock.acquire()
---> 97 cls.create_mp_lock()
     98 self.locks = [lk for lk in [cls.mp_lock, cls.th_lock] if lk is not None]
     99 if root_lock is not None:

File ~/mambaforge/lib/python3.10/site-packages/tqdm/std.py:121, in TqdmDefaultWriteLock.create_mp_lock(cls)
    119 try:
    120     from multiprocessing import RLock
--> 121     cls.mp_lock = RLock()
    122 except (ImportError, OSError):  # pragma: no cover
    123     cls.mp_lock = None

File ~/mambaforge/lib/python3.10/multiprocessing/context.py:73, in BaseContext.RLock(self)
     71 '''Returns a recursive lock object'''
     72 from .synchronize import RLock
---> 73 return RLock(ctx=self.get_context())

File ~/mambaforge/lib/python3.10/multiprocessing/synchronize.py:187, in RLock.__init__(self, ctx)
    186 def __init__(self, *, ctx):
--> 187     SemLock.__init__(self, RECURSIVE_MUTEX, 1, 1, ctx=ctx)

File ~/mambaforge/lib/python3.10/multiprocessing/synchronize.py:80, in SemLock.__init__(self, kind, value, maxvalue, ctx)
     75 if self._semlock.name is not None:
     76     # We only get here if we are on Unix with forking
     77     # disabled.  When the object is garbage collected or the
     78     # process shuts down we unlink the semaphore name
     79     from .resource_tracker import register
---> 80     register(self._semlock.name, "semaphore")
     81     util.Finalize(self, SemLock._cleanup, (self._semlock.name,),
     82                   exitpriority=0)

File ~/mambaforge/lib/python3.10/multiprocessing/resource_tracker.py:155, in ResourceTracker.register(self, name, rtype)
    153 def register(self, name, rtype):
    154     '''Register name of resource with resource tracker.'''
--> 155     self._send('REGISTER', name, rtype)

File ~/mambaforge/lib/python3.10/multiprocessing/resource_tracker.py:162, in ResourceTracker._send(self, cmd, name, rtype)
    161 def _send(self, cmd, name, rtype):
--> 162     self.ensure_running()
    163     msg = '{0}:{1}:{2}\n'.format(cmd, name, rtype).encode('ascii')
    164     if len(name) > 512:
    165         # posix guarantees that writes to a pipe of less than PIPE_BUF
    166         # bytes are atomic, and that PIPE_BUF >= 512

File ~/mambaforge/lib/python3.10/multiprocessing/resource_tracker.py:129, in ResourceTracker.ensure_running(self)
    127     if _HAVE_SIGMASK:
    128         signal.pthread_sigmask(signal.SIG_BLOCK, _IGNORED_SIGNALS)
--> 129     pid = util.spawnv_passfds(exe, args, fds_to_pass)
    130 finally:
    131     if _HAVE_SIGMASK:

File ~/mambaforge/lib/python3.10/multiprocessing/util.py:452, in spawnv_passfds(path, args, passfds)
    450 errpipe_read, errpipe_write = os.pipe()
    451 try:
--> 452     return _posixsubprocess.fork_exec(
    453         args, [os.fsencode(path)], True, passfds, None, None,
    454         -1, -1, -1, -1, -1, -1, errpipe_read, errpipe_write,
    455         False, False, None, None, None, -1, None)
    456 finally:
    457     os.close(errpipe_read)

ValueError: bad value(s) in fds_to_keep

@python-cla-bot
Copy link

The following commit authors need to sign the Contributor License Agreement:

CLA signed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants