Supporting out-of-band buffers (pickle protocol 5) in multiprocessing #89467

jakirkham mannequin opened this issue Sep 27, 2021 · 2 comments
Labels
3.13 (bugs and security fixes), performance (Performance or resource usage), stdlib (Python modules in the Lib dir), topic-IO, topic-multiprocessing, type-feature (A feature request or enhancement)

Comments

@jakirkham
Mannequin

jakirkham mannequin commented Sep 27, 2021

BPO 45304
Nosy @jakirkham

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-09-27.19:04:19.301>
labels = ['3.8', '3.9', 'expert-IO', 'performance', '3.11', 'library', '3.10']
title = 'Supporting out-of-band buffers (pickle protocol 5) in multiprocessing'
updated_at = <Date 2021-09-27.19:04:19.301>
user = 'https://github.com/jakirkham'

bugs.python.org fields:

activity = <Date 2021-09-27.19:04:19.301>
actor = 'jakirkham'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)', 'IO']
creation = <Date 2021-09-27.19:04:19.301>
creator = 'jakirkham'
dependencies = []
files = []
hgrepos = []
issue_num = 45304
keywords = []
message_count = 1.0
messages = ['402736']
nosy_count = 1.0
nosy_names = ['jakirkham']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue45304'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']

@jakirkham
Mannequin Author

jakirkham mannequin commented Sep 27, 2021

Python 3.8 added pickle protocol 5 (PEP 574), which supports out-of-band buffer collection[1]. The idea is that when pickling an object with a large amount of data attached to it (an array, a dataframe, etc.), that data can be collected alongside the normal pickle stream without being copied. This is particularly important when serializing data for communication between two Python instances. IOW this is quite valuable when using a multiprocessing.pool.Pool[2] or a concurrent.futures.ProcessPoolExecutor[3]. However, AFAICT neither of these leverages this functionality[4][5]. To ensure zero-copy processing of large data, it would be helpful for both of these pools to use pickle protocol 5 with out-of-band buffers.

[1] https://docs.python.org/3/library/pickle.html#pickle-oob
[2] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
[3] https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor
[4]
obj = _ForkingPickler.dumps(obj)

[5]
obj = _ForkingPickler.dumps(obj)
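
For readers unfamiliar with the mechanism, here is a minimal sketch of out-of-band buffer collection using only the standard `pickle` module. It is not taken from multiprocessing and does not show how a pool would transport the buffers; the variable names (`payload`, `in_band`, `out_of_band`, `buffers`) are illustrative assumptions. It only contrasts pickling a large writable buffer in-band with handing it to a `buffer_callback`.

```python
import pickle

# A large, writable buffer standing in for array or dataframe data.
payload = bytearray(b"x" * 1_000_000)

# In-band: without a buffer_callback, the data is copied into the pickle stream.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# Out-of-band: the pickler hands each PickleBuffer to the callback instead of
# embedding its bytes, so the stream holds only a small placeholder.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(payload), protocol=5, buffer_callback=buffers.append
)

# The consumer must supply the same buffers, in the same order, when loading.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band) > len(payload))    # the in-band stream contains the data
print(len(out_of_band) < 100)         # the out-of-band stream is tiny
print(memoryview(restored).obj is payload)  # same memory, no copy was made
```

Within a single process the restored object refers to the original memory. Between processes the collected buffers still have to be sent separately from the pickle stream, which is the part this issue asks multiprocessing to support.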

@jakirkham jakirkham mannequin added 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir topic-IO performance Performance or resource usage labels Sep 27, 2021
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel iritkatriel added type-feature A feature request or enhancement 3.12 only security fixes and removed 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes 3.8 (EOL) end of life labels Sep 11, 2022
@iritkatriel
Member

See also #84895.

@erlend-aasland erlend-aasland added 3.13 bugs and security fixes and removed 3.12 only security fixes labels Jan 5, 2024