bpo-28053: Allow custom reducer when using multiprocessing#15058
bpo-28053: Allow custom reducer when using multiprocessing#15058pierreglaser wants to merge 32 commits intopython:mainfrom
Conversation
|
You will notice that for the def process(self, *args, **kwargs):
return self._Process(*args, reducer=self.get_reducer(), **kwargs)This comes from the fact that the cpython/Lib/multiprocessing/context.py Lines 220 to 224 in 7397cda For the process class, passing the class Process(process.BaseProcess):
_start_method = None
@staticmethod
def _Popen(process_obj):
if self._ctx is not None:
return self._ctx.get_context()._Process._Popen(process_obj)
return _default_context.get_context()._Process._Popen(process_obj)with So implementation-wise, it is easier to simply pass the But I am opened to suggestions. |
Lib/multiprocessing/context.py
Outdated
There was a problem hiding this comment.
The checks below aren't strict enough; they pass for a .register()ed subclass of Abstract<Reducer|Pickler|Unpickler>, but the code won't actually work with such a class due to depending on undocumented APIs.
There was a problem hiding this comment.
I do not understand what you mean by "Undocumented API". This PR comes with documentation for the AbstractReducer, AbstractPickler, and AbstractUnpickler.
There was a problem hiding this comment.
What I mean is that the documentation suggests that the below code should work:
class DummyReducer:
def get_pickler_class():
return pickle.Pickler
def get_unpickler_class():
return pickle.Unpickler
AbstractReducer.register(DummyReducer)
multiprocessing.set_reducer(DummyReducer())However, in practice, it appears that running that code (and then doing something that uses reduction) will produce a (rather cryptic) error message in some cases. Either the documentation should be clear that virtual subclasses don't work and the code should raise an error when a virtual subclass is passed, or the code should be updated to handle virtual subclasses.
There was a problem hiding this comment.
What I mean is that the documentation suggests that the below code should work
Not really: the get_pickler_class API documentation specifies that get_pickler_class must return anAbstractPickler subclass, but your DummyReducer (which itself, should be an AbstractReducer subclass) simply returns a Pickler, which is not an AbstractPickler subclass. Or maybe I misunderstood your message?
There was a problem hiding this comment.
You didn't misunderstand anything, my example was accidentally bogus, and I misunderstood the code. The actual bug I was trying to point out appears not to exist with reducers, but instead with picklers.
class DummyReducer(AbstractReducer):
def get_pickler_class():
return DummyPickler
class DummyPickler(pickle.Pickler):
pass
AbstractPickler.register(DummyPickler)
multiprocessing.set_reducer(DummyReducer())This seemingly harmless code appears to have the effect of breaking the passing of all of the custom types that multiprocessing registers, in an entirely non-intuitive way, which I think should either be fixed or documented.
|
@pitrou what could I do to ease the review process of this PR? |
|
@pierreglaser The first necessary condition is for me to find some time... sorry :-/ |
|
@pitrou no worries :) |
Lib/multiprocessing/reduction.py
Outdated
There was a problem hiding this comment.
None of the attributes defined between here and __init__ appear to be used.
There was a problem hiding this comment.
Same comment as below.
Lib/multiprocessing/reduction.py
Outdated
There was a problem hiding this comment.
This property does not appear to be used.
There was a problem hiding this comment.
This is work taken from @pablogsal previous commits (see #9959). I assume there was a good reason for this, but he'll probably know better than me.
Lib/multiprocessing/reduction.py
Outdated
There was a problem hiding this comment.
This method (a) doesn't appear to be used and (b) doesn't necessarily work, because nothing requires the get_pickler_class to define a register method.
There was a problem hiding this comment.
Regarding (b), this is stated in the documentation that get_pickler_class should return a subclass of an AbstractPickler, that itself (as a subclass of ForkingPickler), implements a register method. Regarding (a), same comment as above (this @pablogsal's work), but I'll think about whether this function can be removed.
|
@pablogsal friendly ping: I am curious to know what are your thoughts on this. |
| @@ -0,0 +1 @@ | |||
| Fix the implementation of custom reducers for in `multiprocessing`. | |||
There was a problem hiding this comment.
'for in' Either a word is missing or there is an extra word.
There was a problem hiding this comment.
To me, it looks like "for" is the extra word. I'd personally go with:
| Fix the implementation of custom reducers for in `multiprocessing`. | |
| Fix the implementation of custom reducers in :mod:`multiprocessing`. |
(Also adds the sphinx role :mod: to provide an inline link to the module. It's optional for Misc/NEWS.d, but we may as well include it with the typo fix. This makes it easier to move over to the "What's New" doc later.)
There was a problem hiding this comment.
Also note that `multiprocessing` will cause make check to fail: it should either be :mod:`multiprocessing` (as suggested by @aeros) or ``multiprocessing``.
Rebasing is, as evidenced by repeated failures like this, error prone. Besides the obvious noise, removing the spurious review requests does not unsubscribe people. If you were just trying to update your branch to your current repository/fork master (once synchronized), rebasing is also not needed. After checking out the branch, |
MaxwellDupre
left a comment
There was a problem hiding this comment.
Run multiprocessing tests:
Ran 369 tests in 215.720s
OK (skipped=35)
Ran 39 tests in 22.709s
OK
Ran 369 tests in 85.400s
OK (skipped=39)
Ran 369 tests in 99.032s
OK (skipped=32)
Look ok to me.
| Gets the current reduction class in use by the module's primitive internals. | ||
|
|
||
| For example, substituting the reducer class to use the :mod:`pickle` protocol | ||
| version 2 to be able to communicate with a Python 2.x programs.:: |
There was a problem hiding this comment.
Please change documentation and examples not to reference Python 2 as that long EOL at this point. Just say it in the abstract "communicate with processes using older Python versions" and offer some explanation as to how this is even possible by linking to the relevant APIs in multiprocessing that enable people to create such unexpected monsters. (see #67592 (comment))
Most multiprocessing users view it as a module with a single parent process controlling identical by definition Python version child processes that it launches and controls.
| .. method:: set_reducer(reduction) | ||
|
|
||
| Sets a reduction instance to be used for serialization and deserialization | ||
| by the module primitive internals. **reduction** must be an instance of a |
There was a problem hiding this comment.
| by the module primitive internals. **reduction** must be an instance of a | |
| by the module primitive internals. *reduction* must be an instance of a |
Arguments should be marked in italic: https://devguide.python.org/documenting/#rest-inline-markup
|
|
||
| .. method:: dump(obj) | ||
|
|
||
| Write a pickled representation of obj to the open file. |
There was a problem hiding this comment.
| Write a pickled representation of obj to the open file. | |
| Write a pickled representation of *obj* to the open file. |
| Base class providing access to custom ``Pickler`` and ``Unpickler`` | ||
| classes to be used by ``multiprocessing`` when serializing objects. | ||
|
|
||
| .. method:: get_pickler_class(): |
There was a problem hiding this comment.
| .. method:: get_pickler_class(): | |
| .. method:: get_pickler_class() |
| This method must return an subclass of :class:`AbstractPickler` to be used | ||
| by ``multiprocessing`` when serializing objects. | ||
|
|
||
| .. method:: get_unpickler_class(): |
There was a problem hiding this comment.
| .. method:: get_unpickler_class(): | |
| .. method:: get_unpickler_class() |
| @@ -0,0 +1 @@ | |||
| Fix the implementation of custom reducers for in `multiprocessing`. | |||
There was a problem hiding this comment.
Also note that `multiprocessing` will cause make check to fail: it should either be :mod:`multiprocessing` (as suggested by @aeros) or ``multiprocessing``.
| .. currentmodule:: multiprocessing | ||
|
|
||
| Several primitives of the :mod:`multiprocessing` module such as | ||
| :class:`multiprocessing.Queue`, :class:`multiprocessing.connection.Listener` or | ||
| :class:`multiprocessing.connection.Server` need to serialize and deserialize Python |
There was a problem hiding this comment.
Aren't either the currentmodule directive or the multiprocessing.*s in the roles redundant in this part?
This PR is a follow up on #9959. It implements the suggestions I made here #9959 (comment).
I'm making it a new PR to get the CI working now and gather separate feedback.
contexttoProcessobjectsAbstractPicklerintoAbstractPicklerandAbstractUnpicklerhttps://bugs.python.org/issue28053