Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2022/10/12 15:37:00 UTC
[jira] [Commented] (ARROW-10676) [Python] pickle error occurs when using pyarrow._plasma.PlasmaClient with multiprocess on mac (python3.8.5)
[ https://issues.apache.org/jira/browse/ARROW-10676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616526#comment-17616526 ]
Joris Van den Bossche commented on ARROW-10676:
-----------------------------------------------
Not an answer to your actual issue, but note that Plasma is deprecated (see ARROW-17860 and the email thread linked there).
> [Python] pickle error occurs when using pyarrow._plasma.PlasmaClient with multiprocess on mac (python3.8.5)
> -----------------------------------------------------------------------------------------------------------
>
> Key: ARROW-10676
> URL: https://issues.apache.org/jira/browse/ARROW-10676
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++ - Plasma, Python
> Affects Versions: 0.17.1, 2.0.0
> Environment: OS: mac os catalina 10.15.6;
> Python version: python3.8.5;
> Reporter: BinbinLiang
> Priority: Major
>
> * *The environment is:*
> ** OS: mac os catalina 10.15.6;
> ** {color:#ff0000}Python version: python3.8.5;{color}
> * {color:#ff0000}*It works with python3.7 (3.7.2 or 3.7.3) and python3.6 (3.6.8).*{color}
> The error occurs only with python3.8 (3.8.5).
> * *The error occurs when I use 'pyarrow._plasma.PlasmaClient' in a 'multiprocessing.context.Process'.*
> ** First, I have a subclass of 'multiprocessing.context.Process' named Executor, defined as 'class Executor(Process):'.
> ** Then, I create a 'PlasmaClient' object in the '__init__' function of 'class Executor'.
> ** Finally, I create an 'Executor' object and call its 'start()' function.
> * *The code is:*
> {code:python}
> from multiprocessing.context import Process
> from time import sleep
>
> from pyarrow import plasma
>
>
> class Executor(Process):
>     def __init__(self):
>         super().__init__()
>         self._plasma_client = plasma.connect("/tmp/plasma")
>
>
> if __name__ == "__main__":
>     executor = Executor()
>     executor.start()
>     sleep(10)
> {code}
> * *The full traceback is as follows:*
> {code}
> Traceback (most recent call last):
>   File "/Users/liangbinbin/PycharmProjects/waimai_data_cubeinsight/python/taf/tests/plasma_client_with_multiprocessing_test.py", line 15, in <module>
>     executor.start()
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/process.py", line 121, in start
>     self._popen = self._Popen(self)
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
>     return _default_context.get_context().Process._Popen(process_obj)
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
>     return Popen(process_obj)
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
>     super().__init__(process_obj)
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
>     self._launch(process_obj)
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
>     reduction.dump(process_obj, fp)
>   File "/Users/liangbinbin/Applications/anaconda3/envs/python3_8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
>     ForkingPickler(file, protocol).dump(obj)
>   File "stringsource", line 2, in pyarrow._plasma.PlasmaClient.__reduce_cython__
> TypeError: no default __reduce__ due to non-trivial __cinit__
> {code}
> * The error does not occur when I change 'multiprocessing.context.Process' to '{color:#de350b}multiprocess.context.Process{color}', which uses *dill* to serialize.
>
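For context on the version difference reported above: Python 3.8 changed the default multiprocessing start method on macOS from fork to spawn, and the traceback passes through popen_spawn_posix.py, where the Process object itself is pickled before the child starts. Any unpicklable attribute set in __init__ therefore breaks start(). The following is only a minimal sketch of that mechanism and of one possible workaround (creating the handle lazily, after the object has been transferred). It does not use Plasma at all: threading.Lock is an assumed stand-in for the unpicklable PlasmaClient, and the names EagerWorker, LazyWorker and can_pickle are hypothetical. Plain classes are used instead of Process subclasses so the pickling step can be exercised directly.

```python
import pickle
import threading


class EagerWorker:
    """Mirrors the report: the unpicklable handle is created in __init__."""

    def __init__(self):
        # Stand-in for plasma.connect(...); a lock cannot be pickled either.
        self._client = threading.Lock()


class LazyWorker:
    """Stores only picklable configuration; the handle is created in run()."""

    def __init__(self, socket_path):
        self._socket_path = socket_path  # a plain string pickles fine
        self._client = None

    def run(self):
        # In a real Process subclass this would execute in the child,
        # after the object has already been pickled and transferred.
        self._client = threading.Lock()
        return self._client is not None


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False on TypeError."""
    try:
        pickle.dumps(obj)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    print(can_pickle(EagerWorker()))
    print(can_pickle(LazyWorker("/tmp/plasma")))
```

Applied to the reported Executor class, the same idea would presumably mean keeping only the socket path in __init__ and calling plasma.connect inside run(); that variant is untested here, and Plasma is deprecated in any case.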