Posted to jira@arrow.apache.org by "Apache Arrow JIRA Bot (Jira)" <ji...@apache.org> on 2022/12/20 17:54:00 UTC
[jira] [Assigned] (ARROW-17319) [Python] pyarrow seems to set default CPU affinity to 0 on shutdown, crashes if CPU 0 is not available
[ https://issues.apache.org/jira/browse/ARROW-17319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Arrow JIRA Bot reassigned ARROW-17319:
---------------------------------------------
Assignee: (was: Mike Gevaert)
> [Python] pyarrow seems to set default CPU affinity to 0 on shutdown, crashes if CPU 0 is not available
> ------------------------------------------------------------------------------------------------------
>
> Key: ARROW-17319
> URL: https://issues.apache.org/jira/browse/ARROW-17319
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 9.0.0
> Environment: Ubuntu 20.02 / Python 3.8.10 (default, Jun 22 2022, 20:18:18)
> $ pip list
> Package         Version
> --------------- -------
> numpy           1.23.1
> pandas          1.4.3
> pip             20.0.2
> pkg-resources   0.0.0
> pyarrow         9.0.0
> python-dateutil 2.8.2
> pytz            2022.1
> setuptools      44.0.0
> six             1.16.0
> Reporter: Mike Gevaert
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> I get the following traceback when exiting Python after loading {{pyarrow.parquet}}:
> {code}
> Python 3.8.10 (default, Jun 22 2022, 20:18:18)
> [GCC 9.4.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> os.getpid()
> 25106
> >>> import pyarrow.parquet
> >>>
> Fatal error condition occurred in /opt/vcpkg/buildtrees/aws-c-io/src/9e6648842a-364b708815.clean/source/event_loop.c:72: aws_thread_launch(&cleanup_thread, s_event_loop_destroy_async_thread_fn, el_group, &thread_options) == AWS_OP_SUCCESS
> Exiting Application
> ################################################################################
> Stack trace:
> ################################################################################
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200af06) [0x7f831b2b3f06]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x20028e5) [0x7f831b2ab8e5]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f27e09) [0x7f831b1d0e09]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d) [0x7f831b2b4a3d]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1f25948) [0x7f831b1ce948]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x200ba3d) [0x7f831b2b4a3d]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x1ee0b46) [0x7f831b189b46]
> /tmp/venv/lib/python3.8/site-packages/pyarrow/libarrow.so.900(+0x194546a) [0x7f831abee46a]
> /lib/x86_64-linux-gnu/libc.so.6(+0x468a7) [0x7f831c6188a7]
> /lib/x86_64-linux-gnu/libc.so.6(on_exit+0) [0x7f831c618a60]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfa) [0x7f831c5f608a]
> {code}
> To replicate this, one needs to make sure that CPU 0 isn't available to schedule tasks on. In our HPC environment, that happens because Slurm uses cgroups to constrain CPU usage.
> On a Linux workstation, one should be able to reproduce it as follows:
> 1) open Python as a normal user
> 2) get the pid
> 3) as root:
> {code}
> cd /sys/fs/cgroup/cpuset/
> mkdir pyarrow
> cd pyarrow
> echo 0 > cpuset.mems
> echo 1 > cpuset.cpus # sets the cgroup to only have access to cpu 1
> echo $PID > tasks
> {code}
> Then, in the Python environment:
> {code}
> import pyarrow.parquet
> exit()
> {code}
> This should trigger the crash.
> Sadly, I couldn't track down which versions of {{aws-c-common}} and {{aws-c-io}} are used in the 9.0.0 py38 manylinux wheels. (libarrow.so.900 has BuildID[sha1]=dd6c5a2efd5cacf09657780a58c40f7c930e4df1)
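The reproduction steps in the description depend on CPU 0 really being excluded for the Python process. A minimal sketch for confirming that from the same session before importing pyarrow is given here. It assumes Linux, where the standard library exposes os.sched_getaffinity, and the helper names excludes_cpu0 and current_affinity are hypothetical, not part of pyarrow or the original report.

```python
import os


def excludes_cpu0(allowed_cpus):
    """Return True if CPU 0 is absent from the given collection of CPU ids."""
    return 0 not in set(allowed_cpus)


def current_affinity(pid=0):
    """Return the set of CPUs the process may run on, or None if unsupported.

    pid=0 means the calling process. os.sched_getaffinity only exists on
    some platforms (notably Linux), so its presence is checked first.
    """
    if not hasattr(os, "sched_getaffinity"):
        return None
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = current_affinity()
    if allowed is None:
        print("os.sched_getaffinity is not available on this platform")
    else:
        print("allowed CPUs:", sorted(allowed))
        print("CPU 0 excluded:", excludes_cpu0(allowed))
```

If this reports that CPU 0 is still allowed after the process was moved into the cgroup, the cpuset restriction did not take effect and the crash on exit would not be expected to reproduce.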
--
This message was sent by Atlassian Jira
(v8.20.10#820010)