Posted to dev@arrow.apache.org by "James Porritt (JIRA)" <ji...@apache.org> on 2017/09/01 10:08:01 UTC
[jira] [Created] (ARROW-1445) Python: Segfault when using libhdfs3 in pyarrow using latest API
James Porritt created ARROW-1445:
------------------------------------
Summary: Python: Segfault when using libhdfs3 in pyarrow using latest API
Key: ARROW-1445
URL: https://issues.apache.org/jira/browse/ARROW-1445
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.6.0
Reporter: James Porritt
I'm encountering a segfault when using libhdfs3 with pyarrow.
My script is:
{code}
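import pyarrow

def main():
    # 1. libhdfs driver via pyarrow.hdfs.connect: lists the path correctly.
    hdfs = pyarrow.hdfs.connect("<host>", <port>, "<username>", driver='libhdfs')
    print(hdfs.ls('<my path>'))

    # 2. libhdfs3 driver via pyarrow.HdfsClient: lists the path correctly.
    hdfs3a = pyarrow.HdfsClient("<host>", <port>, "<username>", driver='libhdfs3')
    print(hdfs3a.ls('<my path>'))

    # 3. libhdfs3 driver via pyarrow.hdfs.connect: this is the one that crashes.
    hdfs3b = pyarrow.hdfs.connect("<host>", <port>, "<username>", driver='libhdfs3')
    print(hdfs3b.ls('<my path>'))

main()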
{code}
The first two hdfs connections yield the correct list. The third yields:
{noformat}
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f69c0c8b57f, pid=88070, tid=140092200666880
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libc.so.6+0x13357f] __strlen_sse42+0xf
{noformat}
It dumps an error report file too.
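One way to narrow down which of the three connections crashes might be to run each one in its own interpreter, so that a fatal signal in one does not take the others down with it. The following is only a minimal sketch under that assumption; `run_isolated` and `describe` are hypothetical helper names and are not part of pyarrow.

```python
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter and return its exit status.

    On POSIX systems a negative return code -N means the child was
    terminated by signal N.
    """
    return subprocess.call([sys.executable, "-c", snippet])


def describe(returncode):
    """Turn a child exit status into a short human-readable description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

Each of the three connect-and-list steps from the script could be passed to `run_isolated` as a separate snippet, with `describe` applied to the result, to see whether the crash depends on the earlier connections having been made in the same process.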
I created my conda environment with:
{noformat}
conda create -n parquet
source activate parquet
conda install pyarrow libhdfs3 -c conda-forge
{noformat}
The packages used are (name, version, build, channel):
{noformat}
arrow-cpp               0.6.0        np113py27_1  conda-forge
boost-cpp               1.64.0       1            conda-forge
bzip2                   1.0.6        1            conda-forge
ca-certificates         2017.7.27.1  0            conda-forge
certifi                 2017.7.27.1  py27_0       conda-forge
curl                    7.54.1       0            conda-forge
icu                     58.1         1            conda-forge
krb5                    1.14.2       0            conda-forge
libgcrypt               1.8.0        0            conda-forge
libgpg-error            1.27         0            conda-forge
libgsasl                1.8.0        1            conda-forge
libhdfs3                2.3          0            conda-forge
libiconv                1.14         4            conda-forge
libntlm                 1.4          0            conda-forge
libssh2                 1.8.0        1            conda-forge
libuuid                 1.0.3        1            conda-forge
libxml2                 2.9.4        4            conda-forge
mkl                     2017.0.3     0
ncurses                 5.9          10           conda-forge
numpy                   1.13.1       py27_0
openssl                 1.0.2l       0            conda-forge
pandas                  0.20.3       py27_1       conda-forge
parquet-cpp             1.3.0.pre    1            conda-forge
pip                     9.0.1        py27_0       conda-forge
protobuf                3.3.2        py27_0       conda-forge
pyarrow                 0.6.0        np113py27_1  conda-forge
python                  2.7.13       1            conda-forge
python-dateutil         2.6.1        py27_0       conda-forge
pytz                    2017.2       py27_0       conda-forge
readline                6.2          0            conda-forge
setuptools              36.2.2       py27_0       conda-forge
six                     1.10.0       py27_1       conda-forge
sqlite                  3.13.0       1            conda-forge
tk                      8.5.19       2            conda-forge
wheel                   0.29.0       py27_0       conda-forge
xz                      5.2.3        0            conda-forge
zlib                    1.2.11       0            conda-forge
{noformat}
I've set my ARROW_LIBHDFS_DIR to point at the location of the libhdfs3.so file.
I've populated my CLASSPATH as per the documentation.
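For anyone trying to reproduce this, the environment setup described above can be sanity-checked before connecting. This is only a rough sketch: `check_hdfs_env` is a hypothetical helper, not a pyarrow function, and it assumes the two variables named above (ARROW_LIBHDFS_DIR and CLASSPATH) are the relevant ones and that the library file is called libhdfs3.so.

```python
import os


def check_hdfs_env(environ, lib_name="libhdfs3.so"):
    """Return a list of problems found in the given environment mapping.

    Assumption: ARROW_LIBHDFS_DIR should name a directory containing
    `lib_name`, and CLASSPATH should be non-empty.
    """
    problems = []
    lib_dir = environ.get("ARROW_LIBHDFS_DIR")
    if not lib_dir:
        problems.append("ARROW_LIBHDFS_DIR is not set")
    elif not os.path.isfile(os.path.join(lib_dir, lib_name)):
        problems.append("%s not found in %s" % (lib_name, lib_dir))
    if not environ.get("CLASSPATH"):
        problems.append("CLASSPATH is empty or not set")
    return problems
```

Calling `check_hdfs_env(os.environ)` in the same shell that runs the script returns an empty list when both settings look plausible. It does not verify that the CLASSPATH entries themselves are correct.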
Please advise.