Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2018/12/09 14:55:00 UTC
[jira] [Updated] (ARROW-3957) [Python] pyarrow.hdfs.connect fails silently
[ https://issues.apache.org/jira/browse/ARROW-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated ARROW-3957:
--------------------------------
Summary: [Python] pyarrow.hdfs.connect fails silently (was: pyarrow.hdfs.connect fails silently)
> [Python] pyarrow.hdfs.connect fails silently
> --------------------------------------------
>
> Key: ARROW-3957
> URL: https://issues.apache.org/jira/browse/ARROW-3957
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.11.1
> Environment: centos 7
> Reporter: Jim Fulton
> Priority: Major
> Labels: hdfs
>
> I'm trying to connect to HDFS using libhdfs and Kerberos.
> I have JAVA_HOME and HADOOP_HOME set and {{pyarrow.hdfs.connect}} sets CLASSPATH correctly.
> My connect call looks like:
> {{import pyarrow.hdfs}}
> {{c = pyarrow.hdfs.connect(host='MYHOST', port=42424, user='ME', kerb_ticket="/tmp/krb5cc_498970")}}
> This doesn't raise an error, but the resulting connection can't do anything. Operations on it either fail like this:
> {{ArrowIOError: HDFS list directory failed, errno: 255 (Unknown error 255) }}
> Or swallow errors (e.g. {{exists}} returning {{False}}).
> Note that {{connect}} raises an error if the host is wrong, but not if the port, user, or kerb_ticket is wrong. I have no idea how to debug this, because there are no useful errors.
> Note that I _can_ connect using the hdfs Python package. (Of course, that doesn't provide the API I need to read Parquet files.)
> Any help would be appreciated greatly.
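
Until connect itself reports these failures, one possible workaround is to check the inputs before connecting and to force a round trip right after. The sketch below is only an illustration under assumptions: preflight_problems and probe are hypothetical helper names, not part of pyarrow, and the only method assumed on the connection object is ls, the directory listing call that produces the error quoted above. It cannot explain why libhdfs reports errno 255; it only makes the failure show up at connect time instead of later.

```python
import os


def preflight_problems(kerb_ticket, env=None):
    """Return a list of problems that can be detected before connecting.

    Hypothetical helper: it only checks the inputs mentioned in the report
    (JAVA_HOME, HADOOP_HOME, and the Kerberos ticket cache path).
    """
    env = os.environ if env is None else env
    problems = []
    for var in ("JAVA_HOME", "HADOOP_HOME"):
        if not env.get(var):
            problems.append(f"{var} is not set")
    if not os.path.isfile(kerb_ticket):
        problems.append(f"Kerberos ticket cache not found: {kerb_ticket}")
    return problems


def probe(fs, path="/"):
    """List `path` so that an unusable connection fails here, with context.

    `fs` is assumed to be whatever pyarrow.hdfs.connect returned; any object
    with an ls(path) method works.
    """
    try:
        return fs.ls(path)
    except Exception as exc:  # ArrowIOError in the report above
        raise RuntimeError(
            f"HDFS connection is not usable (ls({path!r}) failed): {exc}"
        ) from exc


# Intended usage, assuming a reachable cluster (values taken from the report):
#   import pyarrow.hdfs
#   ticket = "/tmp/krb5cc_498970"
#   for problem in preflight_problems(ticket):
#       print("warning:", problem)
#   fs = pyarrow.hdfs.connect(host='MYHOST', port=42424, user='ME',
#                             kerb_ticket=ticket)
#   probe(fs)  # raises RuntimeError if the connection cannot list "/"
```

The probe uses ls rather than exists because, per the description above, exists returns False instead of raising when the connection is broken.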