Posted to jira@arrow.apache.org by "Ira Saktor (Jira)" <ji...@apache.org> on 2021/03/09 15:55:00 UTC
[jira] [Created] (ARROW-11915) Pyarrow: non-legacy filesystem connection doesn't work while legacy does.
Ira Saktor created ARROW-11915:
----------------------------------
Summary: Pyarrow: non-legacy filesystem connection doesn't work while legacy does.
Key: ARROW-11915
URL: https://issues.apache.org/jira/browse/ARROW-11915
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 3.0.0
Reporter: Ira Saktor
I have an issue with pyarrow.fs.HadoopFileSystem() that I did not encounter with the legacy pa.hdfs.connect().
When I run:
`filesystem = pa.fs.HadoopFileSystem('my_host', port = port, kerb_ticket = path_to_my_ticket)`
It gives me:
OSError: HDFS connection failed
But if I use the legacy version with the same parameters:
`filesystem = pa.hdfs.connect('my_host', port = port, kerb_ticket = path_to_my_ticket)`
Then the connection establishes successfully.
Moreover, if I first open the legacy connection and then the non-legacy one in the same session:
import pyarrow as pa
from pyarrow import fs
filesystem = pa.hdfs.connect(host='my_host', port=0, kerb_ticket=path_to_my_ticket)
filesystem = fs.HadoopFileSystem('my_host', port=0, kerb_ticket=path_to_my_ticket)
then the non-legacy filesystem connection also succeeds.
Is this a known issue? Is there something obvious I might be doing wrong?
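Since the non-legacy connection only succeeds after the legacy call has run, one possible explanation is that the legacy call changes some process state that the non-legacy class then relies on. This is an assumption, not something confirmed here. A minimal sketch for checking it is to snapshot a few environment variables before and after the legacy call and compare them. The helper names `snapshot_hdfs_env` and `changed_vars` are hypothetical, and the variable list is only a guess at what might be relevant for locating libhdfs and the Hadoop jars.

```python
import os

# Assumed list of environment variables that may matter for locating
# libhdfs and the Hadoop jars; extend it as needed.
HDFS_ENV_VARS = ("HADOOP_HOME", "JAVA_HOME", "ARROW_LIBHDFS_DIR", "CLASSPATH")


def snapshot_hdfs_env(environ=None):
    """Return the current value (or None) of each variable in HDFS_ENV_VARS."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in HDFS_ENV_VARS}


def changed_vars(before, after):
    """Return the sorted names whose values differ between two snapshots."""
    return sorted(name for name in before if before[name] != after.get(name))
```

To use it, call `snapshot_hdfs_env()` once before `pa.hdfs.connect(...)` and once after, then pass both results to `changed_vars`. An empty list would suggest the difference lies somewhere other than these variables.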