Posted to jira@arrow.apache.org by "Ira Saktor (Jira)" <ji...@apache.org> on 2021/03/09 15:55:00 UTC

[jira] [Created] (ARROW-11915) Pyarrow: non-legacy filesystem connection doesn't work while legacy does.

Ira Saktor created ARROW-11915:
----------------------------------

             Summary: Pyarrow: non-legacy filesystem connection doesn't work while legacy does.
                 Key: ARROW-11915
                 URL: https://issues.apache.org/jira/browse/ARROW-11915
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 3.0.0
            Reporter: Ira Saktor


I have an issue with pyarrow.fs.HadoopFileSystem() that I did not encounter with the legacy pa.hdfs.connect().

 

When I do:

 

`filesystem = pa.fs.HadoopFileSystem('my_host', port = port, kerb_ticket = path_to_my_ticket)`

It gives me:



OSError: HDFS connection failed

But if I use the legacy version with the same parameters:

 

`filesystem = pa.hdfs.connect('my_host', port = port, kerb_ticket = path_to_my_ticket)`

Then the connection establishes successfully.

Moreover, if I first connect through the legacy API and then create the non-legacy filesystem in the same session:

import pyarrow as pa
from pyarrow import fs
filesystem = pa.hdfs.connect(host='my_host', port=0, kerb_ticket=path_to_my_ticket)
filesystem = fs.HadoopFileSystem('my_host',port = 0,kerb_ticket = path_to_my_ticket)

I am able to get the non-legacy filesystem connection to work.
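
To make this ordering dependence easier to reproduce, something along the following lines could be run in a fresh Python process. It is only a sketch: `try_connect` and `run_repro` are hypothetical helper names, not part of pyarrow, and the host, port, and ticket path are placeholders. It uses the same port for all three attempts, so that the difference between `port = port` and `port = 0` in the snippets above is ruled out as a factor.

```python
def try_connect(label, factory):
    """Call factory() and report whether it raised, without aborting the run."""
    try:
        factory()
    except Exception as exc:  # the report above shows an OSError here
        return label, False, f"{type(exc).__name__}: {exc}"
    return label, True, ""


def run_repro(host, port, kerb_ticket):
    """Try the non-legacy API, then the legacy API, then the non-legacy API again."""
    # Imported lazily so try_connect can be used without pyarrow installed.
    import pyarrow as pa
    from pyarrow import fs

    def new_api():
        return fs.HadoopFileSystem(host, port=port, kerb_ticket=kerb_ticket)

    def legacy_api():
        # Legacy entry point, as used in this report against pyarrow 3.0.0.
        return pa.hdfs.connect(host, port=port, kerb_ticket=kerb_ticket)

    attempts = [
        ("non-legacy, before legacy", new_api),
        ("legacy", legacy_api),
        ("non-legacy, after legacy", new_api),
    ]
    return [try_connect(label, factory) for label, factory in attempts]
```

Calling `run_repro('my_host', port, path_to_my_ticket)` returns one `(label, succeeded, error)` tuple per attempt. If only the first attempt fails, that would suggest the legacy connect sets up some state that the non-legacy one relies on.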

Is this a known issue? Is there something obvious I might be doing wrong?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)