Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/07/26 14:58:00 UTC

[jira] [Updated] (ARROW-6044) [Python] Pyarrow HDFS client gets hung after a while

     [ https://issues.apache.org/jira/browse/ARROW-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney updated ARROW-6044:
--------------------------------
    Summary: [Python] Pyarrow HDFS client gets hung after a while  (was: Pyarrow HDFS client gets hung after a while)

> [Python] Pyarrow HDFS client gets hung after a while
> ----------------------------------------------------
>
>                 Key: ARROW-6044
>                 URL: https://issues.apache.org/jira/browse/ARROW-6044
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.13.0
>         Environment: hadoop-3.0.3
> driver='libhdfs'
> python 3.6
> Centos7
>            Reporter: Fred Tzeng
>            Priority: Major
>
> I'm using the pyarrow HDFS client in a long-running app (it runs indefinitely) that opens a connection to HDFS whenever an external request comes in and destroys the connection as soon as the request is handled. This happens many times on separate threads, and everything works well.
> The problem is that after the app has idled for a while (perhaps hours) with no HDFS connections made, the next call to hdfs.connect(...) just hangs. No exceptions are thrown.
> Code snippet showing how I instantiate each connection:
> ...
> hdfs = pyarrow.hdfs.connect(self.hdfs_authority, self.hdfs_port, user=self.hdfs_user)
> try:
>     pass  # Do something
> finally:
>     hdfs.close()  # call the method; a bare `hdfs.close` does nothing
>  
> Any help on what might be causing these hangs is appreciated.
>  
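The connect-per-request pattern described above can be sketched so that a hung connect surfaces as an error instead of blocking the request thread forever. This is only a minimal illustration, not a fix for the underlying hang: `ConnectTimeout`, `connect_with_timeout`, and `hdfs_session` are hypothetical helper names, the 30-second default is an arbitrary assumption, and the connect call is passed in as a callable so the sketch does not depend on a live cluster.

```python
import threading
from contextlib import contextmanager


class ConnectTimeout(Exception):
    """Raised when the connect callable does not return in time (hypothetical helper)."""


def connect_with_timeout(connect_fn, timeout_s):
    """Run connect_fn() in a daemon thread and give up after timeout_s seconds."""
    result = {}

    def worker():
        try:
            result["value"] = connect_fn()
        except BaseException as exc:  # hand any failure back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The worker is still blocked inside connect_fn; it is abandoned, not killed.
        raise ConnectTimeout(f"connect did not return within {timeout_s} s")
    if "error" in result:
        raise result["error"]
    return result["value"]


@contextmanager
def hdfs_session(connect_fn, timeout_s=30.0):
    """Open a client via connect_fn and always call close() on it afterwards."""
    client = connect_with_timeout(connect_fn, timeout_s)
    try:
        yield client
    finally:
        client.close()
```

In the reporter's setting, `connect_fn` would presumably be something like `lambda: pyarrow.hdfs.connect(authority, port, user=user)`. A timed-out attempt leaves a blocked daemon thread behind, so this only makes the hang visible (for logging or a retry); it does not release whatever resource the connect call is waiting on.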



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)