Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/07/26 14:58:00 UTC
[jira] [Updated] (ARROW-6044) [Python] Pyarrow HDFS client gets hung after a while
[ https://issues.apache.org/jira/browse/ARROW-6044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney updated ARROW-6044:
--------------------------------
Summary: [Python] Pyarrow HDFS client gets hung after a while (was: Pyarrow HDFS client gets hung after a while)
> [Python] Pyarrow HDFS client gets hung after a while
> ----------------------------------------------------
>
> Key: ARROW-6044
> URL: https://issues.apache.org/jira/browse/ARROW-6044
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.13.0
> Environment: hadoop-3.0.3
> driver='libhdfs'
> python 3.6
> CentOS 7
> Reporter: Fred Tzeng
> Priority: Major
>
> I'm using the pyarrow HDFS client in a long-running (forever) app that opens a connection to HDFS as each external request comes in and closes the connection as soon as the request is handled. This happens many times on separate threads, and everything works great.
> The problem is that after the app idles for a while (perhaps hours) with no HDFS connections made during that time, the next call to hdfs.connect(...) just hangs. No exceptions are thrown.
> Code snippet showing how I instantiate each connection:
> ...
> hdfs = pyarrow.hdfs.connect(self.hdfs_authority, self.hdfs_port, user=self.hdfs_user)
> try:
>     pass  # Do something with the connection
> finally:
>     hdfs.close()
>
> Any help on what might be causing these hangs is appreciated.
>
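The connect-per-request pattern described in the report can be wrapped so that a connect call that never returns surfaces as an exception instead of blocking the request thread forever. The following is only a minimal diagnostic sketch, not a fix for the underlying hang. The helper names connect_with_timeout and hdfs_connection and the timeout value are hypothetical, and the approach assumes the blocking call releases the GIL while it waits.

```python
import threading
from contextlib import contextmanager


def connect_with_timeout(connect_fn, timeout_s, *args, **kwargs):
    """Call connect_fn(*args, **kwargs) in a daemon thread.

    Raises TimeoutError if the call has not returned after timeout_s
    seconds. The stuck thread is NOT cancelled; it is simply abandoned,
    so this is a way to detect a hang, not to recover from it.
    """
    result = {}

    def worker():
        try:
            result["value"] = connect_fn(*args, **kwargs)
        except Exception as exc:  # hand the error back to the caller
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        raise TimeoutError(
            "connect did not return within %s seconds" % timeout_s)
    if "error" in result:
        raise result["error"]
    return result["value"]


@contextmanager
def hdfs_connection(connect_fn, timeout_s, *args, **kwargs):
    """Open a connection with a timeout and always close it afterwards."""
    client = connect_with_timeout(connect_fn, timeout_s, *args, **kwargs)
    try:
        yield client
    finally:
        client.close()
```

In the reporter's setup this would presumably be used as `with hdfs_connection(pyarrow.hdfs.connect, 30, host, port, user=user) as hdfs:`, where the 30-second limit is an arbitrary illustrative value. Logging the TimeoutError together with a thread dump at that moment may help narrow down where the connect call is stuck.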
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)