Posted to jira@arrow.apache.org by "Sukesh Pabolu (Jira)" <ji...@apache.org> on 2021/04/15 12:43:00 UTC
[jira] [Created] (ARROW-12399) Unable to load libhdfs
Sukesh Pabolu created ARROW-12399:
-------------------------------------
Summary: Unable to load libhdfs
Key: ARROW-12399
URL: https://issues.apache.org/jira/browse/ARROW-12399
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 3.0.0
Reporter: Sukesh Pabolu
Fix For: 3.0.0
I am using pyarrow 3.0.0 with Python 3.7 and pyspark 3.1.1, and I am facing the following error: I am not able to save a dataframe to HDFS. When I used pyspark 3.0.0, I was able to save the dataframe to HDFS.
Please help. Code to reproduce:

import pyarrow as pa
fs = pa.hdfs.connect(host='localhost', port=9001)
__main__:1: DeprecationWarning: pyarrow.hdfs.connect is deprecated as of 2.0.0, please use pyarrow.fs.HadoopFileSystem instead.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\1570513\Anaconda3\envs\on-premise-latest\lib\site-packages\pyarrow\hdfs.py", line 219, in connect
extra_conf=extra_conf
File "C:\Users\1570513\Anaconda3\envs\on-premise-latest\lib\site-packages\pyarrow\hdfs.py", line 229, in _connect
extra_conf=extra_conf)
File "C:\Users\1570513\Anaconda3\envs\on-premise-latest\lib\site-packages\pyarrow\hdfs.py", line 45, in __init__
self._connect(host, port, user, kerb_ticket, extra_conf)
File "pyarrow\io-hdfs.pxi", line 75, in pyarrow.lib.HadoopFileSystem._connect
File "pyarrow\error.pxi", line 99, in pyarrow.lib.check_status
OSError: Unable to load libhdfs: The specified module could not be found.
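This OSError means the libhdfs native library (hdfs.dll on Windows) could not be located at connect time; pyarrow consults environment variables such as ARROW_LIBHDFS_DIR and HADOOP_HOME to find it. A minimal debugging sketch, assuming those standard variables (the helper below is hypothetical, not part of the pyarrow API), to check which locations would be searched before attempting to connect:

```python
import os

def candidate_libhdfs_paths(environ=os.environ, libname="hdfs.dll"):
    """Return likely locations of the libhdfs shared library, mirroring
    the environment variables pyarrow consults on Windows. Hypothetical
    debugging helper; not part of the pyarrow API."""
    paths = []
    if "ARROW_LIBHDFS_DIR" in environ:
        # Explicit override: the directory containing the library itself
        paths.append(os.path.join(environ["ARROW_LIBHDFS_DIR"], libname))
    if "HADOOP_HOME" in environ:
        # Common locations inside a Hadoop installation
        paths.append(os.path.join(environ["HADOOP_HOME"], "lib", "native", libname))
        paths.append(os.path.join(environ["HADOOP_HOME"], "bin", libname))
    return paths

# Print which candidates exist on this machine
for p in candidate_libhdfs_paths():
    print(p, "->", "found" if os.path.exists(p) else "missing")
```

If none of the candidates exist, setting ARROW_LIBHDFS_DIR to the directory holding hdfs.dll (and ensuring JAVA_HOME and the CLASSPATH are set) typically resolves this error. Note also that the deprecation warning above recommends migrating to pyarrow.fs.HadoopFileSystem, but that API loads libhdfs the same way, so the environment must be fixed either way.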
--
This message was sent by Atlassian Jira
(v8.3.4#803005)