Posted to commits@arrow.apache.org by em...@apache.org on 2019/04/25 20:07:52 UTC

[arrow] branch master updated: ARROW-5049: [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark

This is an automated email from the ASF dual-hosted git repository.

emkornfield pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/master by this push:
     new 3f58a14  ARROW-5049: [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
3f58a14 is described below

commit 3f58a14714ccae93ae055f9ba7e6d59b8e3746a1
Author: tiger <ch...@gmail.com>
AuthorDate: Thu Apr 25 13:06:14 2019 -0700

    ARROW-5049: [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
    
    Ensure hadoop-common-{version} jar is in the classpath
    
    Author: tiger <ch...@gmail.com>
    
    Closes #4081 from chenfj068/master and squashes the following commits:
    
    428827bf <tiger> ARROW-5049:  org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
---
 python/pyarrow/hdfs.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/python/pyarrow/hdfs.py b/python/pyarrow/hdfs.py
index 9e12675..3ddd3cd 100644
--- a/python/pyarrow/hdfs.py
+++ b/python/pyarrow/hdfs.py
@@ -123,7 +123,9 @@ class HadoopFileSystem(lib.HadoopFileSystem, FileSystem):
 
 
 def _maybe_set_hadoop_classpath():
-    if 'hadoop' in os.environ.get('CLASSPATH', ''):
+    import re
+
+    if re.search(r'hadoop-common[^/]+.jar', os.environ.get('CLASSPATH', '')):
         return
 
     if 'HADOOP_HOME' in os.environ:
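
To illustrate why the check changed, here is a minimal sketch comparing the old substring test with the new regular expression. The helper names (old_check, new_check) and both classpath strings are hypothetical and are not part of pyarrow; only the pattern is copied from the diff above. The sketch assumes a Spark-style CLASSPATH that mentions "hadoop" without containing the hadoop-common jar.

```python
import re

# Pattern copied verbatim from the diff. The '.' before 'jar' is unescaped,
# so it matches any single character, not only a literal dot.
HADOOP_COMMON_JAR = re.compile(r'hadoop-common[^/]+.jar')


def old_check(classpath):
    """Old behaviour: any occurrence of 'hadoop' counts as configured."""
    return 'hadoop' in classpath


def new_check(classpath):
    """New behaviour: require a hadoop-common-{version} jar entry."""
    return HADOOP_COMMON_JAR.search(classpath) is not None


# Hypothetical classpath: mentions hadoop, but has no hadoop-common jar.
spark_classpath = (
    '/opt/spark/jars/spark-core_2.11-2.4.0.jar:'
    '/opt/spark/jars/hadoop-annotations-2.7.3.jar'
)

# Hypothetical classpath that does include the hadoop-common jar.
full_classpath = (
    '/opt/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar:'
    '/opt/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.3.jar'
)

for name, cp in [('spark', spark_classpath), ('full', full_classpath)]:
    print(name, 'old:', old_check(cp), 'new:', new_check(cp))
```

With the first classpath the old test returns early and the Hadoop classpath is never set up, which is consistent with the "class not found" error in the issue title; the new test falls through to the HADOOP_HOME lookup instead.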