Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/08/11 18:37:52 UTC

[GitHub] [arrow] itamarst commented on pull request #10917: ARROW-9226: [Python] Support core-site.xml default filesystem.

itamarst commented on pull request #10917:
URL: https://github.com/apache/arrow/pull/10917#issuecomment-897059588


   I have tested this with a locally configured setup, and @jmwinton will be testing it as well with a more sophisticated setup. Basic setup:
   
   Start a server or two: `mapred minicluster -Dnamenodes=2 -format -nnport 9030`
   
   Edit `etc/hadoop/core-site.xml` in $HADOOP_HOME so it points at these servers:
   
   ```xml
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9030</value>
       <description>Where HDFS NameNode can be found on the network</description>
   </property>
   ```
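
As a quick sanity check before connecting, the configured value can be read back from the file. The following is only a minimal sketch using Python's standard library: the helper name `default_fs_from_xml` and the inline `SAMPLE` document are hypothetical, and it assumes the usual layout of `<property>` elements under a `<configuration>` root. It is not how libhdfs or pyarrow resolve the default filesystem.

```python
import io
import xml.etree.ElementTree as ET
from typing import Optional


def default_fs_from_xml(source) -> Optional[str]:
    """Return the fs.defaultFS value from a core-site.xml path or file object.

    Hypothetical helper for illustration; returns None if the property is unset.
    """
    root = ET.parse(source).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "fs.defaultFS":
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Assumed sample content, mirroring the property shown above.
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9030</value>
  </property>
</configuration>"""

print(default_fs_from_xml(io.StringIO(SAMPLE)))  # hdfs://localhost:9030
```

To check a real installation, pass the path to `etc/hadoop/core-site.xml` under `$HADOOP_HOME` instead of the in-memory sample.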
   
   The following program should give the same results for `example.py localhost 9030` and `example.py default 0` (the latter will get the host/port from the `core-site.xml` config file we edited above):
   
   ```python
   import pyarrow.fs
   import sys
   
   hdfs_interface = pyarrow.fs.HadoopFileSystem(host=sys.argv[1], port=int(sys.argv[2]))
   print("ls 1:")
   print(hdfs_interface.get_file_info("/"))
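   # List the entries under "/" (a FileSelector returns one FileInfo per entry).
   listing = hdfs_interface.get_file_info(pyarrow.fs.FileSelector("/"))
   print("ls 2:")
   print(*listing, sep="\n")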
   ```
   
   Thanks to @jwminton for figuring out the above.

