Posted to commits@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/12/08 00:28:00 UTC

[jira] [Commented] (BEAM-3099) Implement HDFS FileSystem for Python SDK

    [ https://issues.apache.org/jira/browse/BEAM-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16282803#comment-16282803 ] 

ASF GitHub Bot commented on BEAM-3099:
--------------------------------------

udim opened a new pull request #4233: [BEAM-3099] Initial implementation of HdFileSystem.
URL: https://github.com/apache/beam/pull/4233

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Implement HDFS FileSystem for Python SDK
> ----------------------------------------
>
>                 Key: BEAM-3099
>                 URL: https://issues.apache.org/jira/browse/BEAM-3099
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Chamikara Jayalath
>            Assignee: Udi Meiri
>
> Currently the Java SDK has HDFS support but the Python SDK does not. With current portability efforts, other runners may soon be able to use the Python SDK. Having HDFS support will allow these runners to execute large-scale jobs without using GCS.
> The following post suggests some libraries that can be used to connect to HDFS from Python:
> http://wesmckinney.com/blog/python-hdfs-interfaces/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)