Posted to issues@beam.apache.org by "Udi Meiri (Jira)" <ji...@apache.org> on 2021/02/04 17:50:00 UTC

[jira] [Commented] (BEAM-11750) Python HDFS: add Kerberos authentication support

    [ https://issues.apache.org/jira/browse/BEAM-11750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279007#comment-17279007 ] 

Udi Meiri commented on BEAM-11750:
----------------------------------

The flag idea above might work with DirectRunner, provided the environment has already been set up (by running kinit, etc.). The question is whether setting up the worker VM should be supported as well (more flags?) or left to the user (for example, via a custom setup.py). In any case, this should be documented.

> Python HDFS: add Kerberos authentication support
> ------------------------------------------------
>
>                 Key: BEAM-11750
>                 URL: https://issues.apache.org/jira/browse/BEAM-11750
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-hadoop
>            Reporter: Udi Meiri
>            Priority: P2
>              Labels: starter
>
> The HDFS client used by Beam supports Kerberos.
> Initial idea: add a flag --hdfs_client that defaults to "INSECURE" and also accepts "KERBEROS". This flag will control initialization of self._hdfs_client.
> HDFS client docs:
> https://hdfscli.readthedocs.io/en/latest/api.html#module-hdfs.ext.kerberos
> The HDFS client seems to use this Kerberos library:
> https://pypi.org/project/requests-kerberos/
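
A minimal sketch of the flag idea described above, not the actual Beam implementation. It assumes the flag is parsed with argparse and that the client is built from a URL plus an optional user name. The helper names (parse_hdfs_client_mode, create_hdfs_client, CLIENT_FACTORIES) are hypothetical. InsecureClient and KerberosClient come from the hdfs package, and the Kerberos variant needs the optional extra (pip install hdfs[kerberos]) as well as valid credentials, for example from a prior kinit.

```python
import argparse


def _insecure_factory(url, user):
    # Imported lazily so this module loads even without the hdfs package.
    from hdfs import InsecureClient
    return InsecureClient(url, user=user)


def _kerberos_factory(url, user):
    # Requires the optional extra: pip install hdfs[kerberos].
    # Assumption: credentials come from the local Kerberos setup
    # (e.g. a ticket obtained with kinit), so `user` is ignored here.
    from hdfs.ext.kerberos import KerberosClient
    return KerberosClient(url)


# Hypothetical mapping from flag value to client factory.
CLIENT_FACTORIES = {
    "INSECURE": _insecure_factory,
    "KERBEROS": _kerberos_factory,
}


def parse_hdfs_client_mode(argv):
    """Return the value of --hdfs_client, defaulting to INSECURE."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--hdfs_client",
        default="INSECURE",
        choices=sorted(CLIENT_FACTORIES),
        help="Authentication mode for the HDFS client.")
    # Other pipeline flags are left alone instead of raising an error.
    known, _ = parser.parse_known_args(argv)
    return known.hdfs_client


def create_hdfs_client(mode, url, user=None):
    """Build the client selected by the flag (stand-in for self._hdfs_client)."""
    return CLIENT_FACTORIES[mode](url, user)
```

With this shape, a call such as create_hdfs_client(parse_hdfs_client_mode(argv), url, user) would replace a hard-coded client constructor. It covers only the client selection; setting up Kerberos on worker VMs is a separate question.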


