Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:02:28 UTC

[jira] [Updated] (SPARK-20060) Support Standalone visiting secured HDFS

     [ https://issues.apache.org/jira/browse/SPARK-20060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-20060:
---------------------------------
    Labels: bulk-closed  (was: )

> Support Standalone visiting secured HDFS 
> -----------------------------------------
>
>                 Key: SPARK-20060
>                 URL: https://issues.apache.org/jira/browse/SPARK-20060
>             Project: Spark
>          Issue Type: New Feature
>          Components: Deploy, Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Kent Yao
>            Priority: Major
>              Labels: bulk-closed
>
> h1. Brief design
> h2. Introductions
> The basic issue for Standalone mode accessing Kerberos-secured HDFS or other kerberized services is how to obtain delegation tokens on the driver side and deliver them to the executor side.
> When we run Spark on Yarn, we set the tokens in the container launch context so they are delivered automatically. For long-running applications, where tokens expire, SPARK-14743 fixed the problem by writing the tokens to HDFS and renewing them periodically, updating the credential file each time.
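The credential-file rotation that SPARK-14743 introduced can be sketched as follows. This is a hypothetical illustration, not Spark's actual implementation: the driver writes a new time-stamped credential file to a shared directory (HDFS in the real design; a local directory here), and executors reload whichever file carries the highest epoch. All function and file names are illustrative.

```python
# Sketch of SPARK-14743-style credential-file rotation (names are illustrative).
import os
import tempfile

def write_credentials(cred_dir: str, epoch: int, tokens: bytes) -> str:
    """Driver side: persist freshly renewed tokens as credentials-<epoch>."""
    path = os.path.join(cred_dir, f"credentials-{epoch}")
    with open(path, "wb") as f:
        f.write(tokens)
    return path

def load_latest_credentials(cred_dir: str) -> bytes:
    """Executor side: load the file with the highest epoch suffix."""
    files = [f for f in os.listdir(cred_dir) if f.startswith("credentials-")]
    latest = max(files, key=lambda name: int(name.rsplit("-", 1)[1]))
    with open(os.path.join(cred_dir, latest), "rb") as f:
        return f.read()

cred_dir = tempfile.mkdtemp()
write_credentials(cred_dir, 1, b"token-v1")
write_credentials(cred_dir, 2, b"token-v2")  # a renewal produces a newer file
assert load_latest_credentials(cred_dir) == b"token-v2"
```

In the real scheme the renewal runs on a schedule derived from the tokens' expiry, and executors watch the directory rather than polling on demand.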
> When running Spark on Standalone, we currently have no comparable mechanism to obtain and deliver those tokens.
> h2. Implementations
> Firstly, we move the implementation of SPARK-14743, which is currently Yarn-only, to the core module. We use it both to gather the credentials we need and to update and renew the credential files on HDFS.
> Secondly, credential files on secured HDFS are not reachable for executors before they obtain the tokens. We therefore add a sequence configuration `spark.deploy.credential.entities`: the driver populates it with `token.encodeToUrlString()` before launching the executors, and each executor reads the credential strings while fetching the driver-side Spark properties and decodes them back into tokens. Before setting up the `CoarseGrainedExecutorBackend`, we add the credentials to the executor-side UGI.
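The proposed `spark.deploy.credential.entities` hand-off can be sketched roughly as below. This is a simplified stand-in, not the actual Hadoop API: URL-safe base64 encoding plays the role of `Token.encodeToUrlString()` on the driver and of `decodeFromUrlString()` plus adding the tokens to the executor's UGI on the other side; the property name comes from the proposal above, everything else is assumed.

```python
# Sketch of the proposed driver-to-executor token hand-off via a Spark
# property. base64 stands in for Hadoop's Token URL-string codec.
import base64

def driver_publish(tokens: list[bytes]) -> dict[str, str]:
    """Driver side: encode each token into the deploy property."""
    encoded = [base64.urlsafe_b64encode(t).decode("ascii") for t in tokens]
    return {"spark.deploy.credential.entities": ",".join(encoded)}

def executor_receive(props: dict[str, str]) -> list[bytes]:
    """Executor side: decode the strings fetched with the driver properties."""
    value = props.get("spark.deploy.credential.entities", "")
    return [base64.urlsafe_b64decode(s) for s in value.split(",") if s]

props = driver_publish([b"hdfs-delegation-token", b"hive-token"])
assert executor_receive(props) == [b"hdfs-delegation-token", b"hive-token"]
```

The round trip must happen before `CoarseGrainedExecutorBackend` starts, since the executor needs the tokens in its UGI before it can touch secured HDFS.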



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org