Posted to issues@spark.apache.org by "Saisai Shao (JIRA)" <ji...@apache.org> on 2016/07/05 01:59:11 UTC

[jira] [Updated] (SPARK-16342) Add a new Configurable Token Manager for Spark Running on YARN

     [ https://issues.apache.org/jira/browse/SPARK-16342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Saisai Shao updated SPARK-16342:
--------------------------------
    Description: 
Token management in Spark on YARN currently has several problems:

1. The supported services are hard-coded: tokens can be fetched only for HDFS, Hive and HBase. For any other third-party service that Spark must communicate with using Kerberos, the only option today is to modify Spark code.
2. The token renewal and update mechanism is also hard-coded, so third-party services cannot benefit from it and will fail once their tokens expire.
3. At the code level, the logic for obtaining and updating tokens is scattered across several places without a clear structure, which makes it hard to maintain and extend.

This issue therefore proposes a new Configurable Token Manager class to solve the problems above.

The proposal makes two changes:

1. Abstract a ServiceTokenProvider for each service. Providers are configurable and pluggable: by default there will be providers for HDFS, HBase and Hive, and users can add their own services through configuration. This interface offers a way to retrieve the tokens and the token renewal interval.

2. Provide a ConfigurableTokenManager that manages all registered token providers and exposes APIs for external modules to get and update tokens.
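
To make the two changes concrete, here is a minimal sketch of how the pieces could fit together. It is written in Python only for brevity; an actual implementation in Spark would presumably be in Scala. Only the names ServiceTokenProvider and ConfigurableTokenManager come from this proposal. Every method name, the registry dictionary, the fake providers, and the choice to report the shortest renewal interval are assumptions made for illustration and are not taken from the design doc.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable, Optional


class ServiceTokenProvider(ABC):
    """Hypothetical plug-in interface: one implementation per secured service."""

    @abstractmethod
    def obtain_tokens(self) -> Dict[str, bytes]:
        """Return freshly fetched tokens, keyed by an alias the provider picks."""

    def renewal_interval_ms(self) -> Optional[int]:
        """Return the token renewal interval, or None if it is unknown."""
        return None


class ConfigurableTokenManager:
    """Hypothetical manager: builds the configured providers and aggregates them."""

    def __init__(
        self,
        enabled: Iterable[str],
        registry: Dict[str, Callable[[], ServiceTokenProvider]],
    ) -> None:
        enabled = list(enabled)
        # Fail fast if configuration names a service with no registered provider.
        unknown = [name for name in enabled if name not in registry]
        if unknown:
            raise ValueError(f"no token provider registered for: {unknown}")
        self._providers = {name: registry[name]() for name in enabled}

    def obtain_tokens(self) -> Dict[str, bytes]:
        """Collect tokens from every enabled provider into one mapping."""
        tokens: Dict[str, bytes] = {}
        for provider in self._providers.values():
            tokens.update(provider.obtain_tokens())
        return tokens

    def next_renewal_ms(self) -> Optional[int]:
        """Assumed policy: renew as often as the shortest-lived token requires."""
        intervals = [p.renewal_interval_ms() for p in self._providers.values()]
        known = [i for i in intervals if i is not None]
        return min(known) if known else None


class FakeHdfsProvider(ServiceTokenProvider):
    """Stand-in for a built-in provider; returns a dummy token."""

    def obtain_tokens(self) -> Dict[str, bytes]:
        return {"hdfs": b"hdfs-token"}

    def renewal_interval_ms(self) -> Optional[int]:
        return 86_400_000  # one day


class FakeCustomProvider(ServiceTokenProvider):
    """Stand-in for a user-supplied provider added through configuration."""

    def obtain_tokens(self) -> Dict[str, bytes]:
        return {"my-service": b"custom-token"}

    def renewal_interval_ms(self) -> Optional[int]:
        return 3_600_000  # one hour


# The list of enabled services would come from Spark configuration in practice.
registry = {"hdfs": FakeHdfsProvider, "my-service": FakeCustomProvider}
manager = ConfigurableTokenManager(["hdfs", "my-service"], registry)
```

The point of the sketch is that supporting a new service only requires registering another provider and enabling it by name. The manager and its callers stay unchanged, which is what problems 1 and 2 above ask for.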

Details are in the design doc (https://docs.google.com/document/d/1piUvrQywWXiSwyZM9alN6ilrdlX9ohlNOuP4_Q3A6dc/edit?usp=sharing). Suggestions and comments are greatly appreciated.

  was:
Token management in Spark on YARN currently has several problems:

1. The supported services are hard-coded: tokens can be fetched only for HDFS, Hive and HBase. For any other third-party service that Spark must communicate with using Kerberos, the only option today is to modify Spark code.
2. The token renewal and update mechanism is also hard-coded, so third-party services cannot benefit from it and will fail once their tokens expire.
3. At the code level, the logic for obtaining and updating tokens is scattered across several places without a clear structure, which makes it hard to maintain and extend.

This issue therefore proposes a new Configurable Token Manager class to solve the problems above. The design doc is linked here (https://docs.google.com/document/d/1piUvrQywWXiSwyZM9alN6ilrdlX9ohlNOuP4_Q3A6dc/edit?usp=sharing). Suggestions and comments are greatly appreciated.


> Add a new Configurable Token Manager for Spark Running on YARN
> --------------------------------------------------------------
>
>                 Key: SPARK-16342
>                 URL: https://issues.apache.org/jira/browse/SPARK-16342
>             Project: Spark
>          Issue Type: New Feature
>          Components: YARN
>            Reporter: Saisai Shao


