Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2016/08/10 22:41:20 UTC

[jira] [Resolved] (SPARK-14743) Improve delegation token handling in secure clusters

     [ https://issues.apache.org/jira/browse/SPARK-14743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcelo Vanzin resolved SPARK-14743.
------------------------------------
       Resolution: Fixed
         Assignee: Saisai Shao
    Fix Version/s: 2.1.0

> Improve delegation token handling in secure clusters
> ----------------------------------------------------
>
>                 Key: SPARK-14743
>                 URL: https://issues.apache.org/jira/browse/SPARK-14743
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 2.0.0
>            Reporter: Marcelo Vanzin
>            Assignee: Saisai Shao
>             Fix For: 2.1.0
>
>
> In a way, I'd consider this a parent bug of SPARK-7252.
> Spark's current support for delegation tokens is a little all over the place:
> - for HDFS, there's support for re-creating tokens if a principal and keytab are provided (a sketch of that path follows this list)
> - for HBase and Hive, Spark will fetch delegation tokens so that apps can work in cluster mode, but will not re-create them, so apps that need those tokens stop working once the tokens expire (after 7 days by default)
> - for anything else, Spark does nothing. Lots of other services use delegation tokens, which makes supporting them as data sources in Spark more complicated; e.g., Kafka will (hopefully) soon support them.
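>
> For reference, the keytab path for HDFS boils down to something like the sketch below: log in from the keytab, then ask the NameNode for fresh tokens (the principal, keytab path, and renewer here are placeholders):
> {code:scala}
> import java.security.PrivilegedExceptionAction
>
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.fs.FileSystem
> import org.apache.hadoop.security.{Credentials, UserGroupInformation}
>
> object FetchHdfsTokensSketch {
>   def main(args: Array[String]): Unit = {
>     val conf = new Configuration()
>     // Log in from the keytab so tokens can be re-created at any time.
>     val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
>       "spark/host@EXAMPLE.COM", "/etc/security/keytabs/spark.keytab")
>     val creds = new Credentials()
>     ugi.doAs(new PrivilegedExceptionAction[Unit] {
>       override def run(): Unit = {
>         // Ask HDFS for delegation tokens, naming the YARN RM as renewer.
>         FileSystem.get(conf).addDelegationTokens("yarn/rm@EXAMPLE.COM", creds)
>       }
>     })
>     // `creds` now holds tokens that can be shipped to executors.
>   }
> }
> {code}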
> It would be nice if Spark had consistent support for handling delegation tokens regardless of who needs them. I'd list these as the requirements:
> - Spark should provide a generic interface for fetching delegation tokens, so that its delegation token support can be extended through a plugin architecture (e.g., Java's ServiceLoader mechanism) and Spark itself doesn't need to support every possible service out there (see the first sketch below).
> This interface would be used to fetch tokens when launching apps in cluster mode, and when a principal and a keytab are provided to Spark.
> - A way to manually update delegation tokens in Spark. For example, a new SparkContext API, or some configuration that tells Spark to monitor a file for changes and load tokens from it (see the second sketch below).
> This would allow external applications to manage tokens outside of Spark and update a running Spark application (think, for example, a job server like Oozie, or something like Hive-on-Spark, which manages Spark apps running remotely).
> - A way to notify running code that new delegation tokens have been loaded (see the third sketch below).
> This may not be strictly necessary; code might be able to detect it on its own, e.g. by peeking into the UserGroupInformation structure. But an event sent on the listener bus would let applications react when new tokens are available (e.g., the Hive backend could re-create connections to the metastore server using the new tokens).
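>
> To make the first requirement concrete: the plugin interface could look roughly like the strawman below (the trait and method names are made up here), with implementations discovered via Java's ServiceLoader:
> {code:scala}
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.security.Credentials
> import org.apache.spark.SparkConf
>
> // Strawman plugin contract: one implementation per service
> // (HDFS, Hive, HBase, Kafka, ...), discovered via ServiceLoader.
> trait CredentialProvider {
>   // Short name used in configuration, e.g. "hive" or "hbase".
>   def serviceName: String
>
>   // Whether this service needs tokens at all (security may be off).
>   def credentialsRequired(hadoopConf: Configuration): Boolean
>
>   // Fetch tokens into `creds`; optionally return the time (in ms) at
>   // which tokens should be re-created, so a renewer thread knows when
>   // to run again.
>   def obtainCredentials(
>       hadoopConf: Configuration,
>       sparkConf: SparkConf,
>       creds: Credentials): Option[Long]
> }
> {code}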
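>
> The second requirement could reuse Hadoop's token storage format: the external manager writes tokens out with Credentials.writeTokenStorageFile, and Spark reloads the file when it changes. A minimal sketch of the reload side:
> {code:scala}
> import java.io.File
>
> import org.apache.hadoop.conf.Configuration
> import org.apache.hadoop.security.{Credentials, UserGroupInformation}
>
> object TokenFileReloader {
>   // Reload tokens written out by an external manager (e.g. Oozie).
>   def reloadTokens(tokenFile: File, conf: Configuration): Unit = {
>     if (tokenFile.exists()) {
>       // Same format as written by Credentials.writeTokenStorageFile.
>       val creds = Credentials.readTokenStorageFile(tokenFile, conf)
>       // Merge the fresh tokens into the current user's credentials.
>       UserGroupInformation.getCurrentUser.addCredentials(creds)
>     }
>   }
> }
> {code}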
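>
> And the third requirement could be a new listener bus event; strawman below (the event class doesn't exist today, but SparkListener.onOtherEvent is already the hook for delivering custom events):
> {code:scala}
> import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent}
>
> // Strawman event Spark would post when new tokens are loaded.
> case class SparkListenerTokensUpdated(updateTimeMs: Long)
>   extends SparkListenerEvent
>
> class TokenAwareListener extends SparkListener {
>   override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
>     case SparkListenerTokensUpdated(time) =>
>       // e.g. the Hive backend could re-create metastore connections here.
>       println(s"New delegation tokens available as of $time")
>     case _ => // ignore other events
>   }
> }
> {code}
> An app would register it with sc.addSparkListener(new TokenAwareListener).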
> Also, cc'ing [~busbey] and [~steve_l] since you've talked about this in the mailing list recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org