Posted to issues@spark.apache.org by "Marcelo Vanzin (JIRA)" <ji...@apache.org> on 2016/01/13 20:36:39 UTC

[jira] [Comment Edited] (SPARK-12646) Support _HOST in kerberos principal for connecting to secure cluster

    [ https://issues.apache.org/jira/browse/SPARK-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092460#comment-15092460 ] 

Marcelo Vanzin edited comment on SPARK-12646 at 1/13/16 7:36 PM:
-----------------------------------------------------------------

Can you convince people to at least use proper credentials to launch the Spark jobs instead of reusing YARN's?

I'm a little wary of adding this feature just to support a broken use case. When running on YARN, Spark is a user application, and you're asking for Spark to authenticate using service principals. That's kinda wrong, even if it works.

Your code also has a huge problem in that it uses {{InetAddress.getLocalHost}}; even if this were a desirable feature, there's no guarantee that's the correct host to use at all. On multi-homed machines, for example, which should be the address to use when expanding the principal template?
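
To make the multi-homed concern concrete, here is a minimal Java sketch of expanding the template with a hostname the caller supplies explicitly, instead of guessing with {{InetAddress.getLocalHost}}. It is only an illustration under stated assumptions: {{PrincipalTemplate}} and {{expand}} are hypothetical names, not Spark or Hadoop APIs, and lowercasing the hostname is an assumed convention.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Spark or Hadoop. */
final class PrincipalTemplate {
    private static final String HOST_TOKEN = "_HOST";

    private PrincipalTemplate() {}

    /**
     * Expands "primary/_HOST@REALM" using a hostname chosen by the caller.
     * Principals that do not have _HOST as their instance part are returned unchanged.
     */
    static String expand(String principal, String hostname) {
        // Split "primary/instance@REALM" into its three components.
        String[] parts = principal.split("[/@]");
        if (parts.length != 3 || !HOST_TOKEN.equals(parts[1])) {
            return principal;
        }
        // Refuse to guess: on a multi-homed machine there is no single right answer.
        if (hostname == null || hostname.isEmpty()) {
            throw new IllegalArgumentException(
                "an explicit hostname is required to expand " + HOST_TOKEN);
        }
        // Assumption: the hostname is lowercased before substitution.
        return parts[0] + "/" + hostname.toLowerCase(Locale.ROOT) + "@" + parts[2];
    }

    public static void main(String[] args) {
        System.out.println(expand("spark/_HOST@EXAMPLE.COM", "Node1.Example.COM"));
    }
}
```

Taking the hostname as a parameter pushes the choice of address to whoever configures the job, which is exactly the decision {{InetAddress.getLocalHost}} cannot make on its own.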

Your application can also log in to Kerberos before launching the Spark job: run kinit yourself and then launch Spark without {{--principal}} or {{--keytab}}. Spark then doesn't need to do anything; it simply inherits the Kerberos ticket from your app.
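
A rough Java sketch of that flow, in case it helps. The class name, method names, and command-line arguments are hypothetical, and it assumes {{kinit}} and {{spark-submit}} are on the PATH and that the child process sees the same default ticket cache as the launcher.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical launcher for illustration only; not part of Spark. */
final class KinitThenSubmit {
    private KinitThenSubmit() {}

    /** Command that obtains a ticket from a keytab. */
    static List<String> kinitCommand(String keytab, String principal) {
        return Arrays.asList("kinit", "-k", "-t", keytab, principal);
    }

    /** spark-submit command that deliberately omits --principal and --keytab. */
    static List<String> sparkSubmitCommand(String mainClass, String appJar) {
        return Arrays.asList(
            "spark-submit", "--master", "yarn", "--deploy-mode", "cluster",
            "--class", mainClass, appJar);
    }

    /** Runs a command, forwarding its output, and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println("usage: KinitThenSubmit <keytab> <principal> <mainClass> <appJar>");
            return;
        }
        // Log in first; the ticket lands in the user's ticket cache.
        if (run(kinitCommand(args[0], args[1])) != 0) {
            throw new IllegalStateException("kinit failed");
        }
        // Spark picks up the existing ticket, so no credentials are passed here.
        int exit = run(sparkSubmitCommand(args[2], args[3]));
        System.out.println("spark-submit exited with " + exit);
    }
}
```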


> Support _HOST in kerberos principal for connecting to secure cluster
> --------------------------------------------------------------------
>
>                 Key: SPARK-12646
>                 URL: https://issues.apache.org/jira/browse/SPARK-12646
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>            Reporter: Hari Krishna Dara
>            Priority: Minor
>              Labels: security
>
> Hadoop supports _HOST as a token that is dynamically replaced with the actual hostname at the time Kerberos authentication is performed. This is supported in many Hadoop stacks, including YARN. When configuring Spark to connect to a secure cluster (e.g., with yarn-cluster or yarn-client as master), it would be natural to extend support for this token to Spark as well.


