You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by vanzin <gi...@git.apache.org> on 2018/11/28 21:25:37 UTC

[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/23174

    [SPARK-26194][k8s] Auto generate auth secret for k8s apps.

    This change modifies the logic in the SecurityManager to do two
    things:
    
    - generate unique app secrets also when k8s is being used
    - only store the secret in the user's UGI on YARN
    
    The latter is needed so that k8s won't unnecessarily create
    k8s secrets for the UGI credentials when only the auth token
    is stored there.
    
    On the k8s side, the secret is propagated to executors using
    an environment variable instead. This ensures it works in both
    client and cluster mode.
    
    Security doc was updated to mention the feature and clarify that
    proper access control in k8s should be enabled for it to be secure.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-26194

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23174.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23174
    
----
commit 0e36a4bb4a5a1ad9abee7e003b7d5f3588cba126
Author: Marcelo Vanzin <va...@...>
Date:   2018-11-16T23:21:00Z

    [SPARK-26194][k8s] Auto generate auth secret for k8s apps.
    
    This change modifies the logic in the SecurityManager to do two
    things:
    
    - generate unique app secrets also when k8s is being used
    - only store the secret in the user's UGI on YARN
    
    The latter is needed so that k8s won't unnecessarily create
    k8s secrets for the UGI credentials when only the auth token
    is stored there.
    
    On the k8s side, the secret is propagated to executors using
    an environment variable instead. This ensures it works in both
    client and cluster mode.
    
    Security doc was updated to mention the feature and clarify that
    proper access control in k8s should be enabled for it to be secure.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Ok that's fine, with the caveat that we merge the subsequent optionality soon. I'll work on the file-based secret authentication and encryption this week. I'm very concerned that we'll ship with this but with no other security options if we're not rigorously moving SPARK-26239 forward.
    
    Merging to master in a few hours, letting it stay open for a bit for any other commentary. @gdearment for SA.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    I filed SPARK-26239.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > via a mounted file
    > Also the user should be able to specify their own mounted file
    
    The point is that the user shouldn't need to set this at all. You enable auth, Spark takes care of it. There's no point in specifying a pre-defined file with the secret - in fact that makes things *less* secure, because you'd be reusing the secret in different apps.
    
    > Some users would prefer to avoid propagating sensitive information via environment variables
    
    Why? And how are mounted files better?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    It's just to have the assurance that we will have some way to bypass this for auth at least for 3.x. I'd like to concretely determine this before merging if possible. But I hope that the suggestion proposed in SPARK-26239 could be agreed upon fairly quickly?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by gdearment <gi...@git.apache.org>.
Github user gdearment commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    The issue with requiring the use of secrets is quite a bit of work must be done in order to secure a cluster to ensure that the secrets are themselves secured. Most of the high level concerns are outlined [here](https://kubernetes.io/docs/concepts/configuration/secret/#risks), but you then need to understand the implications of this in the context of how Spark is used. Its my understanding that Spark's implementation on Kubernetes doesn’t dictate how spark jobs should be run. As an example of where this can quickly break down:
    
    - Imagine a scenario where someone runs two drivers in the same namespace, and rely on the default svc token to allow the driver to create its executors.
    - The driver token will have GET access on the secret for its driver, but also other secrets within the same namespace – such as the secret for driver B.
    - If the executors in job B can use the same service token as their driver, then they have the ability of launching executors that can use the secret from job A to connect an executor at which point you have a compromise.
     
    There are lots of things that can be done to prevent this kind of attack – run in different namespaces, don’t allow executors to have k8s permissions, etc., but it feels risky for spark to require the use of secrets while leaving it an exercise for the reader to understand how to properly run spark such that the secrets are actually secured.
    
    At Palantir, we rely heavily on an external secret storage and bootstrapping system, built around Vault, to securely pass secrets into pods. Using external systems for these kinds of things, especially those built around Vault, is a [common practice](https://www.google.com/search?q=kubernetes+secrets+with+vault). 
    
    So really, what we'd like here is to avoid coupling the API for how spark containers read secrets so that alternative implementations can be used that some might consider more secure.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    As I suggested before, any alternative method can be added later. I don't see why does it need to block this PR.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    It matters because we're discussing direction - that is, what opinion Spark wants to take regarding how to set up security on K8s. It's not obvious from our discussion on SPARK-26239 that we agree that we should allow such optionality for other authentication schemes. In other words, if we just merge this PR without further discussion and consensus on SPARK-26239, we're effectively communicating that Spark is locked in to the authentication backed by K8s secrets. I want to emphasize that it's important to agree on the direction for the bigger picture early on, and then we say that this patch still fits into the bigger vision.
    
    I also want to intend to take this patch and check that work on SPARK-26239 would work nicely with it, but to the best of my knowledge the additional options should layer on top of this default one just fine. Would like some concrete prototyping to confirm this though.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5687/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5691/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > A proposed scheme is to have spark.authenticate.k8s.secret.provider=autok8ssecret
    
    If you're going to add a different way to get the auth secret later, then you can introduce that option with a default value. It does not mean it needs to be done *in this change*, which is my point.
    
    The only real change being introduced here form the Spark status quo is that you don't need to provide your own auth secret in the configuration (i.e. it would work like YARN), which doesn't even work in k8s today because that is not propagated to the executors in any way. And if you think that is a problem I can gate the generation of the secret based on whether one is already defined in the config.
    
    > if it's not introduced in this patch, then at least we should file JIRA tickets 
    
    That is fine with me. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    **[Test build #99634 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99634/testReport)** for PR 23174 at commit [`791b5ee`](https://github.com/apache/spark/commit/791b5ee88d2f3c856aad640f662472124e098c66).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Would it be possible to also provide support for passing this via a mounted file? Some users would prefer to avoid propagating sensitive information via environment variables. Also the user should be able to specify their own mounted file; spark-submit shouldn't always mount an auto-generated secret for the user.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > while leaving it an exercise for the reader to understand how to properly run spark such that the secrets are actually secured.
    
    I don't think that's an exercise for the user, but for the admin. If the admin has configured things properly, the user will be able to deploy secure applications without issue. This is not just the case in kubernetes. It's the case in any deployment.
    
    Even your example of using Vault requires that. There's work needed to use it properly and securely.
    
    > avoid coupling the API for how spark containers read secrets
    
    There doesn't need to be a single solution. This patch going in does not preclude adding more features later, one of which might be reading this from a pre-defined secret.
    
    And in your example I don't see how Vault would actually solve the client-mode driver case. There's no kubernetes involved in the driver, which may be running anywhere, and it needs to know the secret.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    I looked at the test failure, but the logs weren't super useful. This passed locally, but let me retrigger here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Ok that's fine. Will merge to master if there are no further comments in the near future.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5691/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23174#discussion_r239894356
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -87,44 +88,61 @@ private[spark] class BasicExecutorFeatureStep(kubernetesConf: KubernetesExecutor
         val executorCpuQuantity = new QuantityBuilder(false)
           .withAmount(executorCoresRequest)
           .build()
    -    val executorExtraClasspathEnv = executorExtraClasspath.map { cp =>
    -      new EnvVarBuilder()
    -        .withName(ENV_CLASSPATH)
    -        .withValue(cp)
    -        .build()
    -    }
    -    val executorExtraJavaOptionsEnv = kubernetesConf
    -      .get(EXECUTOR_JAVA_OPTIONS)
    -      .map { opts =>
    -        val subsOpts = Utils.substituteAppNExecIds(opts, kubernetesConf.appId,
    -          kubernetesConf.executorId)
    -        val delimitedOpts = Utils.splitCommandString(subsOpts)
    -        delimitedOpts.zipWithIndex.map {
    -          case (opt, index) =>
    -            new EnvVarBuilder().withName(s"$ENV_JAVA_OPT_PREFIX$index").withValue(opt).build()
    +
    +    val executorEnv: Seq[EnvVar] = {
    +        (Seq(
    +          (ENV_DRIVER_URL, driverUrl),
    +          (ENV_EXECUTOR_CORES, executorCores.toString),
    +          (ENV_EXECUTOR_MEMORY, executorMemoryString),
    +          (ENV_APPLICATION_ID, kubernetesConf.appId),
    +          // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    +          (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    +          (ENV_EXECUTOR_ID, kubernetesConf.executorId)
    +        ) ++ kubernetesConf.environment).map { case (k, v) =>
    +          new EnvVarBuilder()
    +            .withName(k)
    +            .withValue(v)
    +            .build()
             }
    -      }.getOrElse(Seq.empty[EnvVar])
    -    val executorEnv = (Seq(
    -      (ENV_DRIVER_URL, driverUrl),
    -      (ENV_EXECUTOR_CORES, executorCores.toString),
    -      (ENV_EXECUTOR_MEMORY, executorMemoryString),
    -      (ENV_APPLICATION_ID, kubernetesConf.appId),
    -      // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    -      (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    -      (ENV_EXECUTOR_ID, kubernetesConf.executorId)) ++
    -      kubernetesConf.environment)
    -      .map(env => new EnvVarBuilder()
    -        .withName(env._1)
    -        .withValue(env._2)
    -        .build()
    -      ) ++ Seq(
    -      new EnvVarBuilder()
    -        .withName(ENV_EXECUTOR_POD_IP)
    -        .withValueFrom(new EnvVarSourceBuilder()
    -          .withNewFieldRef("v1", "status.podIP")
    +      } ++ {
    +        Seq(new EnvVarBuilder()
    +          .withName(ENV_EXECUTOR_POD_IP)
    +          .withValueFrom(new EnvVarSourceBuilder()
    +            .withNewFieldRef("v1", "status.podIP")
    +            .build())
               .build())
    -        .build()
    -    ) ++ executorExtraJavaOptionsEnv ++ executorExtraClasspathEnv.toSeq
    +      } ++ {
    +        Option(secMgr.getSecretKey()).map { authSecret =>
    +          new EnvVarBuilder()
    +            .withName(SecurityManager.ENV_AUTH_SECRET)
    +            .withValue(authSecret)
    --- End diff --
    
    > Anyway, this isn't different from someone else being able to read secrets in the same namespace as the pod.
    
    It isn't in theory, but in practice my understanding is that secrets are often permissioned very differently from pod objects in the cluster. We should be optimizing for the more common use case, which will work out of the box for more users and also is more secure in the context of more common configurations.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99629/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/23174


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > with the caveat that we merge the subsequent optionality soon
    
    Again, and sorry for pounding on that key, but why does that matter? It has zero effect on the feature being added here. If the code added here is not good enough for your use case, you're in the exact same situation as if this change did not go in. But for those that can leverage the auth feature as added in this change, they're in a much, much better place.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    **[Test build #99402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99402/testReport)** for PR 23174 at commit [`0e36a4b`](https://github.com/apache/spark/commit/0e36a4bb4a5a1ad9abee7e003b7d5f3588cba126).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5691/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    **[Test build #99629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99629/testReport)** for PR 23174 at commit [`791b5ee`](https://github.com/apache/spark/commit/791b5ee88d2f3c856aad640f662472124e098c66).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5687/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99402/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    So, can we move forward with this and let any future new feature be handled in SPARK-26239?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    **[Test build #99634 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99634/testReport)** for PR 23174 at commit [`791b5ee`](https://github.com/apache/spark/commit/791b5ee88d2f3c856aad640f662472124e098c66).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > Why? And how are mounted files better?
    
    Environment variables leak far more easily than file contents. One can accidentally `printenv` in a shell attached to the and get the secret contents. `System.getenv` has a similar effect within the application code itself. For what it's worth I'm also not sure if the secret would be listed under the environment variables in the Spark UI (would have to test this).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23174#discussion_r239694862
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -87,44 +88,61 @@ private[spark] class BasicExecutorFeatureStep(kubernetesConf: KubernetesExecutor
         val executorCpuQuantity = new QuantityBuilder(false)
           .withAmount(executorCoresRequest)
           .build()
    -    val executorExtraClasspathEnv = executorExtraClasspath.map { cp =>
    -      new EnvVarBuilder()
    -        .withName(ENV_CLASSPATH)
    -        .withValue(cp)
    -        .build()
    -    }
    -    val executorExtraJavaOptionsEnv = kubernetesConf
    -      .get(EXECUTOR_JAVA_OPTIONS)
    -      .map { opts =>
    -        val subsOpts = Utils.substituteAppNExecIds(opts, kubernetesConf.appId,
    -          kubernetesConf.executorId)
    -        val delimitedOpts = Utils.splitCommandString(subsOpts)
    -        delimitedOpts.zipWithIndex.map {
    -          case (opt, index) =>
    -            new EnvVarBuilder().withName(s"$ENV_JAVA_OPT_PREFIX$index").withValue(opt).build()
    +
    +    val executorEnv: Seq[EnvVar] = {
    +        (Seq(
    +          (ENV_DRIVER_URL, driverUrl),
    +          (ENV_EXECUTOR_CORES, executorCores.toString),
    +          (ENV_EXECUTOR_MEMORY, executorMemoryString),
    +          (ENV_APPLICATION_ID, kubernetesConf.appId),
    +          // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    +          (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    +          (ENV_EXECUTOR_ID, kubernetesConf.executorId)
    +        ) ++ kubernetesConf.environment).map { case (k, v) =>
    +          new EnvVarBuilder()
    +            .withName(k)
    +            .withValue(v)
    +            .build()
             }
    -      }.getOrElse(Seq.empty[EnvVar])
    -    val executorEnv = (Seq(
    -      (ENV_DRIVER_URL, driverUrl),
    -      (ENV_EXECUTOR_CORES, executorCores.toString),
    -      (ENV_EXECUTOR_MEMORY, executorMemoryString),
    -      (ENV_APPLICATION_ID, kubernetesConf.appId),
    -      // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    -      (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    -      (ENV_EXECUTOR_ID, kubernetesConf.executorId)) ++
    -      kubernetesConf.environment)
    -      .map(env => new EnvVarBuilder()
    -        .withName(env._1)
    -        .withValue(env._2)
    -        .build()
    -      ) ++ Seq(
    -      new EnvVarBuilder()
    -        .withName(ENV_EXECUTOR_POD_IP)
    -        .withValueFrom(new EnvVarSourceBuilder()
    -          .withNewFieldRef("v1", "status.podIP")
    +      } ++ {
    +        Seq(new EnvVarBuilder()
    +          .withName(ENV_EXECUTOR_POD_IP)
    +          .withValueFrom(new EnvVarSourceBuilder()
    +            .withNewFieldRef("v1", "status.podIP")
    +            .build())
               .build())
    -        .build()
    -    ) ++ executorExtraJavaOptionsEnv ++ executorExtraClasspathEnv.toSeq
    +      } ++ {
    +        Option(secMgr.getSecretKey()).map { authSecret =>
    +          new EnvVarBuilder()
    +            .withName(SecurityManager.ENV_AUTH_SECRET)
    +            .withValue(authSecret)
    --- End diff --
    
    Ah I thought about this a bit more and realized that this is more insecure than I originally read it to be.
    
    If the secret is put directly in the environment variable field itself, then anyone who has permission to get the pod metadata from the Kubernetes API server can now read the secret generated by this application. In practice permissioning on pod specs is often far looser than permissioning on Kubernetes secret objects. In this solution the administrator has to restrict access to pod specs to only the user.
    
    I think at the very least we want this to be configured via creating a Kubernetes secret object, then [loading the environment variable](https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-environment-variables) to point to the secret object.
    
    In the meantime I'm going to push the PR that allows secrets to be specified as file paths directly. I will also file a Spark ticket to avoid putting the environment variable directly in the pod spec object itself.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    **[Test build #99402 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99402/testReport)** for PR 23174 at commit [`0e36a4b`](https://github.com/apache/spark/commit/0e36a4bb4a5a1ad9abee7e003b7d5f3588cba126).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23174#discussion_r239895212
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -87,44 +88,61 @@ private[spark] class BasicExecutorFeatureStep(kubernetesConf: KubernetesExecutor
         val executorCpuQuantity = new QuantityBuilder(false)
           .withAmount(executorCoresRequest)
           .build()
    -    val executorExtraClasspathEnv = executorExtraClasspath.map { cp =>
    -      new EnvVarBuilder()
    -        .withName(ENV_CLASSPATH)
    -        .withValue(cp)
    -        .build()
    -    }
    -    val executorExtraJavaOptionsEnv = kubernetesConf
    -      .get(EXECUTOR_JAVA_OPTIONS)
    -      .map { opts =>
    -        val subsOpts = Utils.substituteAppNExecIds(opts, kubernetesConf.appId,
    -          kubernetesConf.executorId)
    -        val delimitedOpts = Utils.splitCommandString(subsOpts)
    -        delimitedOpts.zipWithIndex.map {
    -          case (opt, index) =>
    -            new EnvVarBuilder().withName(s"$ENV_JAVA_OPT_PREFIX$index").withValue(opt).build()
    +
    +    val executorEnv: Seq[EnvVar] = {
    +        (Seq(
    +          (ENV_DRIVER_URL, driverUrl),
    +          (ENV_EXECUTOR_CORES, executorCores.toString),
    +          (ENV_EXECUTOR_MEMORY, executorMemoryString),
    +          (ENV_APPLICATION_ID, kubernetesConf.appId),
    +          // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    +          (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    +          (ENV_EXECUTOR_ID, kubernetesConf.executorId)
    +        ) ++ kubernetesConf.environment).map { case (k, v) =>
    +          new EnvVarBuilder()
    +            .withName(k)
    +            .withValue(v)
    +            .build()
             }
    -      }.getOrElse(Seq.empty[EnvVar])
    -    val executorEnv = (Seq(
    -      (ENV_DRIVER_URL, driverUrl),
    -      (ENV_EXECUTOR_CORES, executorCores.toString),
    -      (ENV_EXECUTOR_MEMORY, executorMemoryString),
    -      (ENV_APPLICATION_ID, kubernetesConf.appId),
    -      // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    -      (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    -      (ENV_EXECUTOR_ID, kubernetesConf.executorId)) ++
    -      kubernetesConf.environment)
    -      .map(env => new EnvVarBuilder()
    -        .withName(env._1)
    -        .withValue(env._2)
    -        .build()
    -      ) ++ Seq(
    -      new EnvVarBuilder()
    -        .withName(ENV_EXECUTOR_POD_IP)
    -        .withValueFrom(new EnvVarSourceBuilder()
    -          .withNewFieldRef("v1", "status.podIP")
    +      } ++ {
    +        Seq(new EnvVarBuilder()
    +          .withName(ENV_EXECUTOR_POD_IP)
    +          .withValueFrom(new EnvVarSourceBuilder()
    +            .withNewFieldRef("v1", "status.podIP")
    +            .build())
               .build())
    -        .build()
    -    ) ++ executorExtraJavaOptionsEnv ++ executorExtraClasspathEnv.toSeq
    +      } ++ {
    +        Option(secMgr.getSecretKey()).map { authSecret =>
    +          new EnvVarBuilder()
    +            .withName(SecurityManager.ENV_AUTH_SECRET)
    +            .withValue(authSecret)
    --- End diff --
    
    >  the more common use case,
    
    Which is?
    
    There's a lot to think about when you give permissions like "users can view, create and delete pods". If you do that, for example, you can delete other people's pods. That is also considered a security issue, since you can DoS other users.
    
    Anyway, my point is that we should give people the choice of how they deploy things, and set up security according to their own constraints. This was just one way of doing it, and was not meant to be the only way.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    I don't understand what you want.
    
    Without this change, auth does not work, period.
    
    With this, users at least have one choice.
    
    If you want to add another choice, you're free to. But I don't see why the lack of another choice has any effect on this PR.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Kubernetes integration test starting
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5476/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    (In fact, env variables don't even show up in the UI or event logs, as far as I can see. Other configs - Spark config, system properties, e.g. - do show up, and are redacted to mask secrets.)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    **[Test build #99629 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99629/testReport)** for PR 23174 at commit [`791b5ee`](https://github.com/apache/spark/commit/791b5ee88d2f3c856aad640f662472124e098c66).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23174#discussion_r239892984
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -87,44 +88,61 @@ private[spark] class BasicExecutorFeatureStep(kubernetesConf: KubernetesExecutor
         val executorCpuQuantity = new QuantityBuilder(false)
           .withAmount(executorCoresRequest)
           .build()
    -    val executorExtraClasspathEnv = executorExtraClasspath.map { cp =>
    -      new EnvVarBuilder()
    -        .withName(ENV_CLASSPATH)
    -        .withValue(cp)
    -        .build()
    -    }
    -    val executorExtraJavaOptionsEnv = kubernetesConf
    -      .get(EXECUTOR_JAVA_OPTIONS)
    -      .map { opts =>
    -        val subsOpts = Utils.substituteAppNExecIds(opts, kubernetesConf.appId,
    -          kubernetesConf.executorId)
    -        val delimitedOpts = Utils.splitCommandString(subsOpts)
    -        delimitedOpts.zipWithIndex.map {
    -          case (opt, index) =>
    -            new EnvVarBuilder().withName(s"$ENV_JAVA_OPT_PREFIX$index").withValue(opt).build()
    +
    +    val executorEnv: Seq[EnvVar] = {
    +        (Seq(
    +          (ENV_DRIVER_URL, driverUrl),
    +          (ENV_EXECUTOR_CORES, executorCores.toString),
    +          (ENV_EXECUTOR_MEMORY, executorMemoryString),
    +          (ENV_APPLICATION_ID, kubernetesConf.appId),
    +          // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    +          (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    +          (ENV_EXECUTOR_ID, kubernetesConf.executorId)
    +        ) ++ kubernetesConf.environment).map { case (k, v) =>
    +          new EnvVarBuilder()
    +            .withName(k)
    +            .withValue(v)
    +            .build()
             }
    -      }.getOrElse(Seq.empty[EnvVar])
    -    val executorEnv = (Seq(
    -      (ENV_DRIVER_URL, driverUrl),
    -      (ENV_EXECUTOR_CORES, executorCores.toString),
    -      (ENV_EXECUTOR_MEMORY, executorMemoryString),
    -      (ENV_APPLICATION_ID, kubernetesConf.appId),
    -      // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    -      (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    -      (ENV_EXECUTOR_ID, kubernetesConf.executorId)) ++
    -      kubernetesConf.environment)
    -      .map(env => new EnvVarBuilder()
    -        .withName(env._1)
    -        .withValue(env._2)
    -        .build()
    -      ) ++ Seq(
    -      new EnvVarBuilder()
    -        .withName(ENV_EXECUTOR_POD_IP)
    -        .withValueFrom(new EnvVarSourceBuilder()
    -          .withNewFieldRef("v1", "status.podIP")
    +      } ++ {
    +        Seq(new EnvVarBuilder()
    +          .withName(ENV_EXECUTOR_POD_IP)
    +          .withValueFrom(new EnvVarSourceBuilder()
    +            .withNewFieldRef("v1", "status.podIP")
    +            .build())
               .build())
    -        .build()
    -    ) ++ executorExtraJavaOptionsEnv ++ executorExtraClasspathEnv.toSeq
    +      } ++ {
    +        Option(secMgr.getSecretKey()).map { authSecret =>
    +          new EnvVarBuilder()
    +            .withName(SecurityManager.ENV_AUTH_SECRET)
    +            .withValue(authSecret)
    --- End diff --
    
    > If the secret is put directly in the environment variable field itself, then anyone who has permission to get the pod metadata from the Kubernetes API server can now read the secret generated by this application.
    
    Yes, and it's extremely annoying that k8s allows anybody with access to the pods to read env variables, instead of just the pod owner. In fact, it doesn't even seem to have the concept of who owns the pod.
    
    Anyway, this isn't different from someone else being able to read secrets in the same namespace as the pod.
    
    As I said before, it all depends on how you configure your cluster for security, and in k8s there seems to be a lot of different options.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > The way it's written now
    
    Code can change after it's written...
    
    >  If this change is merged into 3.x without any other changes, users will be forced to use the K8s secret based 
    
    If this change is not merged, users have no way to use authentication at all. So I don't see your point.
    
    It will prevent you from using the Vault approach in the sense that support for that won't be implemented yet. But this change, again, does not put a block on adding support for what you're saying later. If you think that's important and want to do that in 3.x, then you're free to. But I don't see why this approach needs to be blocked because it doesn't cover the Vault use case.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    The trouble is the API proposed here and how it would have to change for future features. If we wanted to add the optionality to support authentication via mounted files later, then what's the API for that, and how would that change the API for users that were relying on this authentication mechanism? That's why it's important to see the optionality now, so it can be clear to us that <X> are our options, and this is how we are going to use them.
    
    A proposed scheme is to have `spark.authenticate.k8s.secret.provider=autok8ssecret`, then document what that does. Perhaps that's the default mode. Then add another scheme, say `spark.authenticate.k8s.secret.provider=files` and then further options for specifying where that file is located on both the driver and the executors.
    
    It's helpful to put this patch in the context of where we want to go for authentication in general moving forward. Otherwise this feature taken in isolation will make it appear that Spark is being opinionated about using Kubernetes secrets and environment variables for authentication.
    
    if it's not introduced in this patch, then at least we should file JIRA tickets and reference them as future add-ons to this and have a roadmap for what SASL on K8s will look like in the bigger picture for 3.x.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5476/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99634/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Kubernetes integration test status success
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5476/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > It matters because we're discussing direction
    
    I'm not, you guys are. I'm adding a missing feature with one particular implementation. If you want to add other implementations that enable different use cases, great.
    
    > we're effectively communicating that Spark is locked in to the authentication backed by K8s secrets
    
    We're not locking into anything, and that's basically where I strongly disagree with you. You're free to add new ways, and when that's done, you're not "locked in" anymore.
    
    Locked in would mean that pushing this PR means you cannot make changes to it later, and that's just not true.
    
    Right now you're "locked in" to no auth at all, but somehow that's ok?
    
    > check that work on SPARK-26239 would work nicely with it
    
    Anything needed to implement that feature is just code changes. Whether it "works nicely" is just a matter of not breaking this when that feature is implemented.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    I think as long as we have one alternate mechanism proposed in SPARK-26239 this is ok to merge. I proposed one in [this comment](https://issues.apache.org/jira/browse/SPARK-26239?focusedCommentId=16705273&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16705273).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    Kubernetes integration test status failure
    URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/5687/



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    > There doesn't need to be a single solution. This patch going in does not preclude adding more features later, one of which might be reading this from a pre-defined secret.
    
    The way it's written now, if the user opts-in to using `spark.authenticate` with Kubernetes, the application will _always_ automatically generate the secret and use that as the security mechanism. I think we'd prefer to see the various options that are available up front and this patch should probably introduce both the automatic secret creation version (if we agree that this is secure) and the manual provision version. If this change is merged into 3.x without any other changes, users will be forced to use the K8s secret based SASL scheme and this feature will be unusable for other users such as with the vault case discussed above.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #23174: [SPARK-26194][k8s] Auto generate auth secret for ...

Posted by mccheah <gi...@git.apache.org>.
Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23174#discussion_r239702559
  
    --- Diff: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala ---
    @@ -87,44 +88,61 @@ private[spark] class BasicExecutorFeatureStep(kubernetesConf: KubernetesExecutor
         val executorCpuQuantity = new QuantityBuilder(false)
           .withAmount(executorCoresRequest)
           .build()
    -    val executorExtraClasspathEnv = executorExtraClasspath.map { cp =>
    -      new EnvVarBuilder()
    -        .withName(ENV_CLASSPATH)
    -        .withValue(cp)
    -        .build()
    -    }
    -    val executorExtraJavaOptionsEnv = kubernetesConf
    -      .get(EXECUTOR_JAVA_OPTIONS)
    -      .map { opts =>
    -        val subsOpts = Utils.substituteAppNExecIds(opts, kubernetesConf.appId,
    -          kubernetesConf.executorId)
    -        val delimitedOpts = Utils.splitCommandString(subsOpts)
    -        delimitedOpts.zipWithIndex.map {
    -          case (opt, index) =>
    -            new EnvVarBuilder().withName(s"$ENV_JAVA_OPT_PREFIX$index").withValue(opt).build()
    +
    +    val executorEnv: Seq[EnvVar] = {
    +        (Seq(
    +          (ENV_DRIVER_URL, driverUrl),
    +          (ENV_EXECUTOR_CORES, executorCores.toString),
    +          (ENV_EXECUTOR_MEMORY, executorMemoryString),
    +          (ENV_APPLICATION_ID, kubernetesConf.appId),
    +          // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    +          (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    +          (ENV_EXECUTOR_ID, kubernetesConf.executorId)
    +        ) ++ kubernetesConf.environment).map { case (k, v) =>
    +          new EnvVarBuilder()
    +            .withName(k)
    +            .withValue(v)
    +            .build()
             }
    -      }.getOrElse(Seq.empty[EnvVar])
    -    val executorEnv = (Seq(
    -      (ENV_DRIVER_URL, driverUrl),
    -      (ENV_EXECUTOR_CORES, executorCores.toString),
    -      (ENV_EXECUTOR_MEMORY, executorMemoryString),
    -      (ENV_APPLICATION_ID, kubernetesConf.appId),
    -      // This is to set the SPARK_CONF_DIR to be /opt/spark/conf
    -      (ENV_SPARK_CONF_DIR, SPARK_CONF_DIR_INTERNAL),
    -      (ENV_EXECUTOR_ID, kubernetesConf.executorId)) ++
    -      kubernetesConf.environment)
    -      .map(env => new EnvVarBuilder()
    -        .withName(env._1)
    -        .withValue(env._2)
    -        .build()
    -      ) ++ Seq(
    -      new EnvVarBuilder()
    -        .withName(ENV_EXECUTOR_POD_IP)
    -        .withValueFrom(new EnvVarSourceBuilder()
    -          .withNewFieldRef("v1", "status.podIP")
    +      } ++ {
    +        Seq(new EnvVarBuilder()
    +          .withName(ENV_EXECUTOR_POD_IP)
    +          .withValueFrom(new EnvVarSourceBuilder()
    +            .withNewFieldRef("v1", "status.podIP")
    +            .build())
               .build())
    -        .build()
    -    ) ++ executorExtraJavaOptionsEnv ++ executorExtraClasspathEnv.toSeq
    +      } ++ {
    +        Option(secMgr.getSecretKey()).map { authSecret =>
    +          new EnvVarBuilder()
    +            .withName(SecurityManager.ENV_AUTH_SECRET)
    +            .withValue(authSecret)
    --- End diff --
    
    I filed https://issues.apache.org/jira/browse/SPARK-26301 to suggest the alternative scheme. Unlike SPARK-26139 this would change the functionality that was merged here.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #23174: [SPARK-26194][k8s] Auto generate auth secret for k8s app...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/23174
  
    >  if the secret would be listed under the environment variables in the Spark UI 
    
    Secrets are redacted in the UI and event logs. We already use env variables in other contexts (e.g. standalone with auth enabled).
    
    Environment variables don't leak unless you leak them. If you do, it's a security problem in your code, since the env is generally considered "sensitive information". They're not written to disk, unlike files, which some people have problems with (really paranoid orgs don't want sensitive information in unencrypted files on disk).
    
    This could be stashed in a k8s secret, but then how does the client mode driver get it? More user configuration? That's exactly what this is trying to avoid.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org