Posted to commits@zeppelin.apache.org by zj...@apache.org on 2020/11/11 05:15:23 UTC

[zeppelin] branch branch-0.9 updated: [ZEPPELIN-5126]. Allow user to specify spark.yarn.keytab and spark.yarn.principal for user impersonation

This is an automated email from the ASF dual-hosted git repository.

zjffdu pushed a commit to branch branch-0.9
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/branch-0.9 by this push:
     new a91dfc8  [ZEPPELIN-5126]. Allow user to specify spark.yarn.keytab and spark.yarn.principal for user impersonation
a91dfc8 is described below

commit a91dfc89f1023db8bc8cd72d1670c6a24905c560
Author: Jeff Zhang <zj...@apache.org>
AuthorDate: Tue Nov 10 10:26:07 2020 +0800

    [ZEPPELIN-5126]. Allow user to specify spark.yarn.keytab and spark.yarn.principal for user impersonation
    
    ### What is this PR for?
    
    This improves Spark Kerberos support in both the non-impersonation and the user impersonation cases.
    Users only need to specify either `zeppelin.server.kerberos.keytab` and `zeppelin.server.kerberos.principal` in zeppelin-site.xml, or the standard Spark settings `spark.yarn.keytab` and `spark.yarn.principal` in the Spark interpreter setting.
    
    ### What type of PR is it?
    [Improvement]
    
    ### Todos
    * [ ] - Task
    
    ### What is the Jira issue?
    * https://issues.apache.org/jira/browse/ZEPPELIN-5126
    
    ### How should this be tested?
    * CI pass
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? No
    
    Author: Jeff Zhang <zj...@apache.org>
    
    Closes #3967 from zjffdu/ZEPPELIN-5126 and squashes the following commits:
    
    3fc83ee7b [Jeff Zhang] [ZEPPELIN-5126]. Allow user to specify spark.yarn.keytab and spark.yarn.principal for user impersonation
    
    (cherry picked from commit aff90f46c86bc5a4a986d984018a9d27f6319a40)
    Signed-off-by: Jeff Zhang <zj...@apache.org>
---
 docs/interpreter/spark.md                              | 18 ++++++++++++++----
 .../interpreter/launcher/SparkInterpreterLauncher.java | 17 ++++++++++-------
 .../launcher/SparkInterpreterLauncherTest.java         | 10 ++++------
 3 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/docs/interpreter/spark.md b/docs/interpreter/spark.md
index 537ae60..105cc74 100644
--- a/docs/interpreter/spark.md
+++ b/docs/interpreter/spark.md
@@ -456,6 +456,20 @@ e.g.
 Zeppelin automatically injects `ZeppelinContext` as variable `z` in your Scala/Python environment. `ZeppelinContext` provides some additional functions and utilities.
 See [Zeppelin-Context](../usage/other_features/zeppelin_context.html) for more details.
 
+## Setting up Zeppelin with Kerberos
+Logical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:
+
+<img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/kdc_zeppelin.png">
+
+There're several ways to make spark work with kerberos enabled hadoop cluster in Zeppelin. 
+
+1. Share one single hadoop cluster.
+In this case you just need to specify `zeppelin.server.kerberos.keytab` and `zeppelin.server.kerberos.principal` in zeppelin-site.xml, Spark interpreter will use these setting by default.
+
+2. Work with multiple hadoop clusters.
+In this case you can specify `spark.yarn.keytab` and `spark.yarn.principal` to override `zeppelin.server.kerberos.keytab` and `zeppelin.server.kerberos.principal`.
+
+
 ## User Impersonation
 
 In yarn mode, the user who launch the zeppelin server will be used to launch the spark yarn application. This is not a good practise.
@@ -482,10 +496,6 @@ you need to enable user impersonation for more security control. In order the en
 impersonate in `zeppelin-site.xml`.
 
 
-## Setting up Zeppelin with Kerberos
-Logical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:
-
-<img src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/kdc_zeppelin.png">
 
 ## Deprecate Spark 2.2 and earlier versions
 Starting from 0.9, Zeppelin deprecate Spark 2.2 and earlier versions. So you will see a warning message when you use Spark 2.2 and earlier.
diff --git a/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java b/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java
index dba9b03..e25e7da 100644
--- a/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java
+++ b/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java
@@ -163,12 +163,14 @@ public class SparkInterpreterLauncher extends StandardInterpreterLauncher {
       }
     }
 
-    for (String name : sparkProperties.stringPropertyNames()) {
-      sparkConfBuilder.append(" --conf " + name + "=" + sparkProperties.getProperty(name));
-    }
-
     if (context.getOption().isUserImpersonate() && zConf.getZeppelinImpersonateSparkProxyUser()) {
       sparkConfBuilder.append(" --proxy-user " + context.getUserName());
+      sparkProperties.remove("spark.yarn.keytab");
+      sparkProperties.remove("spark.yarn.principal");
+    }
+
+    for (String name : sparkProperties.stringPropertyNames()) {
+      sparkConfBuilder.append(" --conf " + name + "=" + sparkProperties.getProperty(name));
     }
 
     env.put("ZEPPELIN_SPARK_CONF", sparkConfBuilder.toString());
@@ -185,9 +187,10 @@ public class SparkInterpreterLauncher extends StandardInterpreterLauncher {
       }
     }
 
-    String keytab = zConf.getString(ZeppelinConfiguration.ConfVars.ZEPPELIN_SERVER_KERBEROS_KEYTAB);
-    String principal =
-        zConf.getString(ZeppelinConfiguration.ConfVars.ZEPPELIN_SERVER_KERBEROS_PRINCIPAL);
+    String keytab = properties.getProperty("spark.yarn.keytab",
+            zConf.getString(ZeppelinConfiguration.ConfVars.ZEPPELIN_SERVER_KERBEROS_KEYTAB));
+    String principal = properties.getProperty("spark.yarn.principal",
+            zConf.getString(ZeppelinConfiguration.ConfVars.ZEPPELIN_SERVER_KERBEROS_PRINCIPAL));
 
     if (!StringUtils.isBlank(keytab) && !StringUtils.isBlank(principal)) {
       env.put("ZEPPELIN_SERVER_KERBEROS_KEYTAB", keytab);
diff --git a/zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncherTest.java b/zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncherTest.java
index 5195710..6aff86a 100644
--- a/zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncherTest.java
+++ b/zeppelin-zengine/src/test/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncherTest.java
@@ -253,14 +253,13 @@ public class SparkInterpreterLauncherTest {
             zeppelinHome + "/interpreter/zeppelin-interpreter-shaded-" + Util.getVersion() + ".jar";
     String sparkrZip = sparkHome + "/R/lib/sparkr.zip#sparkr";
     String sparkFiles = "file_1," + zeppelinHome + "/conf/log4j_yarn_cluster.properties";
-    assertEquals(" --conf spark.yarn.dist.archives=" + sparkrZip +
+    assertEquals(" --proxy-user user1 --conf spark.yarn.dist.archives=" + sparkrZip +
             " --conf spark.yarn.isPython=true --conf spark.app.name=intpGroupId" +
             " --conf spark.yarn.maxAppAttempts=1" +
             " --conf spark.master=yarn" +
             " --conf spark.files=" + sparkFiles + " --conf spark.jars=" + sparkJars +
             " --conf spark.submit.deployMode=cluster" +
-            " --conf spark.yarn.submit.waitAppCompletion=false" +
-            " --proxy-user user1",
+            " --conf spark.yarn.submit.waitAppCompletion=false",
             interpreterProcess.getEnv().get("ZEPPELIN_SPARK_CONF"));
     Files.deleteIfExists(Paths.get(localRepoPath.toAbsolutePath().toString(), "test.jar"));
     FileUtils.deleteDirectory(localRepoPath.toFile());
@@ -302,15 +301,14 @@ public class SparkInterpreterLauncherTest {
     String sparkrZip = sparkHome + "/R/lib/sparkr.zip#sparkr";
     // escape special characters
     String sparkFiles = "{}," + zeppelinHome + "/conf/log4j_yarn_cluster.properties";
-    assertEquals(" --conf spark.yarn.dist.archives=" + sparkrZip +
+    assertEquals(" --proxy-user user1 --conf spark.yarn.dist.archives=" + sparkrZip +
                     " --conf spark.yarn.isPython=true" +
                     " --conf spark.app.name=intpGroupId" +
                     " --conf spark.yarn.maxAppAttempts=1" +
                     " --conf spark.master=yarn" +
                     " --conf spark.files=" + sparkFiles + " --conf spark.jars=" + sparkJars +
                     " --conf spark.submit.deployMode=cluster" +
-                    " --conf spark.yarn.submit.waitAppCompletion=false" +
-                    " --proxy-user user1",
+                    " --conf spark.yarn.submit.waitAppCompletion=false",
             interpreterProcess.getEnv().get("ZEPPELIN_SPARK_CONF"));
     FileUtils.deleteDirectory(localRepoPath.toFile());
   }
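
The launcher change above can be summarized with a small standalone sketch. It is not the Zeppelin code: the class `KerberosConfSketch` and its helpers `buildSparkConf` and `resolve` are hypothetical names, and the property names are sorted only so that the output is deterministic (the real launcher iterates `stringPropertyNames()` directly). It illustrates two behaviours visible in the diff: `spark.yarn.keytab` and `spark.yarn.principal` are dropped from the `--conf` list when `--proxy-user` is added, and the interpreter-level `spark.yarn.*` values take precedence over the server-level keytab and principal. The commit does not state why the two properties are removed; presumably it is because spark-submit does not accept a proxy user together with a principal.

```java
import java.util.Properties;
import java.util.TreeSet;

public class KerberosConfSketch {

  // Hypothetical helper: builds the spark-submit options string.
  // With impersonation, --proxy-user is emitted first and the keytab and
  // principal are removed before the --conf entries are written.
  static String buildSparkConf(Properties sparkProperties, boolean impersonate, String userName) {
    StringBuilder builder = new StringBuilder();
    // Work on a copy so the caller's properties are left untouched.
    Properties props = new Properties();
    props.putAll(sparkProperties);
    if (impersonate) {
      builder.append(" --proxy-user ").append(userName);
      props.remove("spark.yarn.keytab");
      props.remove("spark.yarn.principal");
    }
    // Sorted here for a stable order; this is an assumption of the sketch.
    for (String name : new TreeSet<>(props.stringPropertyNames())) {
      builder.append(" --conf ").append(name).append("=").append(props.getProperty(name));
    }
    return builder.toString();
  }

  // Hypothetical helper: the interpreter setting wins, the server-level
  // value is used only when the interpreter property is absent.
  static String resolve(Properties interpreterProperties, String key, String serverDefault) {
    return interpreterProperties.getProperty(key, serverDefault);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("spark.master", "yarn");
    props.setProperty("spark.yarn.keytab", "/tmp/user.keytab");
    props.setProperty("spark.yarn.principal", "user@EXAMPLE.COM");

    System.out.println(buildSparkConf(props, true, "user1"));
    System.out.println(buildSparkConf(props, false, "user1"));
    System.out.println(resolve(props, "spark.yarn.keytab", "/etc/zeppelin/server.keytab"));
  }
}
```

Emitting `--proxy-user` before the `--conf` entries matches the updated expectations in `SparkInterpreterLauncherTest`, where the option moved from the end of `ZEPPELIN_SPARK_CONF` to the front.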