You are viewing a plain text version of this content. The canonical link for it is here.
Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2022/01/03 16:54:05 UTC

[GitHub] [hive] abstractdog opened a new pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

abstractdog opened a new pull request #2911:
URL: https://github.com/apache/hive/pull/2911


   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description, screenshot and/or a reproducable example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Hive versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on pull request #2911:
URL: https://github.com/apache/hive/pull/2911#issuecomment-1008746269


   can I get a review on this @nareshpr , @vihangk1 , @deniskuzZ , @szlta 
   tested on cluster, customer is waiting for the fix


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r781167272



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java
##########
@@ -350,6 +351,16 @@ protected void openInternal(String[] additionalFilesNotFromConf,
 
     setupSessionAcls(tezConfig, conf);
 
+    /*
+     * Update HADOOP_CREDSTORE_PASSWORD for the TezAM.
+     * If there is a job specific credential store, it will be set.
+     * HiveConfUtil.updateJobCredentialProviders should not be used here,
+     * as it changes the credential store path too, which causes the dag submission fail,
+     * as this config has an effect in HS2 (on TezClient codepath), and the original hadoop
+     * credential store should be used.
+     */
+    HiveConfUtil.updateCredentialProviderPasswordForJobs(tezConfig);

Review comment:
       Do you know if that information won't be localized by Tez in plain text?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r777886508



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -199,16 +199,16 @@ public static void updateJobCredentialProviders(Configuration jobConf) {
     if (credstorePassword != null) {
       String execEngine = jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname);
 
-      if ("mr".equalsIgnoreCase(execEngine)) {
-        // if the execution engine is MR set the map/reduce env with the credential store password
+      if ("mr".equalsIgnoreCase(execEngine) || "tez".equalsIgnoreCase(execEngine)) {
+        // if the execution engine is MR/Tez set the map/reduce env with the credential store password
 
         Collection<String> redactedProperties =
             jobConf.getStringCollection(MRJobConfig.MR_JOB_REDACTED_PROPERTIES);
-
         Stream.of(
             JobConf.MAPRED_MAP_TASK_ENV,
             JobConf.MAPRED_REDUCE_TASK_ENV,
-            MRJobConfig.MR_AM_ADMIN_USER_ENV)
+            MRJobConfig.MR_AM_ADMIN_USER_ENV,
+            "tez.am.launch.env")

Review comment:
       thanks @nareshpr, I'll double-check, I tested this patch on a cluster as a POC, and I saw a simple map/reduce dag worked
   also, I'm going to decide how to use this as referring to TezConfiguration.TEZ_AM_LAUNCH_ENV, which is not visible currently in hive-common




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r781552671



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -195,20 +196,31 @@ public static void updateJobCredentialProviders(Configuration jobConf) {
           + " previous location was " + oldKeyStoreLocation);
     }
 
+    updateCredentialProviderPasswordForJobs(jobConf);
+  }
+
+  public static void updateCredentialProviderPasswordForJobs(Configuration jobConf) {
     String credstorePassword = getJobCredentialProviderPassword(jobConf);
     if (credstorePassword != null) {
       String execEngine = jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname);
 
-      if ("mr".equalsIgnoreCase(execEngine)) {
-        // if the execution engine is MR set the map/reduce env with the credential store password
-
+      if ("mr".equalsIgnoreCase(execEngine) || "tez".equalsIgnoreCase(execEngine)) {

Review comment:
       I don't think so :) I think we should introduce it someday and remove the usage of this string madness from at least hive-exec




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] deniskuzZ commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
deniskuzZ commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r781148223



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -195,20 +196,31 @@ public static void updateJobCredentialProviders(Configuration jobConf) {
           + " previous location was " + oldKeyStoreLocation);
     }
 
+    updateCredentialProviderPasswordForJobs(jobConf);
+  }
+
+  public static void updateCredentialProviderPasswordForJobs(Configuration jobConf) {
     String credstorePassword = getJobCredentialProviderPassword(jobConf);
     if (credstorePassword != null) {
       String execEngine = jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname);
 
-      if ("mr".equalsIgnoreCase(execEngine)) {
-        // if the execution engine is MR set the map/reduce env with the credential store password
-
+      if ("mr".equalsIgnoreCase(execEngine) || "tez".equalsIgnoreCase(execEngine)) {

Review comment:
       do we have enums for the execution engines?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r781557895



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java
##########
@@ -350,6 +351,16 @@ protected void openInternal(String[] additionalFilesNotFromConf,
 
     setupSessionAcls(tezConfig, conf);
 
+    /*
+     * Update HADOOP_CREDSTORE_PASSWORD for the TezAM.
+     * If there is a job specific credential store, it will be set.
+     * HiveConfUtil.updateJobCredentialProviders should not be used here,
+     * as it changes the credential store path too, which causes the dag submission fail,
+     * as this config has an effect in HS2 (on TezClient codepath), and the original hadoop
+     * credential store should be used.
+     */
+    HiveConfUtil.updateCredentialProviderPasswordForJobs(tezConfig);

Review comment:
       the credential store password appears in the launch-container.sh script, which is created by yarn, and it contains the environment variables, I think this is the same as in case of any other execution engine that's localized by yarn, but I'm sure that launch-container.sh is not included into application logs, so it's only readable of somebody having access to the cluster nodes
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r781552671



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -195,20 +196,31 @@ public static void updateJobCredentialProviders(Configuration jobConf) {
           + " previous location was " + oldKeyStoreLocation);
     }
 
+    updateCredentialProviderPasswordForJobs(jobConf);
+  }
+
+  public static void updateCredentialProviderPasswordForJobs(Configuration jobConf) {
     String credstorePassword = getJobCredentialProviderPassword(jobConf);
     if (credstorePassword != null) {
       String execEngine = jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname);
 
-      if ("mr".equalsIgnoreCase(execEngine)) {
-        // if the execution engine is MR set the map/reduce env with the credential store password
-
+      if ("mr".equalsIgnoreCase(execEngine) || "tez".equalsIgnoreCase(execEngine)) {

Review comment:
       I don't think so :) I think we should introduce it someday and remove the usage of this string madness from at least hive-exec
   I hope it's fine to go with this in the scope of this fix




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on pull request #2911:
URL: https://github.com/apache/hive/pull/2911#issuecomment-1009749956


   test failures are due to https://issues.apache.org/jira/browse/HIVE-25859


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r781557895



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java
##########
@@ -350,6 +351,16 @@ protected void openInternal(String[] additionalFilesNotFromConf,
 
     setupSessionAcls(tezConfig, conf);
 
+    /*
+     * Update HADOOP_CREDSTORE_PASSWORD for the TezAM.
+     * If there is a job specific credential store, it will be set.
+     * HiveConfUtil.updateJobCredentialProviders should not be used here,
+     * as it changes the credential store path too, which causes the dag submission fail,
+     * as this config has an effect in HS2 (on TezClient codepath), and the original hadoop
+     * credential store should be used.
+     */
+    HiveConfUtil.updateCredentialProviderPasswordForJobs(tezConfig);

Review comment:
       the credential store password appears in the launch-container.sh script, which is created by yarn, and it contains the environment variables, I think this should be the same as in case of any other execution engines that's localized by yarn, but I'm sure that launch-container.sh is not included into application logs, so it's only readable of somebody having access to the cluster nodes
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on pull request #2911:
URL: https://github.com/apache/hive/pull/2911#issuecomment-1010274209


   > I would expect to add the TEZ case as part of https://github.com/apache/hive/blob/master/ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java
   > 
   > This can also be done as a follow up -- changes look straightforward to me
   
   absolutely makes sense, added corresponding cases to the existing unit tests
   if it passes in precommit, I'll merge this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] nareshpr commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
nareshpr commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r777712804



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -199,16 +199,16 @@ public static void updateJobCredentialProviders(Configuration jobConf) {
     if (credstorePassword != null) {
       String execEngine = jobConf.get(ConfVars.HIVE_EXECUTION_ENGINE.varname);
 
-      if ("mr".equalsIgnoreCase(execEngine)) {
-        // if the execution engine is MR set the map/reduce env with the credential store password
+      if ("mr".equalsIgnoreCase(execEngine) || "tez".equalsIgnoreCase(execEngine)) {
+        // if the execution engine is MR/Tez set the map/reduce env with the credential store password
 
         Collection<String> redactedProperties =
             jobConf.getStringCollection(MRJobConfig.MR_JOB_REDACTED_PROPERTIES);
-
         Stream.of(
             JobConf.MAPRED_MAP_TASK_ENV,
             JobConf.MAPRED_REDUCE_TASK_ENV,
-            MRJobConfig.MR_AM_ADMIN_USER_ENV)
+            MRJobConfig.MR_AM_ADMIN_USER_ENV,
+            "tez.am.launch.env")

Review comment:
       Is this credential required for tez.task.launch.env ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on pull request #2911:
URL: https://github.com/apache/hive/pull/2911#issuecomment-1010217059


   can I get approval or rejection on this one? looks like there is an extreme customer push on this case
   
   test failures are all due to https://issues.apache.org/jira/browse/HIVE-25859


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r782442107



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -244,6 +256,16 @@ public static String getJobCredentialProviderPassword(Configuration conf) {
     return null;
   }
 
+  /**
+   * Sets a "keyName=newKeyValue" pair to a jobConf to a given property.
+   * If the property is empty, is simply inserts keyName=newKeyValue,

Review comment:
       going to fix it before committing




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] pgaref commented on a change in pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
pgaref commented on a change in pull request #2911:
URL: https://github.com/apache/hive/pull/2911#discussion_r782417105



##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java
##########
@@ -244,6 +256,16 @@ public static String getJobCredentialProviderPassword(Configuration conf) {
     return null;
   }
 
+  /**
+   * Sets a "keyName=newKeyValue" pair to a jobConf to a given property.
+   * If the property is empty, is simply inserts keyName=newKeyValue,

Review comment:
       typo: it simply




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org


[GitHub] [hive] abstractdog merged pull request #2911: HIVE-25829: Tez exec mode support for credential provider for jobs

Posted by GitBox <gi...@apache.org>.
abstractdog merged pull request #2911:
URL: https://github.com/apache/hive/pull/2911


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: gitbox-unsubscribe@hive.apache.org
For additional commands, e-mail: gitbox-help@hive.apache.org