You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Vihang Karajgaonkar <vi...@cloudera.com> on 2016/09/28 00:15:48 UTC

Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

Review request for hive, Mohit Sabharwal and Sergio Pena.


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs
-----

  common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java ce1106d91db9ef75e7b425d5950f888bacbfb3e5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.

> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java, line 25
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523416#file1523416line25>
> >
> >     Is this actually mocked anywhere in the tests ? I see the tests mock the env via a HashMap.
> >     
> >     If the use of this interface is just to mock, I'm not sure if there is a good reason to use in the non-test code. Could we just directly use System.getenv there ? i.e. we can get rid of this file altogether.

Initially I had thought of using this interface to mock Environment variables. But found a easier alternative later for tests. Removed it.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, line 140
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523420#file1523420line140>
> >
> >     "needs to be " -> "is" ?

What I tried to convey is the password file needs to be readable by the consumers. Changed it to "which needs to be readable by all consumers" to be more clear. Let me know if you have any other suggestions


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, lines 146-151
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523420#file1523420line146>
> >
> >     Could you make the following clear:
> >     
> >     Either:
> >     (1) HIVE_SERVER2_JOB_CREDSTORE_LOCATION should be configured AND HIVE_JOB_CREDSTORE_PASSWORD environment should be set.
> >     OR
> >     (2) HADOOP_CREDSTORE_PASSWORD environment should be set.
> >     
> >     IOW It's important to state that HIVE_SERVER2_JOB_CREDSTORE_LOCATION and HADOOP_CREDSTORE_PASSWORD do not work together.

HIVE_SERVER2_JOB_CREDSTORE_LOCATION does work with HADOOP_CREDSTORE_PASSWORD if the user wishes to use it that way. The code actually uses HIVE_JOB_CREDSTORE_PASSWORD if it is set otherwise uses HADOOP_CREDSTORE_PASSWORD. It checks for HIVE_JOB_CREDSTORE_PASSWORD only when HIVE_SERVER2_JOB_CREDSTORE_LOCATION is set. If both HADOOP_CREDSTORE_PASSWORD and HIVE_JOB_CREDSTORE_PASSWORD are not set, it is still a valid configuration since Hadoop credential provider will use the default password of "none" in that case. I have changed the wording to explicitly convey this. Hopefully it is clearer now.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, line 179
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523420#file1523420line179>
> >
> >     If both HIVE_JOB_CREDSTORE_PASSWORD and HADOOP_CREDSTORE_PASSWORD are not set, what happens ? Should we throw an exception ?
> >     
> >     What if HIVE_JOB_CREDSTORE_PASSWORD and HADOOP_CREDSTORE_PASSWORD are not set, but HIVE_SERVER2_JOB_CREDSTORE_LOCATION is set ?

It is still a valid configuration since hadoop credential provider uses the default password of "none" in that case. I have add that in the method documentation above.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, lines 203-204
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523420#file1523420line203>
> >
> >     Wondering if this should really be a RuntimeException, since query is pretty much hosed at this point, correct ? i.e. rollback the stack and abandon this query with an exception.

You are right. There is no point in continuing since the query will fail anyways. Changed it to throw RuntimeException


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java, lines 201-202
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523423#file1523423line201>
> >
> >     Needs space after //
> >     
> >     What happens to HIVE_SERVER2_JOB_CREDSTORE_LOCATION ? Does that need to be set or checked if it is set ? I mean, do both HIVE_SERVER2_JOB_CREDSTORE_PASSWORD_ENVVAR and  HIVE_SERVER2_JOB_CREDSTORE_LOCATION need to be set for former to be used ?
> >     
> >     It's worth clarifying here how Spark implementation differs from MR. I am assuming this step is in addition to the updateJobCredentialProviders invocation in Spark.

HIVE_SERVER2_JOB_CREDSTORE_PASSWORD_ENVVAR does not need to be set in order for HIVE_SERVER2_JOB_CREDSTORE_LOCATION to work like explained in one of my comments above. The difference in Spark implementation and MR in this case is the environment variables for Spark are set using SparkConf Map in the HiveSparkClientFactory unlike MR. The HIVE_SERVER2_JOB_CREDSTORE_LOCATION is set directly in the jobConfig similar to MR in the execute method of LocalSparkClient and RemoteSparkClient. I have updated the comments to say the same.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, line 188
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523420#file1523420line188>
> >
> >     Is this ok if this is null or empty ?

I did some more investigation I made some changes to the method. It gets rid of the return type since it was not really needed in the first place.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java, lines 422-426
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523422#file1523422line422>
> >
> >     Easier: job.set(Constants.HADOOP_CREDENTIAL_PROVIDER_PATH_CONFIG, origKeystoreLocation == null: "", origKeystoreLocation)
> >     
> >     Wait, shouldn't we set the config before submitting the job ?

This check is removed now since it was not needed in the first place.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java, lines 793-794
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523422#file1523422line793>
> >
> >     are you fixing something that was broken ? JOBCONF_FILENAME was earlier local but you to want to allow it on any fs ?

I wanted to move the jobConf.xml to HDFS from the local filesystem since I thought it would be more secure. But I have now revert these changes since it had some other issues which I discovered later. Also, it is not actually needed since jobConf.xml does not contain anything different than earlier (before this patch). The password are injected in the jobConf in the runtime using environment variables so it is safe to revert back to original implementation of this method.


> On Oct. 6, 2016, 8:56 p.m., Mohit Sabharwal wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java, line 103
> > <https://reviews.apache.org/r/52283/diff/4/?file=1523426#file1523426line103>
> >
> >     Are we testing the case where both HADOOP_CREDENTIAL_PASSWORD_ENVVAR and HIVE_JOB_CREDSTORE_PASSWORD_ENVVAR_VAL are not set ?

I havent yet tested this case, I will test it and add a test case if possible.


- Vihang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review151693
-----------------------------------------------------------


On Oct. 12, 2016, 5:44 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2016, 5:44 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review151693
-----------------------------------------------------------



Thanks for the changes! Couple more questions.

For readability, please add a blank line before every new block.

Hive follows Sun's convention (except uses 100char line limit instead of 80):
  http://web.archive.org/web/20140228225807/http://www.oracle.com/technetwork/java/codeconventions-150003.pdf


common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java (line 25)
<https://reviews.apache.org/r/52283/#comment220191>

    Is this actually mocked anywhere in the tests ? I see the tests mock the env via a HashMap.
    
    If the use of this interface is just to mock, I'm not sure if there is a good reason to use in the non-test code. Could we just directly use System.getenv there ? i.e. we can get rid of this file altogether.



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 2524)
<https://reviews.apache.org/r/52283/#comment220140>

    run using MR or Spark execution engines. This functionality has not been tested with Tez.



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 140)
<https://reviews.apache.org/r/52283/#comment220142>

    "needs to be " -> "is" ?



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 141)
<https://reviews.apache.org/r/52283/#comment220144>

    nit: "therefore not supported"



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 144)
<https://reviews.apache.org/r/52283/#comment220145>

    nit: "job" -> "MR Job"



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 146 - 151)
<https://reviews.apache.org/r/52283/#comment220148>

    Could you make the following clear:
    
    Either:
    (1) HIVE_SERVER2_JOB_CREDSTORE_LOCATION should be configured AND HIVE_JOB_CREDSTORE_PASSWORD environment should be set.
    OR
    (2) HADOOP_CREDSTORE_PASSWORD environment should be set.
    
    IOW It's important to state that HIVE_SERVER2_JOB_CREDSTORE_LOCATION and HADOOP_CREDSTORE_PASSWORD do not work together.



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 148)
<https://reviews.apache.org/r/52283/#comment220146>

    "this adds" -> "we use"



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 164)
<https://reviews.apache.org/r/52283/#comment220163>

    Please add single blank line before a new block. Same in other places.



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 173 - 175)
<https://reviews.apache.org/r/52283/#comment220192>

    any reason we can use System.getenv directly ? Same in other places



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 179)
<https://reviews.apache.org/r/52283/#comment220151>

    If both HIVE_JOB_CREDSTORE_PASSWORD and HADOOP_CREDSTORE_PASSWORD are not set, what happens ? Should we throw an exception ?
    
    What if HIVE_JOB_CREDSTORE_PASSWORD and HADOOP_CREDSTORE_PASSWORD are not set, but HIVE_SERVER2_JOB_CREDSTORE_LOCATION is set ?



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 188)
<https://reviews.apache.org/r/52283/#comment220150>

    Is this ok if this is null or empty ?



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 203 - 204)
<https://reviews.apache.org/r/52283/#comment220155>

    Wondering if this should really be a RuntimeException, since query is pretty much hosed at this point, correct ? i.e. rollback the stack and abandon this query with an exception.



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java (line 91)
<https://reviews.apache.org/r/52283/#comment220156>

    needed ?



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java (line 98)
<https://reviews.apache.org/r/52283/#comment220158>

    needed ?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java (lines 422 - 426)
<https://reviews.apache.org/r/52283/#comment220165>

    Easier: job.set(Constants.HADOOP_CREDENTIAL_PROVIDER_PATH_CONFIG, origKeystoreLocation == null: "", origKeystoreLocation)
    
    Wait, shouldn't we set the config before submitting the job ?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 
<https://reviews.apache.org/r/52283/#comment220166>

    Blank line ok before new blocks or comments



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java (lines 790 - 791)
<https://reviews.apache.org/r/52283/#comment220167>

    are you fixing something that was broken ? JOBCONF_FILENAME was earlier local but you to want to allow it on any fs ?



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java (lines 201 - 202)
<https://reviews.apache.org/r/52283/#comment220179>

    Needs space after //
    
    What happens to HIVE_SERVER2_JOB_CREDSTORE_LOCATION ? Does that need to be set or checked if it is set ? I mean, do both HIVE_SERVER2_JOB_CREDSTORE_PASSWORD_ENVVAR and  HIVE_SERVER2_JOB_CREDSTORE_LOCATION need to be set for former to be used ?
    
    It's worth clarifying here how Spark implementation differs from MR. I am assuming this step is in addition to the updateJobCredentialProviders invocation in Spark.



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java (line 117)
<https://reviews.apache.org/r/52283/#comment220180>

    //u -> // U...
    
    same in other places



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 20)
<https://reviews.apache.org/r/52283/#comment220181>

    expand all imports



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 84 - 85)
<https://reviews.apache.org/r/52283/#comment220182>

    nit: inline oldCredStoreLocation here and in other places



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 87)
<https://reviews.apache.org/r/52283/#comment220164>

    Please limit lines to max 100 characters. 
    
    Same in other places.



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 90 - 91)
<https://reviews.apache.org/r/52283/#comment220183>

    again, inlined value easier to read



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 103)
<https://reviews.apache.org/r/52283/#comment220184>

    Are we testing the case where both HADOOP_CREDENTIAL_PASSWORD_ENVVAR and HIVE_JOB_CREDSTORE_PASSWORD_ENVVAR_VAL are not set ?



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java (line 447)
<https://reviews.apache.org/r/52283/#comment220185>

    // -> // A...



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java (lines 450 - 454)
<https://reviews.apache.org/r/52283/#comment220190>

    This pattern of choosing the correct password is repeated couple times in this patch. Should we write a single utility method for this ?


- Mohit Sabharwal


On Oct. 5, 2016, 9:58 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2016, 9:58 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review152739
-----------------------------------------------------------


Ship it!




Ship It!

- Mohit Sabharwal


On Oct. 14, 2016, 6:32 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 14, 2016, 6:32 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 18b98e9f28d363611da9b150ed9480cf350220cf 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   common/src/java/org/apache/hive/common/util/HiveStringUtils.java 507e369bbbae55cf2e6006209558dbb9112cc684 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

(Updated Oct. 14, 2016, 6:32 p.m.)


Review request for hive, Mohit Sabharwal and Sergio Pena.


Changes
-------

Added review suggestions and renamed the job credential property to hive.server2.job.credential.provider.path so that it is more in line with hadoop credential provider path property name. Created a sub-task HIVE-14962 to add integration tests in a separate change.


Bugs: HIVE-14822
    https://issues.apache.org/jira/browse/HIVE-14822


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 18b98e9f28d363611da9b150ed9480cf350220cf 
  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
  common/src/java/org/apache/hive/common/util/HiveStringUtils.java 507e369bbbae55cf2e6006209558dbb9112cc684 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.

> On Oct. 13, 2016, 4:20 p.m., Mohit Sabharwal wrote:
> > spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java, line 495
> > <https://reviews.apache.org/r/52283/diff/6/?file=1533376#file1533376line495>
> >
> >     static ?

The method uses non-static field conf so cannot change it to static.


> On Oct. 13, 2016, 4:20 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, lines 212-230
> > <https://reviews.apache.org/r/52283/diff/6/?file=1533370#file1533370line212>
> >
> >     I think it is worth isolating this into a utility method in HiveStringUtils, no ? There even might be something there or Utilities.java that you could re-use.

Added a util method in HiveStringUtils for this. It reuses some of the existing methods already there in HiveStringUtils. Thanks for the pointer.


> On Oct. 13, 2016, 4:20 p.m., Mohit Sabharwal wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java, line 187
> > <https://reviews.apache.org/r/52283/diff/6/?file=1533370#file1533370line187>
> >
> >     private

This method is also used in HiveSparkClientFactory which is in a different project so cannot change it to private or package level. Let me know if you any ideas.


- Vihang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review152521
-----------------------------------------------------------


On Oct. 14, 2016, 6:32 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 14, 2016, 6:32 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 18b98e9f28d363611da9b150ed9480cf350220cf 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   common/src/java/org/apache/hive/common/util/HiveStringUtils.java 507e369bbbae55cf2e6006209558dbb9112cc684 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review152521
-----------------------------------------------------------


Fix it, then Ship it!





common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (line 187)
<https://reviews.apache.org/r/52283/#comment221543>

    private



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 212 - 230)
<https://reviews.apache.org/r/52283/#comment221546>

    I think it is worth isolating this into a utility method in HiveStringUtils, no ? There even might be something there or Utilities.java that you could re-use.



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 191)
<https://reviews.apache.org/r/52283/#comment221547>

    extra line



spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java (line 495)
<https://reviews.apache.org/r/52283/#comment221548>

    static ?


- Mohit Sabharwal


On Oct. 12, 2016, 8:34 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2016, 8:34 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

(Updated Oct. 12, 2016, 8:34 p.m.)


Review request for hive, Mohit Sabharwal and Sergio Pena.


Changes
-------

updated the method documentation


Bugs: HIVE-14822
    https://issues.apache.org/jira/browse/HIVE-14822


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

(Updated Oct. 12, 2016, 5:44 p.m.)


Review request for hive, Mohit Sabharwal and Sergio Pena.


Changes
-------

Added Mohit's suggestions. Modified existing implementation to make it simpler. Removed unnecessary code changes.


Bugs: HIVE-14822
    https://issues.apache.org/jira/browse/HIVE-14822


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

(Updated Oct. 5, 2016, 9:58 p.m.)


Review request for hive, Mohit Sabharwal and Sergio Pena.


Changes
-------

Added Mohit's suggestions.


Bugs: HIVE-14822
    https://issues.apache.org/jira/browse/HIVE-14822


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.

> On Oct. 5, 2016, 6:47 p.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java, line 422
> > <https://reviews.apache.org/r/52283/diff/3/?file=1516972#file1516972line422>
> >
> >     Check if string is empty?

I wanted to revert back the original value of the credential provider once the job is submitted. checking empty is also not the right thing to do here. The check should be if it is not null we set it, but if it is null we need to unset it from the configuration. I modified it. Thanks for pointing it out.


> On Oct. 5, 2016, 6:47 p.m., Mohit Sabharwal wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java, lines 88-94
> > <https://reviews.apache.org/r/52283/diff/3/?file=1516976#file1516976line88>
> >
> >     Are you testing the same thing twice ?

Its actually checkes against MAP task env and REDUCE task env. Hopefully it is more readable now after I made the changes you suggested.


- Vihang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review151382
-----------------------------------------------------------


On Oct. 5, 2016, 9:58 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2016, 9:58 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review151382
-----------------------------------------------------------



Took a quick pass. Have some questions and suggestions.

Will take a another pass once those are addressed.


common/src/java/org/apache/hadoop/hive/conf/Constants.java (lines 34 - 35)
<https://reviews.apache.org/r/52283/#comment219968>

    Please provide some hint in the name that this is storing name of an environment variable.
    
    For example: using ENVVAR instead of NAME
    
    HIVE_SERVER2_JOB_CREDSTORE_PASSWORD_ENVVAR
    HADOOP_CREDENTIAL_PASSWORD_ENVVAR



common/src/java/org/apache/hadoop/hive/conf/Constants.java (line 36)
<https://reviews.apache.org/r/52283/#comment219969>

    Same. This is name of config and not the path, right ?
    
    HADOOP_CREDENTIAL_PROVIDER_PATH_CONFIG



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 2522)
<https://reviews.apache.org/r/52283/#comment219970>

    Does this work for Spark ? Tez ? Good to explicitly add all engines that are known to support this functionality. If an engine has not been tested, I think it's ok to state that "this functionality has not been tested against <engine>".



common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java (lines 136 - 137)
<https://reviews.apache.org/r/52283/#comment219972>

    Worth adding why environment variables are used instead of hive configs.
    
    Since this is the main method could you summarize everything you are doing ?
    
    For example:
    
    If the driver is configured with HIVE_SERVER2_JOB_CREDSTORE_LOCATION, 
    AND 
    HIVE_JOB_CREDSTORE_PASSWORD environment is set then:
    (1) JobConf.MAPRED_MAP_TASK_ENV should contain HIVE_SERVER2_JOB_CREDSTORE_LOCATION
    (2) Set hadoop.security.credential.provider.path to HIVE_SERVER2_JOB_CREDSTORE_LOCATION
    
    What do you mean by HIVE_JOB_CREDSTORE_PASSWORD
    or HADOOP_CREDSTORE_PASSWORD ? You mean you'll take either one if set ? What if both are set ?



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java (line 422)
<https://reviews.apache.org/r/52283/#comment219705>

    Check if string is empty?



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 35 - 36)
<https://reviews.apache.org/r/52283/#comment219706>

    nit: please prefix with word "test"
    
    nit: HIVE_JOB_CREDSTORE_PASSWORD_ENVVAR_VAL
    HADOOP_CREDSTORE_PASSWORD_ENVVAR_VAL



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 81 - 82)
<https://reviews.apache.org/r/52283/#comment219703>

    I would use assertEquals here and get rid of the message. It's making these statements difficult to read without adding much value. If this assert fails, we know exactly what happened.  
    
    Same for all other instances in this file.



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 88 - 94)
<https://reviews.apache.org/r/52283/#comment219975>

    Are you testing the same thing twice ?



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 109 - 112)
<https://reviews.apache.org/r/52283/#comment219977>

    This is super unread-able... please make this into one line....



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (lines 333 - 335)
<https://reviews.apache.org/r/52283/#comment219974>

    not need based on prior comment


- Mohit Sabharwal


On Sept. 30, 2016, 6:58 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Sept. 30, 2016, 6:58 p.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

(Updated Sept. 30, 2016, 6:58 p.m.)


Review request for hive, Mohit Sabharwal and Sergio Pena.


Changes
-------

Addressed review comments and HiveQA test failures.


Bugs: HIVE-14822
    https://issues.apache.org/jira/browse/HIVE-14822


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Barna Zsombor Klara <zs...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/#review150996
-----------------------------------------------------------



LGTM, only nits and a minor question. Thanks for the patch!


common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java (line 23)
<https://reviews.apache.org/r/52283/#comment219098>

    nit: allows us instead of allows use



ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java (line 49)
<https://reviews.apache.org/r/52283/#comment219101>

    Is JobConf used in this class?
    I can only see the import in the diff.



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 72)
<https://reviews.apache.org/r/52283/#comment219102>

    nit: Can the common initialization be put into an init method annotated with @Before?



ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java (line 148)
<https://reviews.apache.org/r/52283/#comment219103>

    nit: Can we have more meaningful test names than testUpdateCredentialProviders_xx?


- Barna Zsombor Klara


On Sept. 30, 2016, 12:45 a.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52283/
> -----------------------------------------------------------
> 
> (Updated Sept. 30, 2016, 12:45 a.m.)
> 
> 
> Review request for hive, Mohit Sabharwal and Sergio Pena.
> 
> 
> Bugs: HIVE-14822
>     https://issues.apache.org/jira/browse/HIVE-14822
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
>   common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java ce1106d91db9ef75e7b425d5950f888bacbfb3e5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 
> 
> Diff: https://reviews.apache.org/r/52283/diff/
> 
> 
> Testing
> -------
> 
> Testing running multiple queries on S3 with keys stored in a credential provider
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>


Re: Review Request 52283: HIVE-14822 : Add support for credential provider for jobs launched from Hiveserver2

Posted by Vihang Karajgaonkar <vi...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52283/
-----------------------------------------------------------

(Updated Sept. 30, 2016, 12:45 a.m.)


Review request for hive, Mohit Sabharwal and Sergio Pena.


Changes
-------

Added the credential provider support for distcp called from MoveTask


Bugs: HIVE-14822
    https://issues.apache.org/jira/browse/HIVE-14822


Repository: hive-git


Description
-------

This change adds support for credential provider for jobs launched from HiveServer2. It also adds support for job-specific credential provider and password which is passed to the jobs via the job configs when they are launched from HS2. The hive specific credential provider location is specified by a new configuration property specific to hiveserver2 and the password to job credential provider is provided using the environment variable


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/EnvironmentUtils.java PRE-CREATION 
  common/src/java/org/apache/hadoop/hive/common/FileUtils.java 3ed2d086fd8e14b9a70550c02082c1456f905a08 
  common/src/java/org/apache/hadoop/hive/conf/Constants.java 77c6aa5663c2c5358715801bc039c0fe8035e3dc 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 43a16d7fed1d38400793e7c96a7febebe42d351b 
  common/src/java/org/apache/hadoop/hive/conf/HiveConfUtil.java 16c2eafdb6888ed62168dd00f76335fa2484753b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java fd640567124e1bb8b558359732764a07c8b0f2ed 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cea9582c8ccb0c79700ac7913735d4cdf52f0c7e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java ce1106d91db9ef75e7b425d5950f888bacbfb3e5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveSparkClientFactory.java 784b9c9fa769eeb088e3a6408a0b29147a8b4086 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java c75333d7022f776aab184a4b804fd7ad7befac64 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java a9f70c4100219c8929abd4e31b30d340e6ba98bd 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestHiveCredentialProviders.java PRE-CREATION 
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 936fdafdb37c461a7a5deb97cba72d4db54a49e1 

Diff: https://reviews.apache.org/r/52283/diff/


Testing
-------

Testing running multiple queries on S3 with keys stored in a credential provider


Thanks,

Vihang Karajgaonkar