Posted to dev@storm.apache.org by omkreddy <gi...@git.apache.org> on 2017/05/04 13:32:24 UTC

[GitHub] storm pull request #2099: STORM-2482: [WIP] Auto populate Hive Credentials

GitHub user omkreddy opened a pull request:

    https://github.com/apache/storm/pull/2099

    STORM-2482:  [WIP] Auto populate Hive Credentials

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/omkreddy/storm AutoHiveNewBranch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2099.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2099
    
----
commit f391d0080794ee714362551ba8f9081af4a0d1f1
Author: Manikumar Reddy O <ma...@gmail.com>
Date:   2017-04-25T11:54:06Z

    Auto populate Hive Credentials

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #2099: STORM-2501: Auto populate Hive Credentials

Posted by omkreddy <gi...@git.apache.org>.
Github user omkreddy commented on the issue:

    https://github.com/apache/storm/pull/2099
  
    @HeartSaVioR @harshach @arunmahadevan Please review the PR.
    
    The AutoHive code contacts the Hive MetaStore to create the delegation tokens, which are then passed to the workers. Since Hive depends on HDFS, we also need to configure Hadoop delegation tokens.
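    
    A minimal sketch of the topology-side setting this implies, assuming the AutoHive and AutoHDFS class names used in this PR's README changes (Nimbus needs the matching plugin and renewer settings as well):
    
    ```
    # Topology configuration: request both Hive and HDFS delegation tokens,
    # since Hive depends on HDFS.
    topology.auto-credentials: ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    ```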



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r117150582
  
    --- Diff: external/storm-hive/README.md ---
    @@ -99,8 +99,85 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
                     	     		
        StateFactory factory = new HiveStateFactory().withOptions(hiveOptions);
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
    - ```
    +```
        
    +
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify
    +hive configs using the methods HiveOptions.withKerberosKeytab(), HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation tokens.
    +
    +More details about Hadoop Tokens here: https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +The nimbus should be started with following configurations:
    +
    +```
    +nimbus.autocredential.plugins.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive super user that can impersonate other users.)
    +hive.kerberos.principal: "superuser@EXAMPLE.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superuser@EXAMPLE.com"
    +```
    +
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, hdfs-site.xml and hive-site.xml) to the classpath, you could specify the configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or -c option while submitting the topology),
    +
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters you want to fetch the tokens from)
    +"cluster1": {"config1": "value1", "config2": "value2", ... } (A map of config key-values specific to cluster1)
    +"cluster2": {"config1": "value1", "hive.keytab.file": "/path/to/keytab/for/cluster2/on/nimubs", "hive.kerberos.principal": "cluster2user@EXAMPLE.com", "hive.metastore.uris": "thrift://server:9083"} (here along with other configs, we have custom keytab and principal for "cluster2" which will override the keytab/principal specified at topology level)
    +
    +hdfsCredentialsConfigKeys : ["cluster1", "cluster2"] (the hdfs clusters you want to fetch the tokens from)
    --- End diff --
    
    @omkreddy 
    Do we need to use the same cluster names here as in hiveCredentialsConfigKeys?
    If so, it looks like each cluster's key-value map should contain both the Hadoop config KVs and the Hive config KVs.
    If not, let's use different names.



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2099



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r116670018
  
    --- Diff: external/storm-hive/README.md ---
    @@ -101,6 +101,80 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
      ```
        
    +   
    +      
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify a 
    +hive configs using the methods HiveOptions.withKerberosKeytab(), HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation tokens.The nimbus should be started with following configurations:
    +
    +More details about Hadoop Tokens here: https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +```
    +nimbus.autocredential.plugins.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive super user that can impersonate other users.)
    +hive.kerberos.principal: "superuser@EXAMPLE.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superuser@EXAMPLE.com" 
    +```
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, hdfs-site.xml and hive-site.xml) to the classpath, you could specify the configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or -c option while submitting the topology),
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of config key-values specific to cluster1)
    +cluster2: [{"config1": "value1", "hive.keytab.file": "/path/to/keytab/for/cluster2/on/nimubs", "hive.kerberos.principal": "cluster2user@EXAMPLE.com", "hive.metastore.uris": "thrift://server:9083"}] (here along with other configs, we have custom keytab and principal for "cluster2" which will override the keytab/principal specified at topology level)
    +
    +hdfsCredentialsConfigKeys : ["cluster1", "cluster2"] (the hdfs clusters you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of config key-values specific to cluster1)
    --- End diff --
    
    The cluster value should be a map. Take a look at the HDFS and HBase docs, which were fixed recently.
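    
    For reference, a map-valued form of these entries might look like the sketch below. The cluster names and the config1/config2 placeholders come from the diff above and are not real settings; each value is a plain map rather than a list wrapping a map.
    
    ```
    hiveCredentialsConfigKeys: ["cluster1", "cluster2"]
    # Each cluster key maps directly to a map of config key-values.
    "cluster1": {"config1": "value1", "config2": "value2"}
    "cluster2": {"config1": "value1", "hive.metastore.uris": "thrift://server:9083"}
    ```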



[GitHub] storm issue #2099: STORM-2482: [WIP] Auto populate Hive Credentials

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/2099
  
    @omkreddy 
    It might be better to file a new issue and change the title accordingly, since I'll mark STORM-2482 as resolved.



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r116669975
  
    --- Diff: external/storm-hive/README.md ---
    @@ -101,6 +101,80 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
      ```
        
    +   
    +      
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify a 
    +hive configs using the methods HiveOptions.withKerberosKeytab(), HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation tokens.The nimbus should be started with following configurations:
    +
    +More details about Hadoop Tokens here: https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +```
    +nimbus.autocredential.plugins.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive super user that can impersonate other users.)
    +hive.kerberos.principal: "superuser@EXAMPLE.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superuser@EXAMPLE.com" 
    +```
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, hdfs-site.xml and hive-site.xml) to the classpath, you could specify the configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or -c option while submitting the topology),
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of config key-values specific to cluster1)
    +cluster2: [{"config1": "value1", "hive.keytab.file": "/path/to/keytab/for/cluster2/on/nimubs", "hive.kerberos.principal": "cluster2user@EXAMPLE.com", "hive.metastore.uris": "thrift://server:9083"}] (here along with other configs, we have custom keytab and principal for "cluster2" which will override the keytab/principal specified at topology level)
    --- End diff --
    
    The cluster value should be a map. Take a look at the HDFS and HBase docs, which were fixed recently.



[GitHub] storm issue #2099: STORM-2501: Auto populate Hive Credentials

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/2099
  
    +1 LGTM



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by omkreddy <gi...@git.apache.org>.
Github user omkreddy commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r117168393
  
    --- Diff: external/storm-hive/README.md ---
    @@ -99,8 +99,85 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
                     	     		
        StateFactory factory = new HiveStateFactory().withOptions(hiveOptions);
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
    - ```
    +```
        
    +
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify
    +hive configs using the methods HiveOptions.withKerberosKeytab(), HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation tokens.
    +
    +More details about Hadoop Tokens here: https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +The nimbus should be started with following configurations:
    +
    +```
    +nimbus.autocredential.plugins.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive super user that can impersonate other users.)
    +hive.kerberos.principal: "superuser@EXAMPLE.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superuser@EXAMPLE.com"
    +```
    +
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, hdfs-site.xml and hive-site.xml) to the classpath, you could specify the configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or -c option while submitting the topology),
    +
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters you want to fetch the tokens from)
    +"cluster1": {"config1": "value1", "config2": "value2", ... } (A map of config key-values specific to cluster1)
    +"cluster2": {"config1": "value1", "hive.keytab.file": "/path/to/keytab/for/cluster2/on/nimubs", "hive.kerberos.principal": "cluster2user@EXAMPLE.com", "hive.metastore.uris": "thrift://server:9083"} (here along with other configs, we have custom keytab and principal for "cluster2" which will override the keytab/principal specified at topology level)
    +
    +hdfsCredentialsConfigKeys : ["cluster1", "cluster2"] (the hdfs clusters you want to fetch the tokens from)
    --- End diff --
    
    @HeartSaVioR We can set different cluster names. I've updated the docs.



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r116669958
  
    --- Diff: external/storm-hive/README.md ---
    @@ -101,6 +101,80 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
      ```
        
    +   
    +      
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify a 
    +hive configs using the methods HiveOptions.withKerberosKeytab(), HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation tokens.The nimbus should be started with following configurations:
    +
    +More details about Hadoop Tokens here: https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +```
    +nimbus.autocredential.plugins.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive super user that can impersonate other users.)
    +hive.kerberos.principal: "superuser@EXAMPLE.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superuser@EXAMPLE.com" 
    +```
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, hdfs-site.xml and hive-site.xml) to the classpath, you could specify the configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or -c option while submitting the topology),
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of config key-values specific to cluster1)
    --- End diff --
    
    The cluster value should be a map. Take a look at the HDFS and HBase docs, which were fixed recently.



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r116670032
  
    --- Diff: external/storm-hive/README.md ---
    @@ -101,6 +101,80 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
      ```
        
    +   
    +      
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify a 
    +hive configs using the methods HiveOptions.withKerberosKeytab(), HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation tokens.The nimbus should be started with following configurations:
    +
    +More details about Hadoop Tokens here: https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +```
    +nimbus.autocredential.plugins.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : ["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive super user that can impersonate other users.)
    +hive.kerberos.principal: "superuser@EXAMPLE.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superuser@EXAMPLE.com" 
    +```
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", "org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, hdfs-site.xml and hive-site.xml) to the classpath, you could specify the configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or -c option while submitting the topology),
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of config key-values specific to cluster1)
    +cluster2: [{"config1": "value1", "hive.keytab.file": "/path/to/keytab/for/cluster2/on/nimubs", "hive.kerberos.principal": "cluster2user@EXAMPLE.com", "hive.metastore.uris": "thrift://server:9083"}] (here along with other configs, we have custom keytab and principal for "cluster2" which will override the keytab/principal specified at topology level)
    +
    +hdfsCredentialsConfigKeys : ["cluster1", "cluster2"] (the hdfs clusters you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of config key-values specific to cluster1)
    +cluster2: [{"config1": "value1", "hdfs.keytab.file": "/path/to/keytab/for/cluster2/on/nimubs", "hdfs.kerberos.principal": "cluster2user@EXAMPLE.com"}] (here along with other configs, we have custom keytab and principal for "cluster2" which will override the keytab/principal specified at topology level)
    --- End diff --
    
    The cluster value should be a map. Take a look at the HDFS and HBase docs, which were fixed recently.



[GitHub] storm issue #2099: STORM-2501: Auto populate Hive Credentials

Posted by omkreddy <gi...@git.apache.org>.
Github user omkreddy commented on the issue:

    https://github.com/apache/storm/pull/2099
  
    @arunmahadevan Thanks for the review. Updated the PR.



[GitHub] storm pull request #2099: STORM-2501: Auto populate Hive Credentials

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r116669778
  
    --- Diff: external/storm-hive/README.md ---
    @@ -101,6 +101,80 @@ Hive Trident state also follows similar pattern to HiveBolt it takes in HiveOpti
        TridentState state = stream.partitionPersist(factory, hiveFields, new HiveUpdater(), new Fields());
      ```
        
    +   
    +      
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential worker hosts then you can use this method. You should specify a 
    --- End diff --
    
    specify a -> specify

