Posted to dev@hive.apache.org by abbaspour <gi...@git.apache.org> on 2017/01/16 03:17:47 UTC

[GitHub] hive pull request #133: Enable webhcat to run JDBC connection for Hive DDL q...

GitHub user abbaspour opened a pull request:

    https://github.com/apache/hive/pull/133

    Enable webhcat to run JDBC connection for Hive DDL queries

    This change lets `HcatDelegator` run Hive DDL queries over a **JDBC** connection instead of through the slow `hcat` command.
    
    The motivation is speed. `HcatDelegator` launches a new `hcat` script for every call, which makes it unsuitable for interactive REST calls.
    
    This change speeds up /ddl queries from the usual 10-20 seconds down to a few milliseconds. No connection pooling is included, to keep the PR small, but it can be added at any time.
    
    Because the queries go over a JDBC connection, this is also secure and compatible with all access policies defined in HiveServer2. A user does not have visibility into other databases (which used to be the case in `hcat` mode).
    
    To switch to JDBC mode, add this configuration to **webhcat-site.xml**:
    
    ```xml
        <property>
          <name>templeton.ddl.mode</name>
          <value>jdbc</value>
        </property>

        <property>
          <name>hive.jdbc.url</name>
          <value>jdbc:hive2://server:port</value>
        </property>
    ```
    
    For secure environments, these properties are also needed in **webhcat-site.xml**:
    
    ```xml
        <property>
          <name>hive.server2.kerberos.keytab</name>
          <value>/etc/security/keytabs/hive.service.keytab</value>
        </property>
    
        <property>
          <name>hive.server2.kerberos.principal</name>
          <value>hive/_HOST@{REALM}</value>
        </property>
    ```
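
    The description does not say how the principal is passed to the JDBC driver. As a rough sketch only: Hive JDBC clients commonly address a Kerberized HiveServer2 by appending a `;principal=` session variable to the URL. The class and method names below are hypothetical and are not taken from the patch.

    ```java
    // Hypothetical helper, not part of the pull request.
    final class SecureJdbcUrlSketch {

        private static final String PREFIX = "jdbc:hive2://";

        // Returns the URL with the HiveServer2 service principal appended as a
        // ";principal=" session variable. A "/" is added first when the URL has
        // no database path, so the session variable follows the path segment.
        static String withPrincipal(String baseUrl, String principal) {
            if (!baseUrl.startsWith(PREFIX)) {
                throw new IllegalArgumentException("not a hive2 JDBC URL: " + baseUrl);
            }
            String afterPrefix = baseUrl.substring(PREFIX.length());
            String url = afterPrefix.contains("/") ? baseUrl : baseUrl + "/";
            return url + ";principal=" + principal;
        }
    }
    ```

    With the `hive.jdbc.url` value shown earlier, this would turn `jdbc:hive2://server:port` into `jdbc:hive2://server:port/;principal=hive/_HOST@{REALM}`.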
    
    This change relies on Hive's JSON output for DDL, so setting `hive.ddl.output.format` must be allowed in **hiveserver2-site.xml**:
    
    ```xml
        <property>
          <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
          <value>hive.ddl.output.format</value>
        </property>
    ```
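
    To illustrate why the whitelist entry matters, here is a minimal sketch of a JDBC client that first switches the session to JSON DDL output and then runs a DDL statement. It is not the code from the patch: the class and method names are hypothetical, it uses only the standard `java.sql` API, and it assumes the Hive JDBC driver is on the classpath and that the output can be read from the first column of each row.

    ```java
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch, not part of the pull request.
    final class JdbcDdlSketch {

        // Statements sent on one session: the SET is only accepted when
        // hive.ddl.output.format is whitelisted on the server side.
        static List<String> sessionStatements(String ddl) {
            List<String> statements = new ArrayList<>();
            statements.add("set hive.ddl.output.format=json");
            statements.add(ddl);
            return statements;
        }

        // Opens a connection, runs the SET and then the DDL, and returns the
        // first column of every result row, one row per line.
        static String runDdl(String jdbcUrl, String user, String ddl) throws SQLException {
            List<String> statements = sessionStatements(ddl);
            StringBuilder out = new StringBuilder();
            try (Connection conn = DriverManager.getConnection(jdbcUrl, user, "");
                 Statement stmt = conn.createStatement()) {
                stmt.execute(statements.get(0));
                try (ResultSet rs = stmt.executeQuery(statements.get(1))) {
                    while (rs.next()) {
                        out.append(rs.getString(1)).append('\n');
                    }
                }
            }
            return out.toString();
        }
    }
    ```

    A call such as `JdbcDdlSketch.runDdl(url, "someuser", "show tables")` opens a new connection each time, which matches the note above that no connection pooling is in place.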
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/datarepublic/hive master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #133
    
----
commit 39cfb08044131b5422fd6e406bae8221a6b75011
Author: Amin Abbaspour <am...@consultants.datareplic.io>
Date:   2017-01-16T02:57:20Z

    Enable webhcat to run JDBC connection for Hive DDL queries

----

