Posted to dev@hive.apache.org by "Jakub Pastuszek (JIRA)" <ji...@apache.org> on 2015/09/04 11:23:45 UTC

[jira] [Created] (HIVE-11734) Hive Server2 not impersonating HDFS for CREATE TABLE/DATABASE with KERBEROS auth

Jakub Pastuszek created HIVE-11734:
--------------------------------------

             Summary: Hive Server2 not impersonating HDFS for CREATE TABLE/DATABASE with KERBEROS auth
                 Key: HIVE-11734
                 URL: https://issues.apache.org/jira/browse/HIVE-11734
             Project: Hive
          Issue Type: Bug
          Components: Authorization
    Affects Versions: 1.1.1
            Reporter: Jakub Pastuszek


My configuration is as follows:
{code}
hive-site.xml:
hive.server2.enable.doAs=true
hive.metastore.execute.setugi=true
hive.security.metastore.authorization.auth.reads=true
hive.metastore.sasl.enabled=true
hive.server2.authentication=KERBEROS
hive.server2.thrift.sasl.qop=auth-conf
hive.warehouse.subdir.inherit.perms=false
...

hdfs-site.xml:
dfs.block.access.token.enable=true
fs.permissions.umask-mode=027
...

core-site.xml:
hadoop.security.authentication=kerberos
hadoop.security.authorization=true
hadoop.proxyuser.hive.hosts=localhost,master
hadoop.proxyuser.hive.groups=*
...
{code}

When I create a database or a table from beeline as a Kerberos-authenticated (kinit'ed) user, the HDFS directories that Hive creates are owned by the 'hive' user, and their group is that of the parent directory ('data' in my case). The 'hive' user does not belong to that group at all, but it is in the supergroup.

When I then load data (or run any other query that launches a map-reduce job), the table files end up owned by the kinit'ed user, and the YARN container also runs as the kinit'ed user (not 'hive').

This causes permission issues when I run queries that launch map-reduce jobs, because I do not own the database and table directories.
It also allows anybody to drop my database or table, because that operation is performed as the 'hive' user, which is in the supergroup.
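
To make the two problems above concrete, here is a minimal Python sketch of a POSIX-style permission check of the kind HDFS applies. It is an illustrative model only, not Hive or HDFS code. The user 'alice', her group 'analysts', and the assumption that the directory mode is simply 777 masked by the 027 umask from hdfs-site.xml are all hypothetical; the actual modes on the cluster are not shown in this report.

```python
from dataclasses import dataclass

# Default value of dfs.permissions.superusergroup; members bypass permission checks.
SUPERGROUP = "supergroup"


@dataclass(frozen=True)
class Inode:
    owner: str
    group: str
    mode: int  # POSIX-style permission bits, e.g. 0o750


def mode_for_new_dir(umask: int) -> int:
    """Permission bits a new directory gets under the given umask."""
    return 0o777 & ~umask


def can_write(user: str, groups: set, inode: Inode) -> bool:
    """Simplified write check: superuser bypass, then owner, group, other."""
    if SUPERGROUP in groups:
        return True
    if user == inode.owner:
        return bool(inode.mode & 0o200)
    if inode.group in groups:
        return bool(inode.mode & 0o020)
    return bool(inode.mode & 0o002)


# Table directory as created by HiveServer2 without impersonation (hypothetical mode).
table_dir = Inode(owner="hive", group="data", mode=mode_for_new_dir(0o027))

# Hypothetical kinit'ed user who is neither the owner nor in the 'data' group.
alice_can_write = can_write("alice", {"analysts"}, table_dir)
# The 'hive' service user is in the supergroup, so every check passes.
hive_can_write = can_write("hive", {"hive", SUPERGROUP}, table_dir)
print(oct(table_dir.mode), alice_can_write, hive_can_write)
```

In this model the kinit'ed user is refused write access to a directory created on her behalf, while any request that HiveServer2 executes as 'hive' passes regardless of who asked for it.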

What I want is for DDL queries to access HDFS as the kinit'ed user, so that the database and table directories end up owned by that user.

Is this a bug or a configuration problem?

Also, the group should be the user's primary group (hive.warehouse.subdir.inherit.perms=false), not the group of the parent directory. That way I can use owner/group authorisation on HDFS to grant or restrict access by group.
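
For comparison, the HDFS Permissions Guide describes a BSD-style rule: a new file or directory is owned by the creating identity and takes the group of its parent directory. The short Python sketch below contrasts that rule with the behaviour requested here. It is a toy model, not HDFS code; the function name, the user 'alice', and the group 'analysts' are hypothetical.

```python
def new_dir_owner_group(creator: str, primary_group: str,
                        parent_group: str, bsd_rule: bool = True) -> tuple:
    """Return (owner, group) for a newly created directory.

    bsd_rule=True  -> group is taken from the parent directory.
    bsd_rule=False -> group is the creator's primary group (the behaviour requested above).
    """
    group = parent_group if bsd_rule else primary_group
    return (creator, group)


# What is observed today: created as 'hive', group taken from the parent ('data').
observed = new_dir_owner_group("hive", "hive", "data")
# Working impersonation under the BSD rule: the owner changes, the group does not.
impersonated = new_dir_owner_group("alice", "analysts", "data")
# What this report asks for: owner and group both follow the kinit'ed user.
wanted = new_dir_owner_group("alice", "analysts", "data", bsd_rule=False)
print(observed, impersonated, wanted)
```

Under this model, fixing impersonation alone would change the owner but not the group, so the two requests in this report are separate questions.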

As it stands, this is a serious security issue, and it renders the whole doAs/impersonation system useless for me.

Also see my question on Serverfault:
http://serverfault.com/questions/717483/hive-server2-not-impersonating-hdfs

Versions:
{code}
hadoop-0.20-mapreduce-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-client-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-hdfs-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-hdfs-namenode-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-mapreduce-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-mapreduce-historyserver-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-yarn-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-yarn-resourcemanager-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hive-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
hive-jdbc-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
hive-metastore-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
hive-server2-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
{code}


