Posted to issues@spark.apache.org by "wuchang (JIRA)" <ji...@apache.org> on 2017/09/13 12:15:01 UTC

[jira] [Comment Edited] (SPARK-11248) Spark hivethriftserver is using the wrong user while getting HDFS permissions

    [ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164559#comment-16164559 ] 

wuchang edited comment on SPARK-11248 at 9/13/17 12:14 PM:
-----------------------------------------------------------

The Spark Thrift Server has a security bug: user A sometimes ends up with the authority of user B, and user B in turn with the authority of user A. I debugged it and found that it is caused by the Hive 1.2.1 library, in OrcInputFormat.java, which creates a thread pool to contact the remote HDFS. Since the threads in the pool are reused and shared, when a thread (e.g. pool-1-thread-1) was previously used by user A and user B then happens to be assigned to that same thread, user B inherits the security context of user A.

I have fixed this bug by adding UserGroupInformation handling to this pool, to make sure that when a user is assigned a thread, the security context is switched to that user at the same time.

See the pull request: [https://github.com/JoshRosen/hive/pull/1]
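The fix pattern described above can be sketched in plain Java. This is a hypothetical illustration, not Hive's actual OrcInputFormat code: the real fix uses Hadoop's UserGroupInformation.doAs, which is simulated here with a ThreadLocal so the example is self-contained. The point is that each task submitted to a shared pool re-establishes its submitter's context before doing any work, instead of inheriting whatever context the pooled thread last carried.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PerUserTaskDemo {
    // Stand-in for the per-thread security context (in Hadoop this role is
    // played by UserGroupInformation, not a ThreadLocal).
    static final ThreadLocal<String> CURRENT_USER =
            ThreadLocal.withInitial(() -> "nobody");

    // Wraps a task so the pool thread runs it as `user`, restoring the
    // previous context afterwards -- analogous to ugi.doAs(...).
    static <T> Callable<T> asUser(String user, Callable<T> task) {
        return () -> {
            String previous = CURRENT_USER.get();
            CURRENT_USER.set(user);
            try {
                return task.call();
            } finally {
                // Do not leak this user's context to the next task
                // that reuses the same pooled thread.
                CURRENT_USER.set(previous);
            }
        };
    }

    static String runDemo() throws Exception {
        // A single-threaded pool guarantees both tasks share one thread,
        // which is exactly the reuse scenario that triggered the bug.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> whoAmI = CURRENT_USER::get;

        String a = pool.submit(asUser("userA", whoAmI)).get();
        String b = pool.submit(asUser("userB", whoAmI)).get();
        pool.shutdown();
        return a + " " + b;
    }

    public static void main(String[] args) throws Exception {
        // Each task sees its own submitter, despite the shared thread.
        System.out.println(runDemo());
    }
}
```

Without the wrapper, the second call would observe whatever context the first task left on the thread, which is the cross-user leak reported here.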



> Spark hivethriftserver is using the wrong user while getting HDFS permissions
> --------------------------------------------------------------------------------
>
>                 Key: SPARK-11248
>                 URL: https://issues.apache.org/jira/browse/SPARK-11248
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0, 1.5.1, 2.1.1, 2.2.0
>            Reporter: Trystan Leftwich
>
> While running Spark as a HiveThrift server via YARN, Spark will use the user running the HiveThrift server rather than the user connecting via JDBC to check HDFS permissions.
> i.e.
> In HDFS the perms are
> rwx------   3 testuser testuser /user/testuser/table/testtable
> And I connect via beeline as user testuser
> beeline -u 'jdbc:hive2://localhost:10511' -n 'testuser' -p ''
> If I try to hit that table
> select count(*) from test_table;
> I get the following error
> Error: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table test_table. java.security.AccessControlException: Permission denied: user=hive, access=READ, inode="/user/testuser/table/testtable":testuser:testuser:drwxr-x--x
> 	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
> 	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
> 	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:185)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6795)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6777)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6702)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:9529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:1516)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1433)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) (state=,code=0)
> I have the following set in hive-site.xml so it should be using the correct user.
> <property>
>   <name>hive.server2.enable.doAs</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.metastore.execute.setugi</name>
>   <value>true</value>
> </property>
>
> This works correctly in hive.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org