Posted to dev@hive.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2015/02/12 18:05:11 UTC

[jira] [Commented] (HIVE-9665) Parallel move task optimization causes race condition

    [ https://issues.apache.org/jira/browse/HIVE-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318551#comment-14318551 ] 

Thejas M Nair commented on HIVE-9665:
-------------------------------------

+1

There are two issues that we need to fix before MoveTask can be run in parallel:
 # A thread-local Hive object should be used. Adding "db = Hive.get(conf);" in MoveTask.execute will fix that.
 # The ACID code in MoveTask examines SessionState. If MoveTask runs in a different thread, that thread will not have the SessionState object available.
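
To illustrate the two points above, here is a minimal, self-contained sketch in plain Java. It does not use Hive's classes: Client, Session, move, and runMoves are hypothetical stand-ins I made up for the Hive object, SessionState, and MoveTask.execute. It only shows the general pattern (fetch a per-thread client inside the task, and hand the parent's session to the worker thread explicitly); the real patch may well look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelMoveSketch {

    /** Hypothetical stand-in for a non-thread-safe metastore client (not Hive's class). */
    static final class Client {
        // Remember which thread created this client.
        final long ownerThreadId = Thread.currentThread().getId();
    }

    /** Hypothetical stand-in for per-query session state (not Hive's SessionState). */
    static final class Session {
        final String queryId;
        Session(String queryId) { this.queryId = queryId; }
    }

    // One client per thread, created lazily on first use in that thread.
    private static final ThreadLocal<Client> CLIENT = ThreadLocal.withInitial(Client::new);

    // The session is thread-local as well; a new thread starts with none attached.
    private static final ThreadLocal<Session> SESSION = new ThreadLocal<>();

    /** Body of one "move": attach the parent's session, then use this thread's own client. */
    static String move(Session parentSession) {
        SESSION.set(parentSession);              // issue 2: make the session visible in this thread
        try {
            Client client = CLIENT.get();        // issue 1: fetch the client inside the task
            boolean ownClient = client.ownerThreadId == Thread.currentThread().getId();
            return SESSION.get().queryId + ":" + ownClient;
        } finally {
            SESSION.remove();                    // do not leak the session into pooled threads
        }
    }

    /** Runs n moves on a thread pool and collects one result string per move. */
    static List<String> runMoves(int n) throws Exception {
        Session session = new Session("q1");
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                Callable<String> task = () -> move(session);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runMoves(4));
    }
}
```

If the client were instead captured once in the parent thread and shared by all workers, concurrent requests and responses could interleave on a single connection, which would be consistent with the broken Thrift messages in the trace quoted below.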

> Parallel move task optimization causes race condition
> -----------------------------------------------------
>
>                 Key: HIVE-9665
>                 URL: https://issues.apache.org/jira/browse/HIVE-9665
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Gunther Hagleitner
>            Priority: Critical
>         Attachments: HIVE-9665.1.patch
>
>
> The change in HIVE-8042 doesn't actually work. Running it at scale produces race conditions that lead to broken Thrift messages and OOMs. E.g.:
> {noformat}
> java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353)
> 	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1122)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1108)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1091)
> 	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:131)
> 	at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
> 	at com.sun.proxy.$Proxy9.getTable(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1064)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1006)
> 	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:250)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353)
> 	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1122)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1108)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1091)
> 	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:131)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
> 	at com.sun.proxy.$Proxy9.getTable(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1064)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1006)
> 	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:250)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:353)
> 	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:215)
> 	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1122)
> 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1108)
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1091)
> 	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.getTable(SessionHiveMetaStoreClient.java:131)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
> 	at com.sun.proxy.$Proxy9.getTable(Unknown Source)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1064)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1006)
> 	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:250)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> {noformat}


