Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2015/04/20 20:42:00 UTC

[jira] [Created] (HIVE-10404) hive.exec.parallel=true causes "out of sequence response" and SocketTimeoutException: Read timed out

Eugene Koifman created HIVE-10404:
-------------------------------------

             Summary: hive.exec.parallel=true causes "out of sequence response" and SocketTimeoutException: Read timed out
                 Key: HIVE-10404
                 URL: https://issues.apache.org/jira/browse/HIVE-10404
             Project: Hive
          Issue Type: Bug
          Components: Query Processor
            Reporter: Eugene Koifman


With hive.exec.parallel=true, Driver.launchTask() calls Task.initialize() on several Tasks from a single thread.  It then starts new threads to run those tasks.
Task.initialize() gets an instance of Hive and holds on to it.  Hive.java internally uses a ThreadLocal to hand out instances, but since Task.initialize() is called by a single thread from the Driver, multiple tasks share one instance of Hive.

Each Hive instance has a single instance of MetaStoreClient; the latter is not thread-safe.

With hive.exec.parallel=true, different threads actually execute the tasks, so those threads end up sharing the same MetaStoreClient.
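
The sharing described above can be reproduced with a minimal sketch. FakeHive and FakeTask are hypothetical stand-ins, not the real Hive or Task classes; the only assumption carried over from the description is that the instance is looked up through a ThreadLocal during initialize() and cached in a field.

```java
public class SharedInstanceDemo {
    // Hypothetical stand-in for Hive: hands out one instance per thread.
    static final class FakeHive {
        private static final ThreadLocal<FakeHive> PER_THREAD =
                ThreadLocal.withInitial(FakeHive::new);

        static FakeHive get() {
            return PER_THREAD.get();
        }
    }

    // Hypothetical stand-in for Task: caches the instance at initialize() time.
    static final class FakeTask {
        private FakeHive db;

        void initialize() {
            db = FakeHive.get(); // resolved on the calling (driver) thread
        }

        FakeHive execute() {
            return db; // used later on a different thread
        }
    }

    // Returns true when both tasks end up using the same FakeHive instance.
    static boolean tasksShareInstance() {
        FakeTask t1 = new FakeTask();
        FakeTask t2 = new FakeTask();
        // Both tasks are initialized from the same thread, as the driver does.
        t1.initialize();
        t2.initialize();

        FakeHive[] seen = new FakeHive[2];
        Thread a = new Thread(() -> seen[0] = t1.execute());
        Thread b = new Thread(() -> seen[1] = t2.execute());
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] != null && seen[0] == seen[1];
    }

    public static void main(String[] args) {
        System.out.println("tasks share one instance: " + tasksShareInstance());
    }
}
```

Although each task runs on its own thread, the ThreadLocal lookup happened only once per task on the initializing thread, so the per-thread isolation never takes effect.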

If you make 2 concurrent calls, for example to Hive.getTable(String), the Thrift responses may return to the wrong caller.
Thus the first caller gets "out of sequence response", drops this message and reconnects.  If the timing is right, it will consume the other caller's response, but the other caller will block for hive.metastore.client.socket.timeout since its response message has now been lost.

This is just one concrete example.
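
The interleaving can be illustrated with a toy model, run on one thread so the ordering is deterministic. FakeClient is hypothetical and is not Thrift's implementation; it assumes only that requests carry a sequence id, that responses come back in request order on a single shared stream, and that a caller rejects a response whose id it did not expect.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class OutOfSequenceDemo {
    // Toy model of a client that is not thread-safe: one counter, one response stream.
    static final class FakeClient {
        private int seqId = 0;
        private final Queue<Integer> responses = new ArrayDeque<>();

        // Sends a request and returns the sequence id the caller should expect back.
        int send() {
            seqId++;
            responses.add(seqId); // the fake server echoes the id, in request order
            return seqId;
        }

        // Reads the next response on the shared stream and checks its id.
        void receive(int expectedSeqId) {
            int actual = responses.remove();
            if (actual != expectedSeqId) {
                throw new IllegalStateException(
                        "out of sequence response: expected " + expectedSeqId
                                + " but got " + actual);
            }
        }
    }

    // Simulates two callers interleaving on one shared client.
    static String interleavedCalls() {
        FakeClient shared = new FakeClient();
        int expectedByA = shared.send(); // caller A sends its request
        int expectedByB = shared.send(); // caller B sends before A has read
        try {
            shared.receive(expectedByB); // B reads first and gets A's response
            shared.receive(expectedByA);
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(interleavedCalls());
    }
}
```

In this model the response meant for caller A has already been consumed by caller B, which corresponds to the lost message that leaves the other caller waiting for the socket timeout.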

One possible fix is to make Task.db use ThreadLocal.
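
A rough sketch of that idea, using the same kind of hypothetical stand-ins (FakeHive and FakeTask are not the real classes, and getDb() is an illustrative name): the task no longer caches the instance in a field but resolves it through the ThreadLocal on whichever thread executes it.

```java
public class ThreadLocalDbDemo {
    // Hypothetical stand-in for Hive: hands out one instance per thread.
    static final class FakeHive {
        private static final ThreadLocal<FakeHive> PER_THREAD =
                ThreadLocal.withInitial(FakeHive::new);

        static FakeHive get() {
            return PER_THREAD.get();
        }
    }

    // Hypothetical stand-in for Task: no cached field, lookup at use time.
    static final class FakeTask {
        FakeHive getDb() {
            return FakeHive.get(); // resolved on the executing thread
        }

        FakeHive execute() {
            return getDb();
        }
    }

    // Returns true when each executing thread obtained its own instance.
    static boolean tasksGetDistinctInstances() {
        FakeTask t1 = new FakeTask();
        FakeTask t2 = new FakeTask();

        FakeHive[] seen = new FakeHive[2];
        Thread a = new Thread(() -> seen[0] = t1.execute());
        Thread b = new Thread(() -> seen[1] = t2.execute());
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] != null && seen[1] != null && seen[0] != seen[1];
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + tasksGetDistinctInstances());
    }
}
```

The trade-off in this sketch is that each executing thread creates its own instance, and with it its own client connection, instead of reusing the one created by the driver thread.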

This could be related to HIVE-6893.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)