Posted to dev@hive.apache.org by "Chao Sun (JIRA)" <ji...@apache.org> on 2015/04/22 02:03:18 UTC

[jira] [Created] (HIVE-10433) Cancel connection when remote driver process exited with error code [Spark Branch]

Chao Sun created HIVE-10433:
-------------------------------

             Summary: Cancel connection when remote driver process exited with error code [Spark Branch]
                 Key: HIVE-10433
                 URL: https://issues.apache.org/jira/browse/HIVE-10433
             Project: Hive
          Issue Type: Bug
          Components: spark-branch
            Reporter: Chao Sun


Currently in HoS, after starting the remote driver process, SparkClientImpl waits for that process to connect back. However, the process may fail and exit with an error code, in which case no connection is ever attempted. In this situation, the HS2 process still waits for the connection until it eventually times out. Worse, the user may have to sit through two timeout periods: one for SparkSetReducerParallelism and another for the actual Spark job.

We should cancel the timeout task and mark the promise as failed as soon as we detect that the remote driver process has exited with an error code.
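The fix described above could be sketched roughly as follows. This is a hypothetical, self-contained illustration, not the actual SparkClientImpl code: the class name DriverMonitor, the launch method, and the use of CompletableFuture and ScheduledExecutorService are assumptions made for the example. A monitor thread waits on the child process; if it exits with a non-zero code, the scheduled timeout task is cancelled and the connection promise is failed immediately, so callers return right away instead of blocking for the full timeout period.

```java
import java.util.concurrent.*;

// Hypothetical sketch of the proposed behavior (not Hive's actual code):
// fail the connection promise as soon as the driver process dies with an
// error code, rather than waiting for the connection timeout to fire.
public class DriverMonitor {
    public static CompletableFuture<Void> launch(ProcessBuilder pb, long timeoutMs)
            throws Exception {
        CompletableFuture<Void> promise = new CompletableFuture<>();
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        // Timeout task: fires only if the driver neither connects back
        // nor exits within the allotted time.
        ScheduledFuture<?> timeout = scheduler.schedule(
            () -> promise.completeExceptionally(
                new TimeoutException("driver did not connect back")),
            timeoutMs, TimeUnit.MILLISECONDS);
        Process proc = pb.start();
        // Monitor thread: on non-zero exit, cancel the timeout task and
        // mark the promise as failed immediately.
        new Thread(() -> {
            try {
                int code = proc.waitFor();
                if (code != 0) {
                    timeout.cancel(false);
                    promise.completeExceptionally(
                        new RuntimeException("driver exited with code " + code));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                scheduler.shutdown();
            }
        }).start();
        return promise;
    }

    public static void main(String[] args) throws Exception {
        // Child process that exits immediately with code 1 (assumes a Unix shell).
        CompletableFuture<Void> p =
            launch(new ProcessBuilder("sh", "-c", "exit 1"), 60_000);
        try {
            p.get(5, TimeUnit.SECONDS); // fails fast, well before the 60s timeout
        } catch (ExecutionException e) {
            System.out.println("failed fast: " + e.getCause().getMessage());
        }
    }
}
```

With this change, a crashed driver surfaces as an immediate failure of the promise, and the user sees one prompt error instead of waiting out back-to-back timeouts.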



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)