Posted to issues@hive.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2018/06/13 14:10:00 UTC

[jira] [Commented] (HIVE-18916) SparkClientImpl doesn't error out if spark-submit fails

    [ https://issues.apache.org/jira/browse/HIVE-18916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511191#comment-16511191 ] 

Sahil Takiar commented on HIVE-18916:
-------------------------------------

[~aihuaxu] can you take a look? Here is a brief summary of the changes:
* {{SparkClientImpl}} has been modified so that when the thread monitoring the {{bin/spark-submit}} process detects that {{bin/spark-submit}} has failed, it parses the stdout / stderr of {{bin/spark-submit}}, collects any log lines that contain "Error", and includes those lines in the exception that gets thrown
** {{SparkClientImpl}} was actually already doing this, but the information wasn't getting propagated all the way to the end user
* A few changes to {{RpcServer}} were necessary to make sure the exception thrown by the "Driver" thread gets propagated to the user
* A few other minor changes were made to classes like {{RemoteSparkJobMonitor}}, {{SparkTask}} and the constructor of {{SparkClientImpl}} to prevent double logging of exceptions
* Added a few unit tests for this, which required masking a few additional patterns in .q files
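
To illustrate the first bullet, here is a minimal sketch of the general technique: run a child process, capture its combined stdout / stderr, and if it exits with a non-zero code, throw an exception that includes the output lines containing "Error". This is not the code from the patch. The class name {{SubmitMonitorSketch}}, its method names, the exception type, and the message format are all hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is not the actual SparkClientImpl code.
final class SubmitMonitorSketch {

    /** Returns only the output lines that contain the substring "Error". */
    static List<String> errorLines(List<String> output) {
        List<String> errors = new ArrayList<>();
        for (String line : output) {
            if (line.contains("Error")) {
                errors.add(line);
            }
        }
        return errors;
    }

    /** Does nothing on exit code 0; otherwise throws with the error lines attached. */
    static void checkExit(int exitCode, List<String> output) {
        if (exitCode == 0) {
            return;
        }
        StringBuilder msg = new StringBuilder("spark-submit exited with code " + exitCode);
        List<String> errors = errorLines(output);
        if (!errors.isEmpty()) {
            msg.append(": ").append(String.join("; ", errors));
        }
        // Assumption: an unchecked exception is used here purely for brevity.
        throw new IllegalStateException(msg.toString());
    }

    /** Runs the command, captures stdout and stderr together, then checks the exit code. */
    static void runAndCheck(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();
        List<String> output = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.add(line);
            }
        }
        checkExit(process.waitFor(), output);
    }
}
```

In this sketch the output is buffered in memory until the process exits, which is only reasonable for short-lived commands; a long-running monitor would more likely keep a bounded window of recent lines.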

The motivation for this is that {{bin/spark-submit}} errors out when certain parameters are misconfigured. With this change, Hive on Spark (HoS) propagates these error messages to the end user, which should improve debuggability. The added .q files are a good example of this.
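
The difference between waiting for a timeout and failing fast can be sketched as follows. This is only an illustration of the idea, not the {{RpcServer}} implementation: the class {{FailFastWaitSketch}}, its methods, and the use of {{CompletableFuture}} are assumptions made for the example. The monitoring thread fails the pending future as soon as the child process dies, so the waiter sees the real cause instead of a generic timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch; the names below are not taken from RpcServer.
final class FailFastWaitSketch {

    /** Called by the monitoring thread when the child process exits with a non-zero code. */
    static void reportSubmitFailure(CompletableFuture<String> connection, int exitCode) {
        connection.completeExceptionally(
                new IllegalStateException("spark-submit exited with code " + exitCode));
    }

    /** Waits for the connection and returns a short description of the outcome. */
    static String awaitConnection(CompletableFuture<String> connection, long timeoutMs) {
        try {
            return "connected: " + connection.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            // The future was failed explicitly, so the underlying cause is available.
            return "failed: " + e.getCause().getMessage();
        } catch (TimeoutException e) {
            // Nobody reported a failure; all the waiter knows is that time ran out.
            return "failed: Timed out waiting for client connection";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            return "failed: interrupted";
        }
    }
}
```

If {{reportSubmitFailure}} is never called, the waiter only ever sees the timeout branch, which matches the misleading behavior described in the issue.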

> SparkClientImpl doesn't error out if spark-submit fails
> -------------------------------------------------------
>
>                 Key: HIVE-18916
>                 URL: https://issues.apache.org/jira/browse/HIVE-18916
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-18916.1.WIP.patch, HIVE-18916.2.patch, HIVE-18916.3.patch
>
>
> If {{spark-submit}} returns a non-zero exit code, {{SparkClientImpl}} will simply log the exit code, but won't throw an error. Eventually, the connection timeout will get triggered and an exception like {{Timed out waiting for client connection}} will be logged, which is pretty misleading.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)