Posted to issues@hive.apache.org by "Sahil Takiar (JIRA)" <ji...@apache.org> on 2018/06/07 20:11:00 UTC

[jira] [Updated] (HIVE-18684) Race condition in RemoteSparkJobMonitor

     [ https://issues.apache.org/jira/browse/HIVE-18684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar updated HIVE-18684:
--------------------------------
    Attachment: HIVE-18684.3.patch

> Race condition in RemoteSparkJobMonitor
> ---------------------------------------
>
>                 Key: HIVE-18684
>                 URL: https://issues.apache.org/jira/browse/HIVE-18684
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-18684.1.patch, HIVE-18684.2.patch, HIVE-18684.3.patch
>
>
> There is a race condition in {{RemoteSparkJobMonitor}}. Sometimes the info in {{RemoteSparkJobMonitor#startMonitor.STARTED}} gets printed out, sometimes it doesn't. This can be easily verified by running a qtest on {{TestMiniSparkOnYarnCliDriver}} and counting the number of times {{Query Hive on Spark job}} is printed vs. the number of times {{Finished successfully in}} gets printed.
> The issue is that {{RemoteSparkJobMonitor}} polls once per second and checks the state of {{JobHandle}}. Depending on the state, it prints out some logging info. The log content implicitly assumes that the logs for the {{STARTED}} state are printed before the logs for the {{SUCCEEDED}} state. However, this isn't always the case. The state transitions are driven by how long the remote Spark job takes to run, and if it finishes within one second, the monitor never observes the {{STARTED}} state, so the logs for that state are never printed.
> This can be confusing to users, and key debugging information that is printed only in the {{STARTED}} state is lost.
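
The race described above can be reproduced in isolation with a small simulation. The following is only a minimal sketch under stated assumptions, not Hive's actual code and not the attached patch: the class PollingMonitorSketch, its State enum, and the ensureStartedInfo flag are hypothetical names introduced for illustration, and the two log strings stand in for the real log lines. Each element of polledStates is the state the monitor happens to see at one wake-up, so a job that finishes between two polls simply never shows STARTED.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Hive's RemoteSparkJobMonitor.
class PollingMonitorSketch {

    // Simplified stand-in for the job states the monitor inspects.
    enum State { QUEUED, STARTED, SUCCEEDED }

    /**
     * Simulates a monitor that wakes up once per polling interval.
     * polledStates holds the state observed at each wake-up.
     * When ensureStartedInfo is true, the STARTED info is also emitted on
     * SUCCEEDED if it was never printed (one possible mitigation, assumed here).
     */
    static List<String> monitor(List<State> polledStates, boolean ensureStartedInfo) {
        List<String> log = new ArrayList<>();
        boolean startedInfoPrinted = false;
        for (State state : polledStates) {
            switch (state) {
                case QUEUED:
                    // Nothing to print while the job is waiting.
                    break;
                case STARTED:
                    if (!startedInfoPrinted) {
                        log.add("Query Hive on Spark job");
                        startedInfoPrinted = true;
                    }
                    break;
                case SUCCEEDED:
                    // Without this guard, a fast job skips the STARTED info entirely.
                    if (ensureStartedInfo && !startedInfoPrinted) {
                        log.add("Query Hive on Spark job");
                        startedInfoPrinted = true;
                    }
                    log.add("Finished successfully in");
                    return log;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // Slow job: the monitor observes STARTED at one of its polls.
        List<State> slowJob = Arrays.asList(State.QUEUED, State.STARTED, State.SUCCEEDED);
        // Fast job: it starts and finishes between two consecutive polls.
        List<State> fastJob = Arrays.asList(State.QUEUED, State.SUCCEEDED);

        System.out.println(monitor(slowJob, false));
        System.out.println(monitor(fastJob, false));
        System.out.println(monitor(fastJob, true));
    }
}
```

With ensureStartedInfo set to false, the fast job produces only the completion line, which matches the mismatch in counts described in the report. Tracking whether the STARTED info has been printed is just one way to remove the ordering assumption; the attached patches may take a different approach.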



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)