Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2016/06/30 15:19:10 UTC

[jira] [Commented] (TEZ-3318) Tez UI: Polling is not restarted after RM recovery

    [ https://issues.apache.org/jira/browse/TEZ-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357256#comment-15357256 ] 

Hitesh Shah commented on TEZ-3318:
----------------------------------

I think there should be a limit on how many consecutive retries are done. Maybe 10 minutes in total at the very most? I.e., if polling every 10 seconds, there should be at most 60 retries. This counter should be reset to 0 on the first successful call.
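
A rough sketch of the retry cap described above, in case it helps the discussion. This is not the Tez UI code; the names `RetryLimiter`, `run_polling`, and `fetch` are hypothetical, and the 10-second interval and 10-minute window are just the example figures from this comment. A real poller would also wait `poll_interval_s` between calls, which is omitted here.

```python
class RetryLimiter:
    """Caps consecutive polling failures (hypothetical helper, not Tez UI code)."""

    def __init__(self, poll_interval_s=10, max_retry_window_s=600):
        # 600 s window / 10 s interval = 60 retries, as in the example figures.
        self.max_retries = max_retry_window_s // poll_interval_s
        self.consecutive_failures = 0

    def record_success(self):
        # Any successful call resets the counter to 0.
        self.consecutive_failures = 0

    def record_failure(self):
        """Count one failed call; return True if another retry is allowed."""
        self.consecutive_failures += 1
        return self.consecutive_failures <= self.max_retries


def run_polling(fetch, limiter, max_polls):
    """Call fetch() up to max_polls times, stopping once the retry cap is hit.

    fetch is any callable that raises ConnectionError on failure (an
    assumption made for this sketch). Returns the number of calls made.
    """
    calls = 0
    while calls < max_polls:
        calls += 1
        try:
            fetch()
            limiter.record_success()
        except ConnectionError:
            if not limiter.record_failure():
                break  # retry budget exhausted, give up
    return calls
```

With these defaults, a source that never recovers is called 61 times (the initial attempt plus 60 retries) before polling stops, while a source that recovers within the window keeps being polled with the counter back at 0.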

> Tez UI: Polling is not restarted after RM recovery
> --------------------------------------------------
>
>                 Key: TEZ-3318
>                 URL: https://issues.apache.org/jira/browse/TEZ-3318
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Sreenath Somarajapuram
>            Assignee: Sreenath Somarajapuram
>
> For a running DAG, we poll the AM to get progress and other real-time information. This communication happens via the RM. If the RM goes down, polling is not re-established even after the RM recovers.
> Steps to repro:
> 1. Run a job.
> 2. Go to the DAG details page, and ensure that the progress is getting updated.
> 3. Stop the RM, and ensure that the error bar is displayed in the UI.
> 4. Start the RM.
> 5. As soon as the RM is online, the progress bar must get updated.


