Posted to commits@samza.apache.org by "Boris Shkolnik (JIRA)" <ji...@apache.org> on 2019/05/17 19:04:01 UTC

[jira] [Updated] (SAMZA-2165) Account for coordinator restarts in calls to status

     [ https://issues.apache.org/jira/browse/SAMZA-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boris Shkolnik updated SAMZA-2165:
----------------------------------
    Fix Version/s: 1.2

> Account for coordinator restarts in calls to status
> ---------------------------------------------------
>
>                 Key: SAMZA-2165
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2165
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jagadish
>            Assignee: Jagadish
>            Priority: Major
>             Fix For: 1.2
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, the status of a Samza job is determined by a combination of:
> 1. Obtaining YARN's status for the job by querying the RM
> 2. Obtaining the AM/coordinator URL for the job
> 3. If (1) is "Running", querying the job's coordinator URL to check whether all containers have started
> YARN may restart the coordinator between (2) and (3), in which case the old coordinator process may no longer be alive, triggering a ConnectException in (3). This causes the status call to fail.
> A better way to handle these retriable errors is to return a "New" status from the API, so that applications can keep polling.
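
The proposed handling could be sketched roughly as follows. This is only an illustration, not the actual Samza patch: the `StatusSketch` class, the `Status` enum, the `CoordinatorClient` interface and the `resolveStatus` method are all hypothetical names, and returning "New" while containers are still starting is an assumption.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical sketch; none of these names come from the Samza code base.
class StatusSketch {

    // Simplified set of job statuses (assumed for illustration).
    enum Status { NEW, RUNNING, SUCCESSFUL, UNSUCCESSFUL }

    // Stand-in for the HTTP call to the coordinator URL obtained in step (2).
    interface CoordinatorClient {
        boolean allContainersStarted(String coordinatorUrl) throws IOException;
    }

    // Combines YARN's status (step 1) with the coordinator query (step 3).
    static Status resolveStatus(Status yarnStatus,
                                String coordinatorUrl,
                                CoordinatorClient client) throws IOException {
        // Step (3) only applies when YARN reports the job as running.
        if (yarnStatus != Status.RUNNING) {
            return yarnStatus;
        }
        try {
            // Assumption: report NEW until every container has started.
            return client.allContainersStarted(coordinatorUrl)
                    ? Status.RUNNING
                    : Status.NEW;
        } catch (ConnectException e) {
            // The coordinator may have been restarted between (2) and (3),
            // so the URL is stale. Treat this as retriable and return NEW
            // so that the caller keeps polling instead of failing.
            return Status.NEW;
        }
    }
}
```

Other I/O failures still propagate to the caller in this sketch; only the ConnectException described above is mapped to "New".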


