Posted to commits@samza.apache.org by "Hai Lu (Jira)" <ji...@apache.org> on 2019/11/07 00:06:03 UTC
[jira] [Updated] (SAMZA-2248) Fix AM bookkeeping on receiving dead container notifications
[ https://issues.apache.org/jira/browse/SAMZA-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hai Lu updated SAMZA-2248:
--------------------------
Fix Version/s: 1.3
> Fix AM bookkeeping on receiving dead container notifications
> ------------------------------------------------------------
>
> Key: SAMZA-2248
> URL: https://issues.apache.org/jira/browse/SAMZA-2248
> Project: Samza
> Issue Type: Bug
> Reporter: Xinyu Liu
> Assignee: Xinyu Liu
> Priority: Major
> Fix For: 1.3
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Issue tldr:
> 1. The AM gets extra containers from the RM, which it saves for later use.
> 2. When a container saved in step 1 dies, the AM receives the callback but does nothing about it.
> 3. Later, when looking for a container to use, the AM picks up the dead container that it saved and never cleaned up (steps 1 and 2) and launches a container.
> 4. If this launched container ever dies, the RM will never notify the AM about it, since it sees it as a duplicate (step 2).
> 5. The job is left without the container rescheduled and will need to be restarted.
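
The sequence above can be illustrated with a minimal sketch of the bookkeeping involved. This is a simplified model under stated assumptions, not Samza's actual code: the class ContainerBookkeeper and all of its method names are hypothetical, and container IDs are plain strings. It only shows one plausible way to avoid the stale entry, namely dropping a saved container from the spare pool when its death is reported.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of AM container bookkeeping; not Samza's real classes. */
class ContainerBookkeeper {
    // Extra containers granted by the RM and saved for later use (step 1).
    private final Deque<String> spareContainers = new ArrayDeque<>();
    // Containers the AM has launched and believes are running.
    private final Set<String> runningContainers = new HashSet<>();
    // Replacement containers the AM still has to request from the RM.
    private int pendingRequests = 0;

    /** RM granted a container; in this sketch every grant is saved as a spare. */
    void onContainerAllocated(String containerId) {
        spareContainers.addLast(containerId);
    }

    /** RM reported that a container died. */
    void onContainerCompleted(String containerId) {
        if (runningContainers.remove(containerId)) {
            // A running container died: it has to be rescheduled.
            pendingRequests++;
        } else {
            // Assumed fix: forget a dead spare so it is never picked up later (steps 2 and 3).
            spareContainers.remove(containerId);
        }
    }

    /** Launches on the oldest saved container; returns its ID, or null if none is saved. */
    String launchOnSpare() {
        String containerId = spareContainers.pollFirst();
        if (containerId != null) {
            runningContainers.add(containerId);
        }
        return containerId;
    }

    int pendingRequests() {
        return pendingRequests;
    }

    int spareCount() {
        return spareContainers.size();
    }
}
```

In this model, if containers "c1" and "c2" are saved and "c1" then dies, a later launchOnSpare() call hands out "c2" rather than the dead "c1". Without the else branch, "c1" would stay in the pool and be reused, which is the situation described in steps 3 to 5.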
--
This message was sent by Atlassian Jira
(v8.3.4#803005)