Posted to dev@reef.apache.org by "Markus Weimer (JIRA)" <ji...@apache.org> on 2015/08/31 22:26:46 UTC
[jira] [Resolved] (REEF-687) Restructure handler for DriverRestart in Java
[ https://issues.apache.org/jira/browse/REEF-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markus Weimer resolved REEF-687.
--------------------------------
Resolution: Fixed
Fix Version/s: 0.13
Resolved via [#438|https://github.com/apache/incubator-reef/pull/438]
> Restructure handler for DriverRestart in Java
> ---------------------------------------------
>
> Key: REEF-687
> URL: https://issues.apache.org/jira/browse/REEF-687
> Project: REEF
> Issue Type: Sub-task
> Components: REEF Driver, REEF.NET Driver
> Reporter: Andrew Chung
> Assignee: Andrew Chung
> Fix For: 0.13
>
>
> Currently, {{DriverRestartHandler}} only reports the {{StartTime}}, which is not very helpful. The way the rest of the restart mechanism is set up can also confuse users in the following ways:
> # When {{EvaluatorFailedHandler}} is called, the user cannot tell whether the failed Evaluator belongs to a previous Driver instance or to the current Driver instance without keeping a list of {{AllocatedEvaluators}}.
> # When {{DriverRestartCompletedHandler}} is called, the user cannot easily find out whether it was called because the timeout expired or because all Evaluators have already either failed or reported back.
> The proposal is to pass two sets of Evaluator IDs to the {{DriverRestartHandler}}, along with {{StartTime}}: those that have failed on restart and those that are expected to report back. Although this still does not make 1 and 2 fully explicit, it lets users know what to expect.
> There will also be items to make 1 and 2 more explicit.
> For 1, an item will be filed to create a special {{DriverRestartEvaluatorFailedHandler}}. See REEF-688. For 2, an item will be filed to extend the {{DriverRestartCompletedHandler}} to pass back an object that includes the set of failed Evaluator IDs. See REEF-691.
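
As a rough illustration of the proposal quoted above, the restart event could carry the start time plus the two sets of Evaluator IDs. This is only a sketch under stated assumptions: the class name {{RestartInfo}}, its accessors, and the helper {{isFromPreviousDriver}} are hypothetical and are not taken from the REEF code base or from pull request #438.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical event type: every name here is illustrative, not REEF's actual API.
final class RestartInfo {
  private final long startTime;
  private final Set<String> expectedEvaluatorIds;
  private final Set<String> failedEvaluatorIds;

  RestartInfo(final long startTime,
              final Set<String> expectedEvaluatorIds,
              final Set<String> failedEvaluatorIds) {
    this.startTime = startTime;
    // Defensive copies so that handlers cannot mutate the event.
    this.expectedEvaluatorIds = Collections.unmodifiableSet(new HashSet<>(expectedEvaluatorIds));
    this.failedEvaluatorIds = Collections.unmodifiableSet(new HashSet<>(failedEvaluatorIds));
  }

  long getStartTime() {
    return startTime;
  }

  // Evaluators from the previous Driver instance that are expected to report back.
  Set<String> getExpectedEvaluatorIds() {
    return expectedEvaluatorIds;
  }

  // Evaluators from the previous Driver instance that failed on restart.
  Set<String> getFailedEvaluatorIds() {
    return failedEvaluatorIds;
  }

  // Assumption: an ID in either set belongs to the previous Driver instance (point 1).
  boolean isFromPreviousDriver(final String evaluatorId) {
    return expectedEvaluatorIds.contains(evaluatorId)
        || failedEvaluatorIds.contains(evaluatorId);
  }
}
```

With an object like this, a user's failure handler could check {{isFromPreviousDriver}} instead of keeping its own list of {{AllocatedEvaluators}}, which is the kind of bookkeeping point 1 describes.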