Posted to dev@reef.apache.org by "Andrew Chung (JIRA)" <ji...@apache.org> on 2015/08/29 01:25:45 UTC
[jira] [Updated] (REEF-687) Restructure handler for DriverRestart in Java
[ https://issues.apache.org/jira/browse/REEF-687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Chung updated REEF-687:
------------------------------
Description:
Currently, {{DriverRestartHandler}} only provides the {{StartTime}}, which is not very helpful. Given how the rest of the restart mechanism is set up, users can also be confused in the following ways:
1. When {{EvaluatorFailedHandler}} is called, the user does not know for sure whether it is called on a failed Evaluator from a previous Driver instance or from the current Driver instance without keeping a list of {{AllocatedEvaluators}}.
2. When {{DriverRestartCompletedHandler}} is called, the user cannot easily find out whether it is called due to a timeout expiration or due to the fact that all Evaluators have already either failed or reported back.
The proposal is to have the {{DriverRestartHandler}} receive two sets of Evaluator IDs along with {{StartTime}}: those that have failed on restart and those that are expected to report back. Although this still does not make 1 and 2 fully explicit, it lets users know what to expect.
There will also be items to make 1 and 2 more explicit.
For 1, an item will be filed to create a special {{DriverRestartEvaluatorFailedHandler}}. For 2, an item will be filed to extend the {{DriverRestartCompletedHandler}} to pass back an object that includes the set of failed Evaluator IDs.
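To make the proposal above concrete, here is a minimal Java sketch of what the restart handler could receive and how a user might apply it to points 1 and 2. All names in it ({{DriverRestartSketch}}, {{RestartInfo}}, {{classifyFailure}}, {{stillMissing}}) are hypothetical and only illustrate the idea; they are not taken from the REEF API, and the classification rule is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; none of these names are taken from the REEF API. */
final class DriverRestartSketch {

  /** Hypothetical object handed to the restart handler. */
  static final class RestartInfo {
    private final long startTime;
    private final Set<String> failedEvaluatorIds;
    private final Set<String> expectedEvaluatorIds;

    RestartInfo(final long startTime,
                final Set<String> failedEvaluatorIds,
                final Set<String> expectedEvaluatorIds) {
      this.startTime = startTime;
      // Defensive copies so the handler sees an immutable snapshot.
      this.failedEvaluatorIds =
          Collections.unmodifiableSet(new HashSet<>(failedEvaluatorIds));
      this.expectedEvaluatorIds =
          Collections.unmodifiableSet(new HashSet<>(expectedEvaluatorIds));
    }

    long getStartTime() { return startTime; }
    Set<String> getFailedEvaluatorIds() { return failedEvaluatorIds; }
    Set<String> getExpectedEvaluatorIds() { return expectedEvaluatorIds; }
  }

  /** Where a failed Evaluator came from, as far as the restart sets can tell. */
  enum Origin { PREVIOUS_DRIVER, CURRENT_DRIVER }

  /**
   * Point 1: IDs listed in either restart set are treated as belonging to the
   * previous Driver instance; any other ID is assumed to be a current one.
   */
  static Origin classifyFailure(final RestartInfo info, final String evaluatorId) {
    final boolean knownFromRestart =
        info.getFailedEvaluatorIds().contains(evaluatorId)
            || info.getExpectedEvaluatorIds().contains(evaluatorId);
    return knownFromRestart ? Origin.PREVIOUS_DRIVER : Origin.CURRENT_DRIVER;
  }

  /**
   * Point 2: returns the expected Evaluators not yet accounted for, where
   * accountedFor holds the IDs that have reported back or failed since restart.
   */
  static Set<String> stillMissing(final RestartInfo info, final Set<String> accountedFor) {
    final Set<String> missing = new TreeSet<>(info.getExpectedEvaluatorIds());
    missing.removeAll(accountedFor);
    return missing;
  }
}
```

Under these assumptions, a non-empty result from {{stillMissing}} at the time the restart-completed handler fires would suggest a timeout rather than all Evaluators having reported back or failed. This remains an inference by the user, which is why the follow-up items aim to make both cases explicit.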
was:
Currently, {{DriverRestartHandler}} only informs about the {{StartTime``, which is not very helpful. The way the rest of the restart mechanism is set up, users can also be confused in the following ways:
1. When {{EvaluatorFailedHandler}} is called, the user does not know for sure whether it is called on a failed Evaluator from a previous Driver instance or from the current Driver instance without keeping a list of {{AllocatedEvaluators}}.
2. When {{DriverRestartCompletedHandler}} is called, the user cannot easily find out whether it is called due to a timeout expiration or due to the fact that all Evaluators have already either failed or reported back.
The proposal is to return 2 sets of Evaluator IDs, those that have failed on restart, and those that are expected to report back in the {{DriverRestartHandler}}, along with {{StartTime}}. Although it's still not as explicit for both 1 and 2, this lets users know what to expect.
There will also be items to make 1 and 2 more explicit.
For 1, an item will be filed to create a special {{DriverRestartEvaluatorFailedHandler}}. For 2, an item will be filed to extend the {{DriverRestartCompletedHandler}} to pass back an object that includes the set of failed Evaluator IDs.
> Restructure handler for DriverRestart in Java
> ---------------------------------------------
>
> Key: REEF-687
> URL: https://issues.apache.org/jira/browse/REEF-687
> Project: REEF
> Issue Type: Sub-task
> Components: REEF Driver, REEF.NET Driver
> Reporter: Andrew Chung
> Assignee: Andrew Chung