Posted to dev@reef.apache.org by "Markus Weimer (JIRA)" <ji...@apache.org> on 2015/08/29 01:33:45 UTC

[jira] [Commented] (REEF-687) Restructure handler for DriverRestart in Java

    [ https://issues.apache.org/jira/browse/REEF-687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720780#comment-14720780 ] 

Markus Weimer commented on REEF-687:
------------------------------------

{quote}
When EvaluatorFailedHandler is called, the user does not know for sure whether it is called on a failed Evaluator from a previous Driver instance or from the current Driver instance without keeping a list of AllocatedEvaluators.
{quote}

We could add dedicated handlers for Evaluators that failed under a previous Driver instance and have the app bind them via {{DriverRestartConfiguration}}, right? That way, we'd stay consistent with the other handlers.



> Restructure handler for DriverRestart in Java
> ---------------------------------------------
>
>                 Key: REEF-687
>                 URL: https://issues.apache.org/jira/browse/REEF-687
>             Project: REEF
>          Issue Type: Sub-task
>          Components: REEF Driver, REEF.NET Driver
>            Reporter: Andrew Chung
>            Assignee: Andrew Chung
>
> Currently, {{DriverRestartHandler}} only reports the {{StartTime}}, which is not very helpful. The way the rest of the restart mechanism is set up, users can also be confused in the following ways:
>   # When {{EvaluatorFailedHandler}} is called, the user does not know for sure whether it is called on a failed Evaluator from a previous Driver instance or from the current Driver instance without keeping a list of {{AllocatedEvaluators}}.
>   # When {{DriverRestartCompletedHandler}} is called, the user cannot easily find out whether it is called due to a timeout expiration or due to the fact that all Evaluators have already either failed or reported back.
> The proposal is to pass two sets of Evaluator IDs to the {{DriverRestartHandler}}, along with {{StartTime}}: those that have failed on restart and those that are expected to report back. Although this is still not fully explicit for either 1 or 2, it lets users know what to expect.
> There will also be items to make 1 and 2 more explicit.
> For 1, an item will be filed to create a special {{DriverRestartEvaluatorFailedHandler}}. For 2, an item will be filed to extend the {{DriverRestartCompletedHandler}} to pass back an object that includes the set of failed Evaluator IDs.
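
To make the proposal above concrete, here is a minimal Java sketch of what such a restart event could carry. It is an illustration only and not the actual REEF API: the class name DriverRestartInfo, its accessors, the helper isFromPreviousDriver and the Evaluator ID strings are all hypothetical. The idea is that, given the two ID sets plus the start time, a handler can tell whether a failed Evaluator belongs to a previous Driver instance without keeping its own list of allocated Evaluators.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical restart event; the name and shape are assumptions, not REEF's API. */
final class DriverRestartInfo {
  private final long startTime;
  private final Set<String> expectedEvaluatorIds;
  private final Set<String> failedEvaluatorIds;

  DriverRestartInfo(final long startTime,
                    final Set<String> expectedEvaluatorIds,
                    final Set<String> failedEvaluatorIds) {
    this.startTime = startTime;
    // Defensive copies so handlers cannot mutate the event.
    this.expectedEvaluatorIds =
        Collections.unmodifiableSet(new HashSet<>(expectedEvaluatorIds));
    this.failedEvaluatorIds =
        Collections.unmodifiableSet(new HashSet<>(failedEvaluatorIds));
  }

  long getStartTime() {
    return startTime;
  }

  /** IDs of Evaluators that are expected to report back after the restart. */
  Set<String> getExpectedEvaluatorIds() {
    return expectedEvaluatorIds;
  }

  /** IDs of Evaluators already known to have failed on restart. */
  Set<String> getFailedEvaluatorIds() {
    return failedEvaluatorIds;
  }

  /** True if the ID was listed in either set, i.e. it predates this Driver instance. */
  boolean isFromPreviousDriver(final String evaluatorId) {
    return expectedEvaluatorIds.contains(evaluatorId)
        || failedEvaluatorIds.contains(evaluatorId);
  }
}

/** Small demo; the Evaluator IDs are made up for illustration. */
final class DriverRestartInfoDemo {
  public static void main(final String[] args) {
    final DriverRestartInfo info = new DriverRestartInfo(
        System.currentTimeMillis(),
        new HashSet<>(Arrays.asList("eval-1", "eval-2")),
        Collections.singleton("eval-3"));

    // A failure for "eval-2" concerns an Evaluator from a previous Driver instance;
    // a failure for "eval-9" concerns one allocated by the current instance.
    System.out.println(info.isFromPreviousDriver("eval-2"));
    System.out.println(info.isFromPreviousDriver("eval-9"));
  }
}
```

With an object of this shape, an Evaluator-failure handler could call isFromPreviousDriver on the failed Evaluator's ID to address point 1. Point 2 would still need the completion handler to report which Evaluators failed, as the description notes.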



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)