Posted to dev@mesos.apache.org by "Till Toenshoff (JIRA)" <ji...@apache.org> on 2014/05/06 02:40:15 UTC
[jira] [Comment Edited] (MESOS-1243) Containerizer::wait return type should be Option<Termination>
[ https://issues.apache.org/jira/browse/MESOS-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990141#comment-13990141 ]
Till Toenshoff edited comment on MESOS-1243 at 5/6/14 12:38 AM:
----------------------------------------------------------------
Recovery:
Right now {{recover}} is not specific to a container or executor, so it shouldn't fail just because a single one could not be recovered, whatever the reason.
Let me sketch a failure scenario from the point of view of the ExternalContainerizer (EC):
The slave invokes {{launch}} and the EC tries to pass this on to the external containerizer program (ECP). Now assume the slave dies before the ECP is actually able to launch anything. After a {{recover}}, the slave assumes that the ECP will be able to {{wait}} on that container. The ECP, however, never {{launch}}ed that container, so it is unable to {{wait}} on it and therefore unable to return a {{Termination}}.
So the problem has to be viewed with one thing in mind: the ECP and the slave may disagree about a container's state.
The quick way out of this is to allow that {{Termination}} to be optional. Another way might be to make sure that the container is only checkpointed after the launch has fully completed?
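To make the "optional {{Termination}}" idea concrete, here is a minimal, self-contained sketch. It is not the Mesos code: {{FakeECP}}, {{recordExit}} and the free function {{recover}} are hypothetical names, {{Termination}} is reduced to two illustrative fields, {{std::optional}} (C++17) stands in for Mesos's {{Option}}, and {{wait}} is synchronous here rather than returning a future.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; not the real Mesos types.
using ContainerID = std::string;

struct Termination {
  bool killed;
  std::string message;
};

// Models the ECP side: it only knows about containers it actually launched.
class FakeECP {
public:
  // Records that a launched container has exited with the given result.
  void recordExit(const ContainerID& id, const Termination& termination) {
    assert(!id.empty());  // An empty id is a programming error in this sketch.
    exited_[id] = termination;
  }

  // Returns the termination for a known container, or std::nullopt when
  // the container was never launched here (the case described above).
  std::optional<Termination> wait(const ContainerID& id) const {
    auto it = exited_.find(id);
    if (it == exited_.end()) {
      return std::nullopt;
    }
    return it->second;
  }

private:
  std::map<ContainerID, Termination> exited_;
};

// Slave-side recovery over all checkpointed containers. A container that
// the ECP does not know is collected instead of failing the whole recovery.
std::vector<ContainerID> recover(
    const FakeECP& ecp,
    const std::vector<ContainerID>& checkpointed) {
  std::vector<ContainerID> unknown;
  for (const ContainerID& id : checkpointed) {
    if (!ecp.wait(id).has_value()) {
      unknown.push_back(id);
    }
  }
  return unknown;
}
```

With this shape, a container that was checkpointed by the slave but never launched by the ECP shows up as an empty result from {{wait}} rather than as an error, so {{recover}} can carry on with the remaining containers.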
> Containerizer::wait return type should be Option<Termination>
> -------------------------------------------------------------
>
> Key: MESOS-1243
> URL: https://issues.apache.org/jira/browse/MESOS-1243
> Project: Mesos
> Issue Type: Improvement
> Reporter: Till Toenshoff
> Priority: Minor
> Labels: containerizer, external-containerizer, isolation, mesos, mesos-containerizer
>
> The containerizer {{wait}} should return an {{Option<Termination>}} to distinguish the case when it doesn't know about a {{ContainerID}}.