Posted to issues@mesos.apache.org by "Neil Conway (JIRA)" <ji...@apache.org> on 2016/08/01 09:52:20 UTC

[jira] [Updated] (MESOS-4050) Change task reconciliation to not omit unknown tasks

     [ https://issues.apache.org/jira/browse/MESOS-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neil Conway updated MESOS-4050:
-------------------------------
    Description: 
If the master fails over and a framework tries to do an explicit reconciliation for a task running on an agent that has not reregistered yet (and {{agent_reregister_timeout}} has not been exceeded), the master will _not_ send a reconciliation response for that task.

This is confusing for framework authors. It seems better for the master to announce all the information it has explicitly: e.g., to return "task X is in an unknown state", rather than not returning anything. Then as more information arrives (e.g., agent reregisters or task definitively dies), task state would transition appropriately. We might want to do this via a new task state, e.g., {{TASK_REREGISTER_PENDING}}.

This might be consistent with changing the task states so that we capture "task is partitioned" as an explicit task state ({{TASK_UNKNOWN}} or {{TASK_WANDERING}}) -- see MESOS-4049.
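
To illustrate the difference between the behavior described above and the proposed one, here is a minimal sketch in Python. It is not the Mesos master implementation: the function {{reconcile_task}}, its parameters, and the {{announce_pending}} switch are hypothetical names chosen for this example, {{TASK_REREGISTER_PENDING}} is only the state proposed in this ticket, and the fallback to {{TASK_LOST}} for tasks the master knows nothing about is a simplifying assumption.

```python
from enum import Enum
from typing import Dict, Optional, Set


class TaskState(Enum):
    TASK_RUNNING = "TASK_RUNNING"
    TASK_LOST = "TASK_LOST"
    # Proposed in this ticket; hypothetical, not an existing Mesos state.
    TASK_REREGISTER_PENDING = "TASK_REREGISTER_PENDING"


def reconcile_task(
    task_id: str,
    agent_id: str,
    known_tasks: Dict[str, TaskState],
    pending_agents: Set[str],
    announce_pending: bool,
) -> Optional[TaskState]:
    """Return the state to report for one explicitly reconciled task.

    None means "send no reconciliation response". `pending_agents` models
    agents that have not reregistered since the master failed over and
    whose reregistration timeout has not yet been exceeded.
    """
    if task_id in known_tasks:
        # The master has current information: report the latest state.
        return known_tasks[task_id]
    if agent_id in pending_agents:
        if announce_pending:
            # Proposed behavior: say explicitly that the state is unknown.
            return TaskState.TASK_REREGISTER_PENDING
        # Behavior described in this ticket: stay silent.
        return None
    # Simplifying assumption for this sketch: anything else is reported lost.
    return TaskState.TASK_LOST
```

With {{announce_pending=False}} a framework cannot tell a dropped reconciliation request apart from a task whose agent is still expected to reregister; with {{announce_pending=True}} it always gets an answer, which can later be superseded once the agent reregisters or the timeout expires.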

  was:
If a framework tries to reconcile the state of a task that is in an unknown state (because the agent running the task is partitioned from the master), the master will _not_ include any information about that task.

This is confusing for framework authors. It seems better for the master to announce all the information it has explicitly: e.g., to return "task X is in an unknown state", rather than not returning anything. Then as more information arrives (e.g., task returns or task definitively dies), task state would transition appropriately.

This might be consistent with changing the task states so that we capture "task is partitioned" as an explicit task state ({{TASK_UNKNOWN}} or {{TASK_WANDERING}}) -- see MESOS-4049.


> Change task reconciliation to not omit unknown tasks
> ----------------------------------------------------
>
>                 Key: MESOS-4050
>                 URL: https://issues.apache.org/jira/browse/MESOS-4050
>             Project: Mesos
>          Issue Type: Improvement
>          Components: framework, master
>            Reporter: Neil Conway
>              Labels: mesosphere, reconciliation


