Posted to issues@mesos.apache.org by "Erik Weathers (JIRA)" <ji...@apache.org> on 2016/02/22 22:46:18 UTC

[jira] [Updated] (MESOS-4737) document TaskID uniqueness requirement

     [ https://issues.apache.org/jira/browse/MESOS-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Weathers updated MESOS-4737:
---------------------------------
    Description: 
There are comments above the definition of TaskID in [mesos.proto|https://github.com/apache/mesos/blob/0.27.0/include/mesos/mesos.proto#L63-L66] which suggest that it is OK to reuse a TaskID value so long as you guarantee that at most one task with that TaskID is running at any given time.

{code:title=existing comments for TaskID}
 * A framework generated ID to distinguish a task. The ID must remain
 * unique while the task is active. However, a framework can reuse an
 * ID _only_ if a previous task with the same ID has reached a
 * terminal state (e.g., TASK_FINISHED, TASK_LOST, TASK_KILLED, etc.).
{code}

However, there are a few scenarios where problems can arise.

# The checkpointing-and-recovery feature of mesos-slave/agent clashes with tasks that reuse an ID and get assigned to the same executor.
#* See [this email|https://mail-archives.apache.org/mod_mbox/mesos-user/201602.mbox/%3CCAO5KYW8%2BXMWc1dXtEo20BAsfGow028jwjL2ubMinP%2BK%2BvdOh8w%40mail.gmail.com%3E] for more info, as well as the attachment on this issue.
# Network partitions and master failover can make a TaskID appear to be unique in the system when in fact another task with that ID is still running and was merely partitioned away for some time.
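
To make the second scenario concrete, here is a toy sketch in Python. It is not Mesos code: the {{task_states}} table and {{handle_status_update}} function are hypothetical, and the sequence of updates is an assumed example of a task that keeps running while partitioned away. It only illustrates why bookkeeping keyed by a reused TaskID cannot tell the two tasks apart.

```python
# Toy model only: this is NOT Mesos code. It mimics a framework that tracks
# task state in a table keyed solely by TaskID (a hypothetical design).
task_states = {}


def handle_status_update(task_id, state):
    """Record the latest state reported for task_id."""
    task_states[task_id] = state


# The first task with ID "job-1" starts, then its agent is partitioned away.
handle_status_update("job-1", "TASK_RUNNING")
# The framework is told the task is lost and reuses the ID for a new task.
handle_status_update("job-1", "TASK_LOST")
handle_status_update("job-1", "TASK_RUNNING")  # second task, same ID
# The partition heals and the first task reports that it finished.
handle_status_update("job-1", "TASK_FINISHED")

# The table now says the second task finished, although it is still running.
print(task_states["job-1"])
```

With distinct IDs the late update would land on the first task's entry and leave the second task's state untouched.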

In light of these issues, we should simply update the documentation to make it abundantly clear that reusing TaskIDs is never OK. At a minimum this should involve updating the aforementioned comments in {{mesos.proto}}. Any framework development guides that talk about TaskID creation should also be updated.
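
As an illustration of what a framework guide might recommend, here is a minimal sketch in Python of one way to avoid reuse: append a random UUID to a human-readable prefix. The {{make_task_id}} helper and the prefix convention are assumptions made for this example, not part of any Mesos API; the only thing relied on is that a TaskID value is a string.

```python
import uuid


def make_task_id(prefix):
    """Return a TaskID string built from a readable prefix and a random UUID.

    Hypothetical helper: the random suffix means a retry of the same logical
    job gets a fresh ID instead of reusing the old one.
    """
    return f"{prefix}-{uuid.uuid4()}"


first_attempt = make_task_id("job-1")
retry_attempt = make_task_id("job-1")

# Both IDs keep the readable prefix but differ in their random suffix.
print(first_attempt)
print(retry_attempt)
```

Keeping the prefix preserves a way to group attempts of the same logical job, while the suffix keeps each launched task individually identifiable.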

  was:
There are comments above the definition of TaskID in [mesos.proto|https://github.com/apache/mesos/blob/0.27.0/include/mesos/mesos.proto#L63-L66] which lead one to believe it is ok to reuse TaskID values so long as you guarantee there will only ever be 1 such TaskID running at the same time.

{code title=existing comments for TaskID}
 * A framework generated ID to distinguish a task. The ID must remain
 * unique while the task is active. However, a framework can reuse an
 * ID _only_ if a previous task with the same ID has reached a
 * terminal state (e.g., TASK_FINISHED, TASK_LOST, TASK_KILLED, etc.).
{code}

However, there are a few scenarios where problems can arise.

# The checkpointing-and-recovery feature of mesos-slave/agent clashes with tasks that reuse an ID and get assigned to the same executor.
#* See [this email|https://mail-archives.apache.org/mod_mbox/mesos-user/201602.mbox/%3CCAO5KYW8%2BXMWc1dXtEo20BAsfGow028jwjL2ubMinP%2BK%2BvdOh8w%40mail.gmail.com%3E] for more info, as well as the attachment on this issue.
# Issues during network partitions and master failover, where a TaskID might appear to be unique in the system, whereas in actuality another Task is running with that ID and was just partitioned away for some time.

In light of these issues, we should simply update the document(s) to make it abundantly clear that reusing TaskIDs is never ok.  At the minimum this should involve updating the afore-mentioned comments in {{mesos.proto}}.  Also any framework development guides that talk about TaskID creation should be updated.


> document TaskID uniqueness requirement
> --------------------------------------
>
>                 Key: MESOS-4737
>                 URL: https://issues.apache.org/jira/browse/MESOS-4737
>             Project: Mesos
>          Issue Type: Task
>          Components: documentation
>    Affects Versions: 0.27.0
>            Reporter: Erik Weathers
>            Assignee: Erik Weathers
>            Priority: Minor
>              Labels: documentation
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)