Posted to issues@mesos.apache.org by "haosdent (JIRA)" <ji...@apache.org> on 2016/07/04 05:45:10 UTC
[jira] [Commented] (MESOS-5294) Status updates after a health check
are incomplete or invalid
[ https://issues.apache.org/jira/browse/MESOS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360864#comment-15360864 ]
haosdent commented on MESOS-5294:
---------------------------------
A bit weird here: the {{TaskStatus}} sent by the executor should not contain {{container_status}}.
{code}
TaskStatus status;
status.mutable_task_id()->CopyFrom(taskID);
status.set_healthy(healthy);
status.set_state(TASK_RUNNING);
driver.get()->sendStatusUpdate(status);
{code}
[~travis.hegner] Are you still working on this ticket? Would you mind changing the assignee to me?
> Status updates after a health check are incomplete or invalid
> -------------------------------------------------------------
>
> Key: MESOS-5294
> URL: https://issues.apache.org/jira/browse/MESOS-5294
> Project: Mesos
> Issue Type: Bug
> Environment: mesos 0.28.0, docker 1.11, marathon 0.15.3, mesos-dns, ubuntu 14.04
> Reporter: Travis Hegner
> Assignee: Travis Hegner
>
> With command health checks enabled via marathon, mesos-dns will resolve the task correctly until the task is reported as "healthy". At that point, mesos-dns stops resolving the task correctly.
> -Digging through src/docker/executor.cpp, I found that the {{taskHealthUpdated()}} function is attempting to copy the taskID to the new status instance with-
> {code}status.mutable_task_id()->CopyFrom(taskID);{code}
> -but other instances of status updates have a similar line-
> {code}status.mutable_task_id()->CopyFrom(taskID.get());{code}
> -My assumption is that this difference is causing the status update after a health check to not have a proper taskID, which in turn is causing an incorrect state.json output.-
> -I'll try to get a patch together soon.-
> UPDATE:
> None of the above assumptions are correct. Something else is causing the issue.