Posted to issues@mesos.apache.org by "Dmitry Fedorov (JIRA)" <ji...@apache.org> on 2016/05/24 19:06:12 UTC

[jira] [Commented] (MESOS-5294) Status updates after a health check are incomplete or invalid

    [ https://issues.apache.org/jira/browse/MESOS-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298731#comment-15298731 ] 

Dmitry Fedorov commented on MESOS-5294:
---------------------------------------

I've hit the same issue. It occurs when using custom networking in Docker, as [~thegner] described.

I've managed to "fix" it by duplicating

{code}
inspect = docker->inspect(containerName, DOCKER_INSPECT_DELAY)
  .then(defer(self(), [=](const Docker::Container& container) {
    if (!killed) {
      TaskStatus status;
      status.mutable_task_id()->CopyFrom(taskId.get());
      status.set_state(TASK_RUNNING);
      status.set_data(container.output);
      if (container.ipAddress.isSome()) {
        // TODO(karya): Deprecated -- Remove after 0.25.0 has shipped.
        Label* label = status.mutable_labels()->add_labels();
        label->set_key("Docker.NetworkSettings.IPAddress");
        label->set_value(container.ipAddress.get());

        NetworkInfo* networkInfo =
          status.mutable_container_status()->add_network_infos();

        // TODO(CD): Deprecated -- Remove after 0.27.0.
        networkInfo->set_ip_address(container.ipAddress.get());

        NetworkInfo::IPAddress* ipAddress =
          networkInfo->add_ip_addresses();
        ipAddress->set_ip_address(container.ipAddress.get());
      }
{code}
from the `launchTask` method in src/docker/executor.cpp into the `taskHealthUpdated` method.
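
Instead of copying the block verbatim, the label and network-info part could probably be factored into a helper that both `launchTask` and `taskHealthUpdated` call. The following is only a rough sketch of that idea, not the actual Mesos code: `TaskStatus`, `NetworkInfo` and `Container` are simplified stand-in structs (the real ones are protobuf messages and `Docker::Container`), `std::optional` stands in for Mesos' `Option`, and `fillNetworkInfo` is a hypothetical name.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for the Mesos types (assumptions for illustration
// only; the real types use protobuf mutable_*/add_* accessors).
struct NetworkInfo {
  std::vector<std::string> ipAddresses;
};

struct TaskStatus {
  std::vector<std::pair<std::string, std::string>> labels;
  std::vector<NetworkInfo> networkInfos;
};

struct Container {
  std::optional<std::string> ipAddress;
};

// Hypothetical helper: copies the container IP into the status as both the
// "Docker.NetworkSettings.IPAddress" label and a NetworkInfo entry.
// Does nothing when the container has no IP address.
void fillNetworkInfo(const Container& container, TaskStatus* status) {
  assert(status != nullptr);

  if (!container.ipAddress.has_value()) {
    return;
  }

  status->labels.emplace_back(
      "Docker.NetworkSettings.IPAddress", *container.ipAddress);

  NetworkInfo networkInfo;
  networkInfo.ipAddresses.push_back(*container.ipAddress);
  status->networkInfos.push_back(networkInfo);
}
```

With a helper along these lines, the status built in `taskHealthUpdated` would carry the same IP information as the initial TASK_RUNNING update, without keeping two copies of the block in sync.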

> Status updates after a health check are incomplete or invalid
> -------------------------------------------------------------
>
>                 Key: MESOS-5294
>                 URL: https://issues.apache.org/jira/browse/MESOS-5294
>             Project: Mesos
>          Issue Type: Bug
>         Environment: mesos 0.28.0, docker 1.11, marathon 0.15.3, mesos-dns, ubuntu 14.04
>            Reporter: Travis Hegner
>            Assignee: Travis Hegner
>
> With command health checks enabled via marathon, mesos-dns will resolve the task correctly until the task is reported as "healthy". At that point, mesos-dns stops resolving the task correctly.
> -Digging through src/docker/executor.cpp, I found that the {{taskHealthUpdated()}} function is attempting to copy the taskID to the new status instance with-
> {code}status.mutable_task_id()->CopyFrom(taskID);{code}
> -but other instances of status updates have a similar line-
> {code}status.mutable_task_id()->CopyFrom(taskID.get());{code}
> -My assumption is that this difference is causing the status update after a health check to not have a proper taskID, which in turn is causing an incorrect state.json output.-
> -I'll try to get a patch together soon.-
> UPDATE:
> None of the above assumptions are correct. Something else is causing the issue.


