Posted to issues@mesos.apache.org by "Unai P. Mendizabal (JIRA)" <ji...@apache.org> on 2018/01/11 15:52:00 UTC
[jira] [Created] (MESOS-8435) Marathon pods with endpoints just fail

Unai P. Mendizabal created MESOS-8435:
-----------------------------------------

             Summary: Marathon pods with endpoints just fail
                 Key: MESOS-8435
                 URL: https://issues.apache.org/jira/browse/MESOS-8435
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.4.0
         Environment: DC/OS 1.10.2, Marathon 1.5.2, Mesos 1.4.0, CentOS 7
            Reporter: Unai P. Mendizabal
         Attachments: bundle-2018-01-10T14_28_35-395418905.zip

Hi!

I originally filed this ticket in the DC/OS Jira, but I was redirected here because I was told that this might be a Mesos issue. The original ticket is [here|https://jira.mesosphere.com/browse/MARATHON-8010]. My original report is copied below:

"

I'm trying to launch a pod with the following JSON configuration:

 


{code:java}
{
  "id": "/druid",
  "version": "2018-01-10T09:32:26.109Z",
  "environment": {
    "key": "value"
  },
  "containers": [
    {
      "name": "broker",
      "resources": {
        "cpus": 0.5,
        "mem": 5120,
        "disk": 0
      },
      "image": {
        "kind": "DOCKER",
        "id": "my_image"
      },
      "healthCheck": {
        "http": {
          "scheme": "HTTP",
          "endpoint": "broker",
          "path": "/status"
        },
        "gracePeriodSeconds": 300,
        "intervalSeconds": 60,
        "maxConsecutiveFailures": 3,
        "timeoutSeconds": 20,
        "delaySeconds": 15
      },
      "endpoints": [
        {
          "name": "broker",
          "containerPort": 8082,
          "hostPort": 0,
          "protocol": [
            "tcp"
          ],
          "labels": {
            "VIP_0": "/druid:8082"
          }
        }
      ],
      "volumeMounts": [
        {
          "name": "hdfs",
          "mountPath": "/etc/hadoop/conf"
        }
      ]
    }
  ],
  "volumes": [
    {
      "name": "hdfs",
      "host": "/etc/hadoop/conf"
    }
  ],
  "networks": [
    {
      "name": "dcos",
      "mode": "container"
    }
  ],
  "scaling": {
    "instances": 1,
    "kind": "fixed"
  },
  "scheduling": {
    "placement": {
      "constraints": []
    }
  },
  "executorResources": {
    "cpus": 0.1,
    "mem": 32,
    "disk": 10
  },
  "fetch": []
}
{code}


The pod just fails and no log is generated. If I try to check the logs of an instance, I get the message "Cannot Connect With The Server. You can also join us on our Slack channel or send us an email at help@dcos.io".

For now, the pod has only one container because I'm still experimenting with the concept. The same JSON configuration creates a working pod if I just remove the parts about the endpoints, so the problem is not the image or any other part of the configuration.

So, is it a bug, an installation problem, a configuration problem...?

"
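For anyone trying to reproduce the comparison described in the quoted report (the pod works once the endpoint parts are removed), here is a minimal Python sketch that derives the endpoint-free variant from a pod definition. The function name `strip_endpoints` is hypothetical. The sketch also assumes that the HTTP health check has to be dropped along with the endpoints, because it refers to an endpoint by name; the report does not say whether the health check was removed in the working configuration.

```python
import copy


def strip_endpoints(pod):
    """Return a copy of a Marathon pod definition without container endpoints.

    Hypothetical helper for A/B testing: submit the original definition and
    the stripped one, then compare which of them starts.
    """
    stripped = copy.deepcopy(pod)  # leave the caller's definition untouched
    for container in stripped.get("containers", []):
        container.pop("endpoints", None)
        # Assumption: a health check that names an endpoint cannot stay once
        # that endpoint is gone, so it is removed as well.
        health_check = container.get("healthCheck", {})
        if "endpoint" in health_check.get("http", {}):
            container.pop("healthCheck", None)
    return stripped
```

Loading the JSON above with `json.load`, passing it through `strip_endpoints`, and submitting both versions should narrow the failure down to the endpoint and network setup rather than the image or volumes.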

I was asked for a diagnostics bundle, which I have attached to this ticket. The person who answered me found the following errors in the logs:

{noformat}
2018-01-10 09:11:10: W0110 09:11:10.392127 1778 state.cpp:560] Failed to find 'libprocess.pid' or 'http.marker' for container 24c7ea02-c69c-45a2-b453-c89aac73b9bd of executor 'instance-druid.d065fcc4-f546-11e7-a797-3ec4d96a1657' of framework 0c292642-2815-4a1b-9d51-6fcce7fd88ce-0001
{noformat}

{noformat}
2018-01-10 10:45:51: E0110 10:45:51.702527 1774 slave.cpp:5292] Container 'e6cbcf76-a930-47b5-91d5-abe355d6fe76' for executor 'instance-druid.0c821274-f5eb-11e7-b2c6-d2a482b9a500' of framework 0c292642-2815-4a1b-9d51-6fcce7fd88ce-0001 failed to start: Collect failed: Failed to setup hostname and network files: Failed to bring up the loopback interface in the new network namespace of pid 4602: Success
{noformat}
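When searching the rest of the diagnostics bundle for related entries, it can help to split lines like the two above into their fields. The following is a minimal Python sketch, assuming the lines follow the usual glog layout (severity letter, month and day, time, thread id, source file and line, message) after the timestamp prefix added by the bundle. The names `parse_agent_line`, `GLOG`, and `CONTAINER` are hypothetical.

```python
import re

# glog layout: Lmmdd hh:mm:ss.uuuuuu threadid file:line] message
GLOG = re.compile(
    r"(?P<severity>[IWEF])(?P<month_day>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[\w.]+):(?P<line>\d+)\] (?P<message>.*)"
)
# Container ids in the messages above are UUIDs, with or without quotes.
CONTAINER = re.compile(
    r"[Cc]ontainer '?(?P<id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})'?"
)


def parse_agent_line(text):
    """Split one agent log line into its glog fields plus the container id.

    Returns None when the line does not look like a glog entry.
    """
    match = GLOG.search(text)  # search, not match: skips the bundle's prefix
    if match is None:
        return None
    record = match.groupdict()
    container = CONTAINER.search(record["message"])
    record["container_id"] = container.group("id") if container else None
    return record
```

Filtering the bundle for every line whose `container_id` equals `e6cbcf76-a930-47b5-91d5-abe355d6fe76` would show what the agent logged for that container before the loopback failure.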

I don't know if this issue fits here, but any help will be much appreciated.

Thanks in advance, bye!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)