Posted to issues@mesos.apache.org by "Unai P. Mendizabal (JIRA)" <ji...@apache.org> on 2018/01/11 15:52:00 UTC
[jira] [Created] (MESOS-8435) Marathon pods with endpoints just fail
Unai P. Mendizabal created MESOS-8435:
-----------------------------------------
Summary: Marathon pods with endpoints just fail
Key: MESOS-8435
URL: https://issues.apache.org/jira/browse/MESOS-8435
Project: Mesos
Issue Type: Bug
Affects Versions: 1.4.0
Environment: DC/OS 1.10.2, Marathon 1.5.2, Mesos 1.4.0, CentOS 7
Reporter: Unai P. Mendizabal
Attachments: bundle-2018-01-10T14_28_35-395418905.zip
Hi!
I originally posted a ticket on DC/OS's Jira, but I was redirected here because I was told that this might be a Mesos issue. The original ticket is [here|https://jira.mesosphere.com/browse/MARATHON-8010]. My original report is copied below:
"
I'm trying to launch a pod with the following JSON configuration:
{code:java}
{
  "id": "/druid",
  "version": "2018-01-10T09:32:26.109Z",
  "environment": {
    "key": "value"
  },
  "containers": [
    {
      "name": "broker",
      "resources": {
        "cpus": 0.5,
        "mem": 5120,
        "disk": 0
      },
      "image": {
        "kind": "DOCKER",
        "id": "my_image"
      },
      "healthCheck": {
        "http": {
          "scheme": "HTTP",
          "endpoint": "broker",
          "path": "/status"
        },
        "gracePeriodSeconds": 300,
        "intervalSeconds": 60,
        "maxConsecutiveFailures": 3,
        "timeoutSeconds": 20,
        "delaySeconds": 15
      },
      "endpoints": [
        {
          "name": "broker",
          "containerPort": 8082,
          "hostPort": 0,
          "protocol": [
            "tcp"
          ],
          "labels": {
            "VIP_0": "/druid:8082"
          }
        }
      ],
      "volumeMounts": [
        {
          "name": "hdfs",
          "mountPath": "/etc/hadoop/conf"
        }
      ]
    }
  ],
  "volumes": [
    {
      "name": "hdfs",
      "host": "/etc/hadoop/conf"
    }
  ],
  "networks": [
    {
      "name": "dcos",
      "mode": "container"
    }
  ],
  "scaling": {
    "instances": 1,
    "kind": "fixed"
  },
  "scheduling": {
    "placement": {
      "constraints": []
    }
  },
  "executorResources": {
    "cpus": 0.1,
    "mem": 32,
    "disk": 10
  },
  "fetch": []
}
{code}
The pod just fails and no log is generated. If I try to check the logs of an instance, I get the message "Cannot Connect With The Server. You can also join us on our Slack channel or send us an email at help@dcos.io".
For now the pod has only one container, because I'm experimenting with the concept. The same JSON configuration creates a working pod if I just remove the parts about the endpoints, so the problem is not in the image or in any other part of the configuration.
So, is it a bug, an installation problem, a configuration problem...?
"
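The quoted report notes that the pod works once the endpoint sections are removed. For anyone trying to reproduce the two variants side by side, the following is a minimal sketch of how the endpoint-free definition could be derived from the JSON above. The function name `strip_endpoints` is hypothetical, and dropping a health check that points at a removed endpoint is an assumption; the report does not say whether the health check was kept in the working variant.

```python
import copy


def strip_endpoints(pod):
    """Return a copy of a pod definition with all container endpoints removed.

    The input dict is left untouched so both variants can be compared.
    """
    stripped = copy.deepcopy(pod)
    for container in stripped.get("containers", []):
        endpoints = container.pop("endpoints", [])
        removed = {e["name"] for e in endpoints if "name" in e}

        # Assumption: a health check that names a removed endpoint is no
        # longer meaningful, so it is dropped together with the endpoint.
        check = container.get("healthCheck", {})
        for kind in ("http", "tcp"):
            if check.get(kind, {}).get("endpoint") in removed:
                container.pop("healthCheck")
                break
    return stripped
```

As a usage example, `strip_endpoints(json.load(open("pod.json")))` (with `pod.json` being a hypothetical file holding the definition above) returns the reduced definition, which can be written back out with `json.dump` and submitted in place of the original.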
I was asked for a diagnostics bundle, which I have attached to this ticket. The person who answered me found the following errors in the logs:
{noformat}
2018-01-10 09:11:10: W0110 09:11:10.392127 1778 state.cpp:560] Failed to find 'libprocess.pid' or 'http.marker' for container 24c7ea02-c69c-45a2-b453-c89aac73b9bd of executor 'instance-druid.d065fcc4-f546-11e7-a797-3ec4d96a1657' of framework 0c292642-2815-4a1b-9d51-6fcce7fd88ce-0001
{noformat}
{noformat}
2018-01-10 10:45:51: E0110 10:45:51.702527 1774 slave.cpp:5292] Container 'e6cbcf76-a930-47b5-91d5-abe355d6fe76' for executor 'instance-druid.0c821274-f5eb-11e7-b2c6-d2a482b9a500' of framework 0c292642-2815-4a1b-9d51-6fcce7fd88ce-0001 failed to start: Collect failed: Failed to setup hostname and network files: Failed to bring up the loopback interface in the new network namespace of pid 4602: Success
{noformat}
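When going through the agent log in the bundle, warnings and errors like the two above can be picked out by their glog prefix (severity letter, `mmdd`, time, thread id, `file:line]`). This is only a sketch: the function name `find_problems` is hypothetical, and the pattern assumes that any leading timestamp, such as the `2018-01-10 09:11:10:` seen above, is a prefix added by the log collector and can be skipped.

```python
import re

# glog prefix: <severity><mmdd> <hh:mm:ss.uuuuuu> <thread> <file>:<line>] <message>
GLOG_LINE = re.compile(
    r"(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^\s:]+:\d+)\] (?P<message>.*)"
)


def find_problems(lines):
    """Yield (severity, source, message) for warning, error and fatal lines."""
    for line in lines:
        match = GLOG_LINE.search(line)  # search() skips any collector prefix
        if match and match.group("severity") in "WEF":
            yield (
                match.group("severity"),
                match.group("source"),
                match.group("message"),
            )
```

Passing an open log file to `find_problems` yields one tuple per matching line, so the output can be filtered further, for example by `slave.cpp` or by the executor ID.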
I don't know if this issue fits here, but any help will be much appreciated.
Thanks in advance, bye!
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)