Posted to user@mesos.apache.org by Xiaodong Zhang <xd...@alauda.io> on 2017/02/20 17:13:30 UTC
Re: Running mesos-slave in the docker that leave many zombie process
Hi guys.
I tried the fix for zombie containers described in this thread. It works when I restart mesos-slave: no zombie containers appear. But it only helps when restarting mesos-slave.
If I restart the executor, the executor quits, and the container that the executor started becomes a zombie container.
Any ideas about this? My Mesos version is 0.28.
Here are some screenshots:
1. Start a container.
[screenshot]
2. Restart mesos-slave. Everything is OK.
[screenshot]
3. Kill the executor container. A zombie container appears.
[screenshot]
How can I fix this?
Thanks,
Xiaodong
From: tommy xiao <xi...@gmail.com>
Reply-To: "user@mesos.apache.org" <us...@mesos.apache.org>
Date: Tuesday, November 22, 2016, 12:32 AM
To: user <us...@mesos.apache.org>
Subject: Re: Running mesos-slave in the docker that leave many zombie process
You need `--pid=host`.
2016-11-21 15:01 GMT+08:00 X Brick <ng...@gmail.com>:
Thanks @haosdent, let me try it.
2016-11-21 14:33 GMT+08:00 haosdent <ha...@gmail.com>:
Passing the `--pid=host` flag when starting the Docker container may resolve this.
> start the mesos_slave container with "--pid=host" so that it uses the process namespace of the host.
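As a rough illustration of the suggestion above, the agent container could be started along these lines. This is a sketch under assumptions, not a tested deployment: the image name and master address in the comments are placeholders, and the `DOCKER` variable is only a hypothetical hook so the command can be printed with `DOCKER=echo` instead of executed. `--pid=host` and the Docker socket bind mount are standard `docker run` options; `MESOS_MASTER` and `MESOS_CONTAINERIZERS` rely on Mesos reading its flags from `MESOS_`-prefixed environment variables.

```shell
# Start the Mesos agent in a container that shares the host's PID namespace.
# DOCKER can be overridden (for example DOCKER=echo) for a dry run.
DOCKER=${DOCKER:-docker}

start_mesos_slave() {
    image=$1    # placeholder, e.g. example/mesos-slave:0.28
    master=$2   # placeholder, e.g. zk://zk.example.com:2181/mesos

    "$DOCKER" run -d \
        --pid=host \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -e MESOS_MASTER="$master" \
        -e MESOS_CONTAINERIZERS=docker,mesos \
        "$image"
}

# Dry run that prints the arguments instead of starting a container:
#   DOCKER=echo start_mesos_slave example/mesos-slave:0.28 zk://zk.example.com:2181/mesos
```

With `--pid=host` the agent no longer runs as PID 1 of its own PID namespace, which is presumably why the flag helps with unreaped child processes.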
On Mon, Nov 21, 2016 at 2:30 PM, haosdent <ha...@gmail.com> wrote:
Not sure if it is related to this issue: https://github.com/mesosphere/docker-containers/issues/9
On Mon, Nov 21, 2016 at 12:27 PM, X Brick <ng...@gmail.com> wrote:
Hi,
I ran into a problem when running mesos-slave in Docker. It leaves zombie processes like these:
```
root 10547 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
root 14505 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
root 16069 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
root 19962 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
root 23346 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
root 24544 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
```
I found that the zombies come from the mesos-slave process:
```
pstree -p -s 10547
systemd(1)───docker-containe(19448)───mesos-slave(19464)───docker(10547)
```
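To see at a glance which parent is accumulating zombies, a listing like the `ps` output above can be grouped by parent PID. A minimal sketch, assuming `ps -ef` style columns (UID, PID, PPID, ...) with `<defunct>` as the last field; the function name `zombies_by_parent` is made up for illustration:

```shell
# Count defunct (zombie) processes per parent PID.
# Reads `ps -ef` style lines on stdin; column 3 is the PPID.
# Matching on the last field avoids counting the awk process itself.
zombies_by_parent() {
    awk '$NF == "<defunct>" { n[$3]++ } END { for (p in n) print p, n[p] }'
}

# Usage on a live system:
#   ps -ef | zombies_by_parent
```

Fed the six `[docker] <defunct>` lines shown earlier, this prints `19464 6`, which points at the mesos-slave process.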
The logs were deleted by a cron job a few weeks ago, but I remember seeing many `Failed to shutdown socket with fd xx: Transport endpoint is not connected` messages in them.
I reported this in JIRA: https://issues.apache.org/jira/browse/MESOS-6615
Has anyone seen this issue before?
--
Best Regards,
Haosdent Huang
--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
Re: Running mesos-slave in the docker that leave many zombie process
Posted by haosdent <ha...@gmail.com>.
Sorry for the typo.
s/those containers they don't wait/those containers it could not recover/.
On Fri, Feb 24, 2017 at 12:52 AM, haosdent <ha...@gmail.com> wrote:
> Hi @xiaodong,
>
> > If I restart mesos-slave, the zombie container is gone.
>
> Yep, during the recovery stage, mesos-agent removes those containers
> they don't wait
>
> > Remove executor
>
> What do you mean by "remove executor"? Do you mean killing
> `mesos-docker-executor`?
> If so, this is expected behavior: the Docker container is reaped by
> mesos-docker-executor, so if you kill the executor, the container is
> left behind.
> The Mesos agent isn't aware of the status of the Docker container (the
> status is checked by mesos-docker-executor, which sends task status
> updates to the Mesos agent).
>
>
> On Thu, Feb 23, 2017 at 10:44 AM, Xiaodong Zhang <xd...@alauda.io>
> wrote:
>
>> What can I do about this? Do you need more info?
>>
>> From: Xiaodong Zhang <xd...@alauda.io>
>> Reply-To: "user@mesos.apache.org" <us...@mesos.apache.org>
>> Date: Tuesday, February 21, 2017, 6:18 PM
>> To: "user@mesos.apache.org" <us...@mesos.apache.org>
>> Subject: Re: Running mesos-slave in the docker that leave many zombie process
>>
>> Hi @Haosdent, thanks for your reply.
>> I tried 1.0.3 and 1.1.0. Both have the same problem.
>>
>> 1. Create a container.
>> 2. Restart the container. Works well.
>> 3. Remove the executor.
>>
>> If I restart mesos-slave, the zombie container is gone.
>>
>> Any thoughts?
>>
>> Thanks,
>> Xiaodong
>>
>> From: haosdent <ha...@gmail.com>
>> Reply-To: "user@mesos.apache.org" <us...@mesos.apache.org>
>> Date: Tuesday, February 21, 2017, 1:19 AM
>> To: user <us...@mesos.apache.org>
>> Subject: Re: Running mesos-slave in the docker that leave many zombie process
>>
>> Hi @xiaodong, could you check whether this problem still exists after 1.0? I
>> remember that Mesos changed the recovery of Docker containers to avoid this
>> after 1.0.
--
Best Regards,
Haosdent Huang
Re: Running mesos-slave in the docker that leave many zombie process
Posted by haosdent <ha...@gmail.com>.
> If mesos-docker-executor exits by itself, what does mesos-slave do?
I just tried it (killing the executor), and I got a TASK_LOST.
On Fri, Feb 24, 2017 at 10:27 AM, haosdent <ha...@gmail.com> wrote:
> > Mesos still can't prevent zombie containers from occurring, right?
>
> I don't think so. Usually mesos-docker-executor does not exit while the
> corresponding container is running, unless you kill it. If it exits
> unexpectedly, that would be a bug.
>
> On Fri, Feb 24, 2017 at 10:17 AM, Xiaodong Zhang <xd...@alauda.io>
> wrote:
>
>> Hi Haosdent,
>>
>> Thanks for your reply.
>>
>> What do you mean by "remove executor"? Do you mean killing
>> `mesos-docker-executor`?
>> If so, this is expected behavior: the Docker container is reaped by
>> mesos-docker-executor, so if you kill the executor, the container is
>> left behind.
>> The Mesos agent isn't aware of the status of the Docker container (the
>> status is checked by mesos-docker-executor, which sends task status
>> updates to the Mesos agent).
>>
>> If this is expected behavior, Mesos still can't prevent zombie
>> containers from occurring, right? If mesos-docker-executor exits by
>> itself, what does mesos-slave do?
--
Best Regards,
Haosdent Huang
Re: Running mesos-slave in the docker that leave many zombie process
Posted by Xiaodong Zhang <xd...@alauda.io>.
Thank you for your quick response, @Haosdent!
This clears up my confusion. I will do more tests.
Best regards,
Xiaodong
Re: Running mesos-slave in the docker that leave many zombie process
Posted by haosdent <ha...@gmail.com>.
> Mesos still can't prevent zombie containers from occurring, right?
I don't think so. Usually mesos-docker-executor does not exit while the
corresponding container is running, unless you kill it. If it exits
unexpectedly, that would be a bug.
--
Best Regards,
Haosdent Huang
Re: Running mesos-slave in the docker that leave many zombie process
Posted by Xiaodong Zhang <xd...@alauda.io>.
Hi Haosdent,
Thanks for your reply.
What do you mean by "remove executor"? Do you mean killing `mesos-docker-executor`?
If so, this is expected behavior: the Docker container is reaped by mesos-docker-executor, so if you kill the executor, the container is left behind.
The Mesos agent isn't aware of the status of the Docker container (the status is checked by mesos-docker-executor, which sends task status updates to the Mesos agent).
If this is expected behavior, Mesos still can't prevent zombie containers from occurring, right? If mesos-docker-executor exits by itself, what does mesos-slave do?
Re: Running mesos-slave in the docker that leave many zombie process
Posted by haosdent <ha...@gmail.com>.
Hi, @xiaodong
>If I restart mesos-slave. Then the zombie container are gone.
Yep, during recovering stage, mesos-agent would remove those containers
they don't wait
> Remove executor
What do you mean by "remove executor"? Do you mean killing
`mesos-docker-executor`?
If so, this is expected behavior: the docker container is reaped by
mesos-docker-executor, so if you kill the executor, nothing is left to
reap the container.
The Mesos agent isn't aware of the status of the docker container
(mesos-docker-executor checks the container's status and sends task
status updates to the Mesos agent).
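For readers unfamiliar with the term, "reaping" here is the ordinary Unix mechanism: an exited child stays in the process table as a zombie until its parent waits on it. The sketch below is a generic Linux illustration of that mechanism, not Mesos code; the function names are invented for the example, and it relies on `/proc` being available.

```python
import os


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field follows the parenthesised command name.
    return data[data.rindex(")") + 1:].split()[0]


def zombie_until_reaped():
    """Fork a child, observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately

    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    state_before = proc_state(pid)  # exited but not yet waited on

    os.waitpid(pid, 0)  # reap: collect the exit status
    still_listed = os.path.exists(f"/proc/{pid}")
    return state_before, still_listed


if __name__ == "__main__":
    state, listed = zombie_until_reaped()
    print(f"state before wait: {state}, in /proc after wait: {listed}")
```

If the parent never calls a wait function, or is killed in a way that leaves nobody responsible for the child, the entry lingers, which is the situation described in this thread.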
On Thu, Feb 23, 2017 at 10:44 AM, Xiaodong Zhang <xd...@alauda.io> wrote:
> What can I do about this? Do you guys need more info?
>
> From: Xiaodong Zhang <xd...@alauda.io>
> Reply-To: "user@mesos.apache.org" <us...@mesos.apache.org>
> Date: Tuesday, February 21, 2017, 6:18 PM
> To: "user@mesos.apache.org" <us...@mesos.apache.org>
>
> Subject: Re: Running mesos-slave in the docker that leave many zombie process
>
> Hi @Haosdent, thanks for your reply.
> I tried 1.0.3 and 1.1.0. Both have the same problem.
>
> 1. Create container.
>
>
> 2. Restart container. Works well.
>
> 3. Remove executor
>
>
> If I restart mesos-slave, the zombie container is gone.
>
> Any thoughts?
>
> Thanks,
> Xiaodong
>
> From: haosdent <ha...@gmail.com>
> Reply-To: "user@mesos.apache.org" <us...@mesos.apache.org>
> Date: Tuesday, February 21, 2017, 1:19 AM
> To: user <us...@mesos.apache.org>
> Subject: Re: Running mesos-slave in the docker that leave many zombie process
>
> Hi @xiaodong, could you check whether this problem still exists after 1.0? I
> remember Mesos changed the recovery for docker containers to avoid this
> after 1.0.
>
--
Best Regards,
Haosdent Huang
Re: Running mesos-slave in the docker that leave many zombie process
Posted by Xiaodong Zhang <xd...@alauda.io>.
What can I do about this? Do you guys need more info?
Re: Running mesos-slave in the docker that leave many zombie process
Posted by Xiaodong Zhang <xd...@alauda.io>.
Hi guys,
Regarding this issue: what can I do, or is there any other info I can offer?
Thanks,
Xiaodong
Re: Running mesos-slave in the docker that leave many zombie process
Posted by Xiaodong Zhang <xd...@alauda.io>.
Hi @Haosdent, thanks for your reply.
I tried 1.0.3 and 1.1.0. Both have the same problem.
1. Create container.
[cid:89116669-04F7-4B75-A1C5-9067FC6CE59D]
2. Restart container. Works well.
3. Remove executor
[cid:2F546391-5FD8-4676-ADE3-35F29B8F7326]
If I restart mesos-slave, the zombie container is gone.
Any thoughts?
Thanks,
Xiaodong
Re: Running mesos-slave in the docker that leave many zombie process
Posted by haosdent <ha...@gmail.com>.
Hi @xiaodong, could you check whether this problem still exists after 1.0? I
remember Mesos changed the recovery for docker containers to avoid this
after 1.0.
On Tue, Feb 21, 2017 at 1:13 AM, Xiaodong Zhang <xd...@alauda.io> wrote:
> Hi guys.
>
> I try to fix zombie container as this email. It works well when I restart
> mesos-slave. No zombie containers occur. But this just works on
> restarting mesos-slave.
>
> If I restart the executor, the executor will quit, and the container which
> executor start, will be a zombie container.
>
> Any idea about this? My mesos version is 0.28.
>
> Here is some pic:
>
>
> 1. Start a container.
>
>
> 2. Restart mesos-slave. Everything is ok.
>
> 3. Kill the executor container, zombie container occur.
>
>
> How can I fix this?
>
> Thanks,
> Xiaodong
>
> 发件人: tommy xiao <xi...@gmail.com>
> 答复: "user@mesos.apache.org" <us...@mesos.apache.org>
> 日期: 2016年11月22日 星期二 上午12:32
> 至: user <us...@mesos.apache.org>
> 主题: Re: Running mesos-slave in the docker that leave many zombie process
>
> you need it --pid=host
>
> 2016-11-21 15:01 GMT+08:00 X Brick <ng...@gmail.com>:
>
>> Thanks @haosdent, let me try it.
>>
>> 2016-11-21 14:33 GMT+08:00 haosdent <ha...@gmail.com>:
>>
>>> Pass the `--pid=host` flag when starting the docker container may
>>> resolve this.
>>> >start the mesos_slave container with "--pid=host" so that it uses the
>>> process namespace of the host.
>>>
>>> On Mon, Nov 21, 2016 at 2:30 PM, haosdent <ha...@gmail.com> wrote:
>>>
>>>> No sure if it related to this issue https://github.com/mesos
>>>> phere/docker-containers/issues/9
>>>>
>>>> On Mon, Nov 21, 2016 at 12:27 PM, X Brick <ng...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I meet a problem when running mesos-slave in the docker. Here are some
>>>>> zombie process in this way.
>>>>>
>>>>> ```
>>>>> root 10547 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
>>>>> root 14505 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
>>>>> root 16069 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
>>>>> root 19962 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
>>>>> root 23346 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
>>>>> root 24544 19464 0 Oct25 ? 00:00:00 [docker] <defunct>
>>>>> ```
>>>>>
>>>>> And I find the zombies come from mesos-slave process:
>>>>>
>>>>> ```
>>>>> pstree -p -s 10547
>>>>> systemd(1)───docker-containe(19448)───mesos-slave(19464)───docker(10547)
>>>>> ```
>>>>>
>>>>> The logs has been deleted by the cron job a few weeks ago, but I
>>>>> remember so many `Failed to shutdown socket with fd xx: Transport
>>>>> endpoint is not connected` in the log.
>>>>>
>>>>> I report this to the JIRA: https://issues.apache.org/jira
>>>>> /browse/MESOS-6615
>>>>>
>>>>> Is there anyone saw this issue before ?
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Haosdent Huang
>>>
>>
>>
>
>
> --
> Deshi Xiao
> Twitter: xds2000
> E-mail: xiaods(AT)gmail.com
>
--
Best Regards,
Haosdent Huang