Posted to dev@mesos.apache.org by Andrei Budnik <ab...@mesosphere.io> on 2019/06/04 15:05:45 UTC

On adding a debug endpoint for Mesos containerizer

Hi folks,

We have been running into stuck containers for quite a long time. Some of
these issues are caused by external components such as CNI/CSI plugins,
custom Mesos modules, etc. There were also cases when a container became
stuck due to a Linux kernel bug. With so many possible causes, stuck
containers are difficult to debug.

We are proposing a container debug endpoint for the Mesos agent [1], which
is based on a new mechanism for tracking pending libprocess futures [2].

Please review both of them.

[1] Container debug endpoint:
https://docs.google.com/document/d/1VtlKD6b8a22HzSdaJUeI7cPGuKd01vLwBJT4XfkeUDI
[2] Tracking libprocess futures:
https://docs.google.com/document/d/1Unu2pe0dRq3Z6XQ5S8lWZm2cU2REjfkUj0xk2ePQ0MY

Re: On adding a debug endpoint for Mesos containerizer

Posted by Benno Evers <be...@mesosphere.com>.
I agree, this looks pretty nice. Maybe we can elevate it to libprocess
itself if it proves useful.

On Thu, Jun 6, 2019 at 1:07 AM James Peach <jp...@apache.org> wrote:

> I really like this proposal and I think that it would help operational
> teams a lot. Let’s make sure that it is well documented :)
>
> > On Jun 5, 2019, at 1:05 AM, Andrei Budnik <ab...@mesosphere.io> wrote:
> >
> > Hi folks,
> >
> > We have been encountering container stuck issues for quite a long time.
> > Some of these issues are caused by external components such as CNI/CSI
> > plugins, custom Mesos modules, etc. Also, there were cases when a container
> > became stuck due to a Linux kernel bug. All these kinds of issues make it
> > difficult to debug container stuck issues.
> >
> > We are proposing a container debug endpoint for the Mesos agent [1],
> > which is based on a new mechanism for tracking pending libprocess futures
> > [2].
> >
> > Please review both of them.
> >
> > [1] Container debug endpoint:
> > https://docs.google.com/document/d/1VtlKD6b8a22HzSdaJUeI7cPGuKd01vLwBJT4XfkeUDI
> > [2] Tracking libprocess futures:
> > https://docs.google.com/document/d/1Unu2pe0dRq3Z6XQ5S8lWZm2cU2REjfkUj0xk2ePQ0MY
>
>

-- 
Benno Evers
Software Engineer, Mesosphere

Re: On adding a debug endpoint for Mesos containerizer

Posted by James Peach <jp...@apache.org>.
I really like this proposal and I think that it would help operational teams a lot. Let’s make sure that it is well documented :)

> On Jun 5, 2019, at 1:05 AM, Andrei Budnik <ab...@mesosphere.io> wrote:
> 
> Hi folks,
> 
> We have been encountering container stuck issues for quite a long time. Some of these issues are caused by external components such as CNI/CSI plugins, custom Mesos modules, etc. Also, there were cases when a container became stuck due to a Linux kernel bug. All these kinds of issues make it difficult to debug container stuck issues.
> 
> We are proposing a container debug endpoint for the Mesos agent [1], which is based on a new mechanism for tracking pending libprocess futures [2].
> 
> Please review both of them.
> 
> [1] Container debug endpoint: https://docs.google.com/document/d/1VtlKD6b8a22HzSdaJUeI7cPGuKd01vLwBJT4XfkeUDI
> [2] Tracking libprocess futures: https://docs.google.com/document/d/1Unu2pe0dRq3Z6XQ5S8lWZm2cU2REjfkUj0xk2ePQ0MY

