You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Ankur Goenka <go...@google.com> on 2018/10/03 21:18:29 UTC

Add cleanup flag to DockerPayload

Hi,

In portable flink runner, SDK Harness docker containers are created
dynamically and are not garbage collected. SDK Harness container pull the
staging artifact, generate logs and tmp files which is stored as an
additional layer on top of image.
These dead container layers accumulates over time and make the machine go
OOM. To avoid this situation and keep the flexibility of letting containers
exist for debugging, I am planning to add cleanup flag to DockerPayload.
When set, this flag will tell runner to remove the container after killing.
Current DockerPayload:

// The payload of a Docker image
message DockerPayload {
  string container_image = 1;  // implicitly linux_amd64.
}


Proposed DockerPayload:

// The payload of a Docker image
message DockerPayload {
  string container_image = 1;  // implicitly linux_amd64.
  bool cleanup = 2; // Flag to signal container deletion after killing.
}


Let me know your thoughts and if there is a better way to do this.

Thanks,
Ankur

Re: Add cleanup flag to DockerPayload

Posted by Maximilian Michels <mx...@apache.org>.
+1 for a cleanup flag. If unclear to anyone, this is about the disk 
space, not about stopping containers after the run.

Containers should always be cleaned up unless the flag is specified.

-Max

On 04.10.18 01:28, Ankur Goenka wrote:
> Seems reasonable. A pipeline option should suffice for this purpose.
> 
> On Wed, Oct 3, 2018 at 3:02 PM Henning Rohde <herohde@google.com 
> <ma...@google.com>> wrote:
> 
>     IMO it's the runner's responsibility to do container garbage
>     collection and disk space management. This flag seems like a
>     implementation-specific option that would not only to some
>     runner/deployment combinations, so it doesn't seem to belong in that
>     proto. Dataflow would not be able to honor such a flag, for example.
>     Perhaps Flink could start all containers with --rm and have a
>     Flink-specific debug option to not do that?
> 
>     Henning
> 
>     On Wed, Oct 3, 2018 at 2:19 PM Ankur Goenka <goenka@google.com
>     <ma...@google.com>> wrote:
> 
>         Hi,
> 
>         In portable flink runner, SDK Harness docker containers are
>         created dynamically and are not garbage collected. SDK Harness
>         container pull the staging artifact, generate logs and tmp files
>         which is stored as an additional layer on top of image.
>         These dead container layers accumulates over time and make the
>         machine go OOM. To avoid this situation and keep the flexibility
>         of letting containers exist for debugging, I am planning to add
>         cleanup flag to DockerPayload.
>         When set, this flag will tell runner to remove the container
>         after killing.
>         Current DockerPayload:
> 
>         // The payload of a Docker image
>         message DockerPayload {
>            string container_image =1;// implicitly linux_amd64.
>         }
> 
> 
>         Proposed DockerPayload:
> 
>         // The payload of a Docker image
>         message DockerPayload {
>            string container_image =1;// implicitly linux_amd64.
>         bool cleanup =2;// Flag to signal container deletion after killing.
>         }
> 
> 
>         Let me know your thoughts and if there is a better way to do this.
> 
>         Thanks,
>         Ankur
> 

Re: Add cleanup flag to DockerPayload

Posted by Ankur Goenka <go...@google.com>.
Seems reasonable. A pipeline option should suffice for this purpose.

On Wed, Oct 3, 2018 at 3:02 PM Henning Rohde <he...@google.com> wrote:

> IMO it's the runner's responsibility to do container garbage collection
> and disk space management. This flag seems like a implementation-specific
> option that would not only to some runner/deployment combinations, so it
> doesn't seem to belong in that proto. Dataflow would not be able to honor
> such a flag, for example. Perhaps Flink could start all containers with
> --rm and have a Flink-specific debug option to not do that?
>
> Henning
>
> On Wed, Oct 3, 2018 at 2:19 PM Ankur Goenka <go...@google.com> wrote:
>
>> Hi,
>>
>> In portable flink runner, SDK Harness docker containers are created
>> dynamically and are not garbage collected. SDK Harness container pull the
>> staging artifact, generate logs and tmp files which is stored as an
>> additional layer on top of image.
>> These dead container layers accumulates over time and make the machine go
>> OOM. To avoid this situation and keep the flexibility of letting containers
>> exist for debugging, I am planning to add cleanup flag to DockerPayload.
>> When set, this flag will tell runner to remove the container after
>> killing.
>> Current DockerPayload:
>>
>> // The payload of a Docker image
>> message DockerPayload {
>>   string container_image = 1;  // implicitly linux_amd64.
>> }
>>
>>
>> Proposed DockerPayload:
>>
>> // The payload of a Docker image
>> message DockerPayload {
>>   string container_image = 1;  // implicitly linux_amd64.
>>   bool cleanup = 2; // Flag to signal container deletion after killing.
>> }
>>
>>
>> Let me know your thoughts and if there is a better way to do this.
>>
>> Thanks,
>> Ankur
>>
>

Re: Add cleanup flag to DockerPayload

Posted by Henning Rohde <he...@google.com>.
IMO it's the runner's responsibility to do container garbage collection and
disk space management. This flag seems like a implementation-specific
option that would not only to some runner/deployment combinations, so it
doesn't seem to belong in that proto. Dataflow would not be able to honor
such a flag, for example. Perhaps Flink could start all containers with
--rm and have a Flink-specific debug option to not do that?

Henning

On Wed, Oct 3, 2018 at 2:19 PM Ankur Goenka <go...@google.com> wrote:

> Hi,
>
> In portable flink runner, SDK Harness docker containers are created
> dynamically and are not garbage collected. SDK Harness container pull the
> staging artifact, generate logs and tmp files which is stored as an
> additional layer on top of image.
> These dead container layers accumulates over time and make the machine go
> OOM. To avoid this situation and keep the flexibility of letting containers
> exist for debugging, I am planning to add cleanup flag to DockerPayload.
> When set, this flag will tell runner to remove the container after killing.
> Current DockerPayload:
>
> // The payload of a Docker image
> message DockerPayload {
>   string container_image = 1;  // implicitly linux_amd64.
> }
>
>
> Proposed DockerPayload:
>
> // The payload of a Docker image
> message DockerPayload {
>   string container_image = 1;  // implicitly linux_amd64.
>   bool cleanup = 2; // Flag to signal container deletion after killing.
> }
>
>
> Let me know your thoughts and if there is a better way to do this.
>
> Thanks,
> Ankur
>