Posted to user@flink.apache.org by Emilien Kenler <em...@cryptact.com> on 2021/02/01 01:45:37 UTC

Re: Configuring ephemeral storage limits when using Native Kubernetes

Hello,

I think this would solve our problem.
We are also looking at supporting affinity rules, and a pod template would cover those as well.

I'm going to try to find some time this week to try your patch.

Thanks
________________________________
From: Yang Wang <da...@gmail.com>
Sent: Friday, January 29, 2021 6:20 PM
To: Emilien Kenler <em...@cryptact.com>
Cc: user@flink.apache.org <us...@flink.apache.org>
Subject: Re: Configuring ephemeral storage limits when using Native Kubernetes

Hi Emilien,

Thanks for trying the native Flink integration.

Unfortunately, Flink does not yet support setting the ephemeral storage limit. I think it could
be supported via a pod template [1]. I am still working on this ticket and already have a draft PR [2].
I believe it could land in release 1.13 and be backported to 1.12 if necessary.

Once FLINK-15656 is merged, you could use a pod template like the following to set the ephemeral storage limit.

apiVersion: v1
kind: Pod
metadata:
  name: pod-template
spec:
  initContainers:
  - name: artifacts-fetcher
    image: reg.docker.alibaba-inc.com/k8s-yiqi/artifact-fetcher:latest
    imagePullPolicy: Always
    # Use wget or other tools to get user jars from remote storage
    command: ['wget', 'http://path/of/your.jar', '-O', '/flink-artifact/myjob.jar']
    volumeMounts:
    - mountPath: /flink-artifact
      name: flink-artifact
  containers:
    # Do not change the main container name
  - name: flink-job-manager
    volumeMounts:
    - mountPath: /opt/flink/usrlib
      name: flink-artifact
    - mountPath: /opt/flink/log
      name: flink-logs
  volumes:
  - name: flink-artifact
    emptyDir:
      sizeLimit: "1Gi"
  - name: flink-logs
    emptyDir:
      sizeLimit: "1Gi"
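
The template above caps the emptyDir volumes. For a container-level request and limit, Kubernetes also accepts ephemeral-storage under resources. The fragment below is only a sketch: the sizes are made-up values, the container name is copied from the template above, and whether Flink preserves these fields from a pod template is an assumption to verify once FLINK-15656 is merged.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-template
spec:
  containers:
    # Container name copied from the template above
  - name: flink-job-manager
    resources:
      requests:
        # Hypothetical value: local storage the scheduler reserves for this container
        ephemeral-storage: "1Gi"
      limits:
        # Hypothetical value: usage above this makes the pod a candidate for eviction
        ephemeral-storage: "2Gi"
```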


[1]. https://issues.apache.org/jira/browse/FLINK-15656
[2]. https://github.com/apache/flink/pull/14629

Best,
Yang

On Fri, Jan 29, 2021 at 8:14 AM, Emilien Kenler <em...@cryptact.com> wrote:
Hello,

I'm trying to run Flink on Kubernetes, and I recently switched from lyft/flinkk8soperator to the Flink Native Kubernetes deployment mode.

I have a long-running job that I want to deploy (using application mode), and after a few hours, I noticed the deployment had disappeared.
After a quick look at the logs, it seems that the job manager was no longer able to talk to the task managers after a while, because they had been evicted by Kubernetes for using more ephemeral storage than allowed.

We have limit ranges set per namespace with low default values, and each application deployed on Kubernetes needs to set values appropriate to its usage.
I couldn't find a way to configure those via Flink configuration.
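
For illustration, a namespace LimitRange of the kind described above might look like the following sketch. The name, namespace, and sizes are hypothetical, not the actual values from this cluster. Containers that do not declare their own ephemeral-storage request and limit receive these defaults, and a pod whose container exceeds its limit can be evicted by the kubelet.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: ephemeral-storage-defaults   # hypothetical name
  namespace: flink-jobs              # hypothetical namespace
spec:
  limits:
  - type: Container
    # Applied as the request when a container does not set one
    defaultRequest:
      ephemeral-storage: "256Mi"
    # Applied as the limit when a container does not set one
    default:
      ephemeral-storage: "512Mi"
```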

Is there a way to set ephemeral storage requests and limits?
Are external resources supposed to help here?
If there is currently no way to do it, should it be added to the scope of FLINK-20324 (https://issues.apache.org/jira/browse/FLINK-20324)?

Thanks,
Emilien

Re: Configuring ephemeral storage limits when using Native Kubernetes

Posted by Yang Wang <da...@gmail.com>.
Thanks for testing the pod template. I really hope to get more feedback
from your use case.

Best,
Yang
