Posted to user@flink.apache.org by Lucas Caparelli <lu...@gympass.com> on 2022/11/30 21:35:20 UTC

flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Hello folks,

Not sure if this is the best list for this, sorry if it isn't. I'd
appreciate some pointers :-)

When using flink-kubernetes-operator [1], docker-entrypoint.sh [2] fails
repeatedly when trying to write into $FLINK_HOME/conf/. We believe this is
because the volume is mounted from a ConfigMap, which makes it read-only.

This has been reported in the past in GCP's operator, but I was unable to
find any kind of resolution for it:
https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/213

In our use case, we want to set an API key as part of the flink-conf.yaml
file, but we don't want it to be persisted in Kubernetes or in our version
control, since it's sensitive data. This API Key is used by Flink to report
metrics to Datadog [3].

We have automation in place which allows us to accomplish this by setting
environment variables pointing to a path in our secret manager, which only
gets injected during runtime. That part is working fine.

However, we're trying to inject this secret using the FLINK_PROPERTIES
variable, which the docker-entrypoint script appends [4] to the
flink-conf.yaml file. This fails because the filesystem holding the file
is read-only.
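
For context, the append in the entrypoint behaves roughly like the sketch
below (self-contained, with demo paths instead of the real ConfigMap mount;
the property key is only an example). The `>>` is the write that fails on a
read-only conf directory:

```shell
# Roughly what docker-entrypoint.sh does with FLINK_PROPERTIES [4].
# FLINK_HOME and the property value are demo stand-ins here; in a real
# pod, $FLINK_HOME/conf is the read-only ConfigMap mount and the append
# fails with a permission error.
FLINK_HOME=/tmp/flink-props-demo
mkdir -p "$FLINK_HOME/conf"
CONF_FILE="$FLINK_HOME/conf/flink-conf.yaml"

FLINK_PROPERTIES='metrics.reporter.dghttp.apikey: <redacted>'
if [ -n "$FLINK_PROPERTIES" ]; then
    # This is the write that a read-only ConfigMap mount rejects:
    echo "$FLINK_PROPERTIES" >> "$CONF_FILE"
fi
```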

We attempted working around this in 2 different ways:

  - providing our own .spec.containers[0].command, where we copied over
/opt/flink to /tmp/flink and set FLINK_HOME=/tmp/flink. This did not work
because the operator overwrote it and replaced it with its original
command/args;
  - providing an initContainer sharing the volumes so it could make the
copy without being overridden by the operator's command/args. This did not
work because the initContainer present in the spec never makes it to the
resulting Deployment; the operator seems to ignore it.

We have some questions:

1. Is this overriding of the pod template present in FlinkDeployment
intentional? That is, should our custom command/args and initContainers
have been overwritten? If so, I find it a bit confusing that these fields
are present and available for use at all.
2. Since the ConfigMap volume will always be mounted as read-only, it seems
to me some adjustments are needed for this script to work correctly. Do you
think it would make sense for the script to copy the contents of the
ConfigMap volume to a writable directory during initialization, and then
use that copy for all subsequent operations? Perhaps copying to
$FLINK_HOME, which the user could set themselves, maybe even with a sane
default that wouldn't fail on writes (e.g. /tmp/flink).
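
To illustrate the idea in question 2, a minimal sketch (demo paths only,
not actual entrypoint code; the "read-only mount" is simulated with a
temp directory):

```shell
# Hypothetical copy-on-init step: copy the read-only, ConfigMap-mounted
# conf dir to a writable location, then point everything at the copy.
SRC_CONF=$(mktemp -d)                 # stand-in for the ConfigMap mount
echo 'parallelism.default: 1' > "$SRC_CONF/flink-conf.yaml"
chmod -R a-w "$SRC_CONF"              # simulate the read-only mount

WRITABLE_CONF=/tmp/flink-conf-demo
mkdir -p "$WRITABLE_CONF"
cp "$SRC_CONF"/* "$WRITABLE_CONF"/
chmod -R u+w "$WRITABLE_CONF"         # copies inherit read-only mode
export FLINK_CONF_DIR="$WRITABLE_CONF"

# Appends (e.g. from FLINK_PROPERTIES) now succeed against the copy:
echo 'rest.port: 8081' >> "$WRITABLE_CONF/flink-conf.yaml"

chmod -R u+w "$SRC_CONF"              # allow cleanup of the temp dir
```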

Thanks in advance for your attention and hard work on the project!

[1]: https://github.com/apache/flink-kubernetes-operator
[2]:
https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh
[3]: https://docs.datadoghq.com/integrations/flink/
[4]:
https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh#L86-L88

Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Posted by Andrew Otto <ot...@wikimedia.org>.
Ah, got it.  Thanks!

On Thu, Dec 1, 2022 at 11:34 AM Gyula Fóra <gy...@gmail.com> wrote:

> As I also mentioned in the email, this is on our roadmap for the operator
> but we have not implemented it yet because this feature only became
> available as of Flink 1.16.
>
> Ideally in the operator FlinkDeployment spec.flinkConfiguration section
> the user should be able to use env vars if this is added.
>
> Gyula
>
> On Thu, Dec 1, 2022 at 5:18 PM Andrew Otto <ot...@wikimedia.org> wrote:
>
>> > Andrew please see my previous response, that covers the secrets case.
>> > kubernetes.jobmanager.entrypoint.args: -D
>> datadog.secret.conf=$MY_SECRET_ENV
>>
>> This way^?  Ya that makes sense.  It'd be nice if there was a way to get
>> Secrets into the values used for rendering flink-conf.yaml too, so the
>> confs will be all in the same place.
>>
>>
>>
>>
>>
>> On Thu, Dec 1, 2022 at 9:30 AM Gyula Fóra <gy...@gmail.com> wrote:
>>
>>> Andrew please see my previous response, that covers the secrets case.
>>>
>>> Gyula
>>>
>>> On Thu, Dec 1, 2022 at 2:54 PM Andrew Otto <ot...@wikimedia.org> wrote:
>>>
>>>> > several failures to write into $FLINK_HOME/conf/.
>>>> I'm working on
>>>> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/>
>>>> building Flink and flink-kubernetes-operator images for the Wikimedia
>>>> Foundation, and I found this strange as well.  It makes sense in a docker /
>>>> docker-compose-only environment, but in k8s, where a ConfigMap is
>>>> responsible for flink-conf.yaml (and logs all go to the console, not
>>>> FLINK_HOME/log), I'd prefer that the image not be modified by the
>>>> ENTRYPOINT.
>>>>
>>>> I believe that for flink-kubernetes-operator, the docker-entrypoint.sh
>>>> <https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh>
>>>> provided by flink-docker is not really needed.  It seems to be written more
>>>> for deployments outside of kubernetes.
>>>> flink-kubernetes-operator never calls the built-in subcommands (e.g.
>>>> standalone-job), and always runs in 'pass-through' mode, just execing the
>>>> args passed to it.  At WMF we build
>>>> <https://doc.wikimedia.org/docker-pkg/> our own images, so I'm
>>>> planning on removing all of the stuff in ENTRYPOINTs that mangles the
>>>> image.  Anything that I might want to keep from docker-entrypoint.sh (like enabling
>>>> jemalloc
>>>> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#73>)
>>>> I should be able to do in the Dockerfile at image creation time.
>>>>
>>>> >  want to set an API key as part of the flink-conf.yaml file, but we
>>>> don't want it to be persisted in Kubernetes or in our version control
>>>> I personally am still pretty green at k8s, but would using kubernetes
>>>> Secrets
>>>> <https://kubernetes.io/docs/concepts/configuration/secret/#use-case-secret-visible-to-one-container-in-a-pod>
>>>> work for your use case? I know we use them at WMF, but from a quick glance
>>>> I'm not sure how to combine them in flink-kubernetes-operator's ConfigMap
>>>> that renders flink-conf.yaml, but I feel like there should be a way.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Nov 30, 2022 at 4:59 PM Gyula Fóra <gy...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Lucas!
>>>>>
>>>>> The Flink Kubernetes integration itself, not the operator, is responsible
>>>>> for mounting the ConfigMap and overwriting the entrypoint. Therefore this
>>>>> is not something we can easily change from the operator side. However, I
>>>>> think we are looking at the problem from the wrong side, and there may be
>>>>> a solution already :)
>>>>>
>>>>> Ideally what you want is ENV replacement in Flink configuration. This
>>>>> is not something the Flink community has added yet, unfortunately, but
>>>>> we have it on our radar for the operator at least (
>>>>> https://issues.apache.org/jira/browse/FLINK-27491). It will probably
>>>>> be added in the next 1.4.0 version.
>>>>>
>>>>> This will be possible from Flink 1.16 which introduced a small feature
>>>>> that allows us to inject parameters to the kubernetes entrypoints:
>>>>> https://issues.apache.org/jira/browse/FLINK-29123
>>>>>
>>>>> https://github.com/apache/flink/commit/c37643031dca2e6d4c299c0d704081a8bffece1d
>>>>>
>>>>> While it's not implemented in the operator yet, you could try setting
>>>>> the following config in Flink 1.16.0:
>>>>> kubernetes.jobmanager.entrypoint.args: -D
>>>>> datadog.secret.conf=$MY_SECRET_ENV
>>>>> kubernetes.taskmanager.entrypoint.args: -D
>>>>> datadog.secret.conf=$MY_SECRET_ENV
>>>>>
>>>>> If you use this configuration together with the default native mode in
>>>>> the operator, it should work I believe.
>>>>>
>>>>> Please try and let me know!
>>>>> Gyula

Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Posted by Gyula Fóra <gy...@gmail.com>.
As I also mentioned in the email, this is on our roadmap for the operator
but we have not implemented it yet because this feature only became
available as of Flink 1.16.

Ideally, once this is added, the user should be able to use env vars
directly in the operator FlinkDeployment spec.flinkConfiguration section.
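
Put together, the entrypoint.args workaround suggested earlier in the
thread, combined with a Secret-backed env var, might look roughly like this
in a FlinkDeployment (a sketch only: the Secret name/key and the
datadog.secret.conf key are illustrative placeholders, assuming the
operator's v1beta1 CRD and Flink 1.16+):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example
spec:
  flinkVersion: v1_16
  flinkConfiguration:
    # $MY_SECRET_ENV would be expanded by the JM/TM entrypoints at
    # startup, so the value never lands in the ConfigMap or in VCS.
    kubernetes.jobmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
    kubernetes.taskmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: MY_SECRET_ENV
              valueFrom:
                secretKeyRef:
                  name: datadog-api-key   # hypothetical Secret
                  key: api-key
```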

Gyula


Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Posted by Andrew Otto <ot...@wikimedia.org>.
> Andrew please see my previous response, that covers the secrets case.
> kubernetes.jobmanager.entrypoint.args: -D
datadog.secret.conf=$MY_SECRET_ENV

This way^?  Ya that makes sense.  It'd be nice if there was a way to get
Secrets into the values used for rendering flink-conf.yaml too, so the
confs will be all in the same place.






Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Posted by Gyula Fóra <gy...@gmail.com>.
Andrew please see my previous response, that covers the secrets case.

Gyula

On Thu, Dec 1, 2022 at 2:54 PM Andrew Otto <ot...@wikimedia.org> wrote:

> > several failures to write into $FLINK_HOME/conf/.
> I'm working on
> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/>
> building Flink and flink-kubernetes-operator images for the Wikimedia
> Foundation, and I found this strange as well.  It makes sense in a docker /
> docker-compose only environment, but in k8s where you have ConfigMap
> responsible for flink-conf.yaml, and (also logs all going to the console,
> not FLINK_HOME/log), I'd prefer if the image was not modified by the
> ENTRYPOINT.
>
> I believe that for flink-kubernetes-operator, the docker-entrypoint.sh
> <https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh>
> provided by flink-docker is not really needed.  It seems to be written more
> for deployments outside of kubernetes.
>  flink-kubernetes-operator never calls the built in subcommands (e.g.
> standalone-job), and always runs in 'pass-through' mode, just execing the
> args passed to it.  At WMF we build
> <https://doc.wikimedia.org/docker-pkg/> our own images, so I'm planning
> on removing all of the stuff in ENTRYPOINTs that mangles the image.
> Anything that I might want to keep from docker-entrypoint.sh (like enabling
> jemoalloc
> <https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#73>)
> I should be able to do in the Dockerfile at image creation time.
>
> >  want to set an API key as part of the flink-conf.yaml file, but we
> don't want it to be persisted in Kubernetes or in our version control
> I personally am still pretty green at k8s, but would using kubernetes
> Secrets
> <https://kubernetes.io/docs/concepts/configuration/secret/#use-case-secret-visible-to-one-container-in-a-pod>
> work for your use case? I know we use them at WMF, but from a quick glance
> I'm not sure how to combine them in flink-kubernetes-operator's ConfigMap
> that renders flink-conf.yaml, but I feel like there should be a way.
>
>
>
>
> On Wed, Nov 30, 2022 at 4:59 PM Gyula Fóra <gy...@gmail.com> wrote:
>
>> Hi Lucas!
>>
>> The Flink kubernetes integration itself is responsible for mounting the
>> configmap and overwriting the entrypoint not the operator. Therefore this
>> is not something we can easily change from the operator side. However I
>> think we are looking at the problem from the wrong side and there may be a
>> solution already :)
>>
>> Ideally what you want is ENV replacement in Flink configuration. This is
>> not something that the Flink community has added yet unfortunately but we
>> have it on our radar for the operator at least (
>> https://issues.apache.org/jira/browse/FLINK-27491). It will probably be
>> added in the next 1.4.0 version.
>>
>> This will be possible from Flink 1.16 which introduced a small feature
>> that allows us to inject parameters to the kubernetes entrypoints:
>> https://issues.apache.org/jira/browse/FLINK-29123
>>
>> https://github.com/apache/flink/commit/c37643031dca2e6d4c299c0d704081a8bffece1d
>>
>> While it's not implemented in the operator yet, you could try setting the
>> following config in Flink 1.16.0:
>> kubernetes.jobmanager.entrypoint.args: -D
>> datadog.secret.conf=$MY_SECRET_ENV
>> kubernetes.taskmanager.entrypoint.args: -D
>> datadog.secret.conf=$MY_SECRET_ENV
>>
>> If you use this configuration together with the default native mode in
>> the operator, it should work I believe.
>>
>> Please try and let me know!
>> Gyula
>>

Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Posted by Andrew Otto <ot...@wikimedia.org>.
> several failures to write into $FLINK_HOME/conf/.
I'm working on building Flink and flink-kubernetes-operator images for the
Wikimedia Foundation
<https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/>,
and I found this strange as well. It makes sense in a docker /
docker-compose-only environment, but in k8s, where a ConfigMap is
responsible for flink-conf.yaml (and logs all go to the console, not
FLINK_HOME/log), I'd prefer that the image not be modified by the
ENTRYPOINT.
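The failing append, and the copy-to-a-writable-directory idea from the original message, can be sketched like this (a simplified illustration, not the actual entrypoint; the paths, temp dirs, and the example property are made up for the demo):

```shell
# Simplified sketch of what docker-entrypoint.sh does with FLINK_PROPERTIES:
# it appends the value to flink-conf.yaml in place, which fails when conf/
# is a read-only ConfigMap mount. Copying the config to a writable directory
# first (as proposed in this thread) sidesteps that.

FLINK_HOME=$(mktemp -d)                     # stand-in for /opt/flink
mkdir -p "$FLINK_HOME/conf"
printf 'rest.port: 8081\n' > "$FLINK_HOME/conf/flink-conf.yaml"

# Sensitive property injected at runtime via an env var:
FLINK_PROPERTIES='metrics.reporter.dghttp.apikey: example-key'

# Copy to a writable location instead of appending in place:
WRITABLE_CONF=$(mktemp -d)                  # stand-in for e.g. /tmp/flink/conf
cp "$FLINK_HOME/conf/flink-conf.yaml" "$WRITABLE_CONF/flink-conf.yaml"
if [ -n "$FLINK_PROPERTIES" ]; then
    printf '%s\n' "$FLINK_PROPERTIES" >> "$WRITABLE_CONF/flink-conf.yaml"
fi

cat "$WRITABLE_CONF/flink-conf.yaml"        # both properties are now present
```

With FLINK_HOME pointed at the writable copy, the rest of the startup sequence could proceed unchanged.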

I believe that for flink-kubernetes-operator, the docker-entrypoint.sh
<https://github.com/apache/flink-docker/blob/master/1.16/scala_2.12-java11-ubuntu/docker-entrypoint.sh>
provided by flink-docker is not really needed. It seems to be written more
for deployments outside of kubernetes.
flink-kubernetes-operator never calls the built-in subcommands (e.g.
standalone-job), and always runs in 'pass-through' mode, just execing the
args passed to it. At WMF we build <https://doc.wikimedia.org/docker-pkg/>
our own images, so I'm planning on removing all of the stuff in ENTRYPOINTs
that mangles the image. Anything that I might want to keep from
docker-entrypoint.sh (like enabling jemalloc
<https://gerrit.wikimedia.org/r/c/operations/docker-images/production-images/+/858356/6/images/flink/Dockerfile.template#73>)
I should be able to do in the Dockerfile at image creation time.
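For instance, the jemalloc preload that the stock entrypoint wires up at runtime could instead be baked in at build time, along these lines (a sketch only; the exact .so path varies by base image, distro, and architecture):

```dockerfile
# Sketch: preload jemalloc at image build time instead of via the entrypoint.
# The library path below assumes a Debian-based amd64 base image.
FROM flink:1.16.0-scala_2.12-java11
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so
```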

>  want to set an API key as part of the flink-conf.yaml file, but we don't
want it to be persisted in Kubernetes or in our version control
I personally am still pretty green at k8s, but would using kubernetes
Secrets
<https://kubernetes.io/docs/concepts/configuration/secret/#use-case-secret-visible-to-one-container-in-a-pod>
work for your use case? I know we use them at WMF, but from a quick glance
I'm not sure how to combine them with flink-kubernetes-operator's ConfigMap
that renders flink-conf.yaml; I feel like there should be a way, though.
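One way this might fit together (not verified against the operator's CRD; the Secret name and key below are hypothetical) is to surface the Secret as an environment variable through the FlinkDeployment pod template, and reference that variable at runtime rather than putting the key in the ConfigMap:

```yaml
# Sketch of a FlinkDeployment fragment: expose a Kubernetes Secret as an env
# var in the main container. "datadog-api-key"/"apikey" are made-up names.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
spec:
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: MY_SECRET_ENV
              valueFrom:
                secretKeyRef:
                  name: datadog-api-key
                  key: apikey
```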





Re: flink-kubernetes-operator: image entrypoint misbehaves due to inability to write

Posted by Gyula Fóra <gy...@gmail.com>.
Hi Lucas!

The Flink Kubernetes integration itself is responsible for mounting the
ConfigMap and overwriting the entrypoint, not the operator. Therefore this
is not something we can easily change from the operator side. However, I
think we are looking at the problem from the wrong side and there may be a
solution already :)

Ideally what you want is ENV variable replacement in the Flink configuration.
This is not something that the Flink community has added yet, unfortunately,
but we have it on our radar for the operator at least
(https://issues.apache.org/jira/browse/FLINK-27491). It will probably be
added in the next 1.4.0 version.

This will be possible from Flink 1.16, which introduced a small feature that
allows us to inject parameters into the kubernetes entrypoints:
https://issues.apache.org/jira/browse/FLINK-29123
https://github.com/apache/flink/commit/c37643031dca2e6d4c299c0d704081a8bffece1d

While it's not implemented in the operator yet, you could try setting the
following config in Flink 1.16.0:
kubernetes.jobmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
kubernetes.taskmanager.entrypoint.args: -D datadog.secret.conf=$MY_SECRET_ENV
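In a FlinkDeployment manifest, those settings would sit under flinkConfiguration, roughly like this (a sketch; whether $MY_SECRET_ENV is actually expanded depends on how the entrypoint shell handles the args, so treat this as something to verify, per Gyula's "please try and let me know"):

```yaml
# Sketch: passing the entrypoint args via the operator's flinkConfiguration.
# Requires Flink 1.16+; MY_SECRET_ENV must already exist in the container env.
spec:
  flinkVersion: v1_16
  flinkConfiguration:
    kubernetes.jobmanager.entrypoint.args: "-D datadog.secret.conf=$MY_SECRET_ENV"
    kubernetes.taskmanager.entrypoint.args: "-D datadog.secret.conf=$MY_SECRET_ENV"
```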

If you use this configuration together with the default native mode in the
operator, it should work I believe.

Please try and let me know!
Gyula
