Posted to user@flink.apache.org by Prabhav Singh <pr...@gmail.com> on 2022/09/01 07:14:10 UTC

Issue in Kubernetes Session Deployment - Non Native

Hi,

I have been trying to deploy Flink on K8s using the Non Native Session
Deployment as described in the Documentation for Flink - 1.13.5 (
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/
)

I have two questions:

1. Do we need to create Persistent Volume storage for this deployment? The
job-manager-deployment and task-manager-deployment YAML files have this
section:

volumes:
  - name: flink-config-volume
    configMap:
      name: flink-config
      items:
        - key: flink-conf.yaml
          path: flink-conf.yaml
        - key: log4j-console.properties
          path: log4j-console.properties

How do we create the flink-config-volume?

2. If not, why do I keep getting this error when running the example
jobs? *Note that the job is able to complete, BUT the error causes logging
issues during the runtime of the job.*

Error - ERROR StatusLogger Reconfiguration failed: No configuration found
for '677327b6' at 'null' in 'null'

[image: Screenshot 2022-09-01 at 12.43.21 PM.png]

I can't seem to fix this and would appreciate help!

Thanks & Regards,
Prabhav

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Prabhav Singh <pr...@gmail.com>.
Hi All,

I figured out the issue. @Yang was correct about adding
log4j-cli.properties to the ConfigMap. Apart from that, I also had to add
a few other properties files to the ConfigMap.

After adding those, jobs run with correct logging and metric tracking, even
when run through the JobManager. I would recommend adding these to
*flink-configuration-configmap.yaml* so that this issue is resolved in
future releases. I can help contribute if needed.
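
A minimal sketch of what such an addition might look like, assuming the
ConfigMap and volume names used in the docs (flink-config and
flink-config-volume). The logger settings are a simplified, console-only
assumption and not the contents of the log4j-cli.properties file that ships
with Flink. Because the volume definition lists explicit items, a new
ConfigMap key is only mounted if it also gets its own entry there.

```yaml
# Fragment 1: extra key under `data:` in the flink-config ConfigMap.
# The logger settings are a simplified assumption, not Flink's shipped file.
data:
  log4j-cli.properties: |+
    rootLogger.level = INFO
    rootLogger.appenderRef.console.ref = ConsoleAppender
    appender.console.name = ConsoleAppender
    appender.console.type = CONSOLE
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
---
# Fragment 2: matching entry in the Deployment's volume definition.
volumes:
  - name: flink-config-volume
    configMap:
      name: flink-config
      items:
        - key: flink-conf.yaml
          path: flink-conf.yaml
        - key: log4j-console.properties
          path: log4j-console.properties
        - key: log4j-cli.properties   # added so the client config is mounted too
          path: log4j-cli.properties
```

The bin/flink client script points Log4j at log4j-cli.properties in the conf
directory, so a missing file there is a plausible cause of the "No
configuration found" message.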

Thanks & Regards,
Prabhav Singh

On Mon, Sep 5, 2022 at 5:22 PM Yang Wang <da...@gmail.com> wrote:

> Maybe you also need to add "log4j-cli.properties"[1] to
> *flink-configuration-configmap.yaml* so that the Flink client could show
> the logs normally.
>
> [1].
> https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/conf/log4j-cli.properties
>
> Best,
> Yang

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Yang Wang <da...@gmail.com>.
Maybe you also need to add "log4j-cli.properties"[1] to
*flink-configuration-configmap.yaml* so that the Flink client could show
the logs normally.

[1].
https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/conf/log4j-cli.properties

Best,
Yang



On Mon, Sep 5, 2022 at 12:45, Prabhav Singh <pr...@gmail.com> wrote:

> Thanks for the reply Matthias!
>
> Yes, I know that the correct way to submit a job is different. Basically,
> what I did was exec into the JobManager pod and run an example with the
> ./bin/flink run command. I had to remove the security group parameter in
> the config YAML file for this to work. I will try running the jobs the
> correct way and let you know the results.
>
> Regarding the error, I did some digging into the /opt/flink/conf
> directory in the JobManager pod. It turns out it only has two files: a
> log4j-session file and a properties file. I compared this to a Docker image
> I had created by pulling the apache-flink:1.13.6 image and adding
> PyFlink to it. The Docker image's /opt/flink/conf folder had multiple other
> files, including XML files for the logs, which I think should also be created
> in the JobManager pod. However, the /conf directory in the JobManager pod
> is read-only, so I could not test it.
>
> Do let me know if there is a way to test this by overriding the read-only
> access.
>
> Regards,
> Prabhav

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Prabhav Singh <pr...@gmail.com>.
Thanks for the reply Matthias!

Yes, I know that the correct way to submit a job is different. Basically,
what I did was exec into the JobManager pod and run an example with the
./bin/flink run command. I had to remove the security group parameter in
the config YAML file for this to work. I will try running the jobs the
correct way and let you know the results.

Regarding the error, I did some digging into the /opt/flink/conf
directory in the JobManager pod. It turns out it only has two files: a
log4j-session file and a properties file. I compared this to a Docker image
I had created by pulling the apache-flink:1.13.6 image and adding
PyFlink to it. The Docker image's /opt/flink/conf folder had multiple other
files, including XML files for the logs, which I think should also be created
in the JobManager pod. However, the /conf directory in the JobManager pod
is read-only, so I could not test it.

Do let me know if there is a way to test this by overriding the read-only
access.

Regards,
Prabhav

On Fri, Sep 2, 2022 at 8:37 PM Matthias Pohl <ma...@aiven.io> wrote:

> You shouldn't submit the job from within the JobManager pod. You can find
> different ways to access the Flink cluster in the documentation under
> Accessing Flink in Kubernetes [1].
> I experimented with your way of doing it: I could reproduce the error
> message but wasn't able to submit a job because (I guess) the JobManager
> endpoint is not configured properly for accessing the Flink cluster from
> within the pod. I'm a bit surprised that you were able to submit the job
> based on my experiment.
>
> Additionally, I'm kind of puzzled as to what the error message actually
> indicates. All I found about the error message was related to the log4j
> configuration not being set up properly. But I cannot relate this to my
> reasoning about the Flink configuration not being set up for calls from
> within the JobManager pod. Maybe, Chesnay can elaborate a bit on that one?
>
> I hope that helps.
> Matthias
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#accessing-flink-in-kubernetes

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Matthias Pohl via user <us...@flink.apache.org>.
You shouldn't submit the job from within the JobManager pod. You can find
different ways to access the Flink cluster in the documentation under
Accessing Flink in Kubernetes [1].
I experimented with your way of doing it: I could reproduce the error
message but wasn't able to submit a job because (I guess) the JobManager
endpoint is not configured properly for accessing the Flink cluster from
within the pod. I'm a bit surprised that you were able to submit the job
based on my experiment.
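
One of the access options from that docs section is a NodePort Service in
front of the JobManager's REST port. The following is a sketch along those
lines, assuming the JobManager pods carry the labels app: flink and
component: jobmanager as in the docs' manifests. The nodePort value is an
assumption and must be free in the cluster's NodePort range.

```yaml
# Sketch of a Service exposing the JobManager REST/web UI port outside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager-rest
spec:
  type: NodePort
  ports:
    - name: rest
      port: 8081
      targetPort: 8081
      nodePort: 30081   # assumed value; pick any free port in the NodePort range
  selector:
    app: flink              # assumed to match the labels on the JobManager pods
    component: jobmanager
```

With such a Service in place, a job could be submitted from a machine that
has a Flink distribution by passing -m <node-ip>:30081 to ./bin/flink run,
instead of running the client inside the pod.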

Additionally, I'm kind of puzzled as to what the error message actually
indicates. All I found about the error message was related to the log4j
configuration not being set up properly. But I cannot relate this to my
reasoning about the Flink configuration not being set up for calls from
within the JobManager pod. Maybe, Chesnay can elaborate a bit on that one?

I hope that helps.
Matthias

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#accessing-flink-in-kubernetes

On Fri, Sep 2, 2022 at 6:03 AM Prabhav Singh <pr...@gmail.com>
wrote:

> Hi Matthias,
>
> Thanks for the reply. I have applied the config map as well. For
> reference, I will describe the exact steps I followed:
>
>    1. Used the session deployment K8s YAML files described in the Apache
>    Flink 1.13.6 docs [LINK 1
>    <https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#starting-a-kubernetes-cluster-session-mode>],
>    with the apache/flink:1.13.6-java11 image. Specifically:
>       1. Applied the ConfigMap exactly as in the Appendix.
>       2. Applied the Job Service YAML file.
>       3. Applied the Session Job Deployment file.
>       4. Applied the Task Manager Job Deployment file.
>    2. Used kubectl exec to get a shell in the JobManager pod.
>    3. Ran the WordCount example from /opt/flink using:
>       ./bin/flink run /examples/tables/WordCount.jar
>    4. The job completes and produces its output, but also prints the
>    following error before starting:
>
> *Error - ERROR StatusLogger Reconfiguration failed: No configuration found
> for '677327b6' at 'null' in 'null'*
>
> This error specifically affects jobs that run continuously since the Flink
> Web UI is unable to display any metrics while the job is running. It only
> shows the metrics once it completes.
>
> Regards,
> Prabhav Singh

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Prabhav Singh <pr...@gmail.com>.
Hi Matthias,

Thanks for the reply. I have applied the config map as well. For reference,
I will describe the exact steps I followed:

   1. Used the session deployment K8s YAML files described in the Apache
   Flink 1.13.6 docs [LINK 1
   <https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#starting-a-kubernetes-cluster-session-mode>],
   with the apache/flink:1.13.6-java11 image. Specifically:
      1. Applied the ConfigMap exactly as in the Appendix.
      2. Applied the Job Service YAML file.
      3. Applied the Session Job Deployment file.
      4. Applied the Task Manager Job Deployment file.
   2. Used kubectl exec to get a shell in the JobManager pod.
   3. Ran the WordCount example from /opt/flink using:
      ./bin/flink run /examples/tables/WordCount.jar
   4. The job completes and produces its output, but also prints the
   following error before starting:

*Error - ERROR StatusLogger Reconfiguration failed: No configuration found
for '677327b6' at 'null' in 'null'*

This error specifically affects jobs that run continuously since the Flink
Web UI is unable to display any metrics while the job is running. It only
shows the metrics once it completes.

Regards,
Prabhav Singh

On Thu, Sep 1, 2022 at 6:36 PM Matthias Pohl <ma...@aiven.io> wrote:

> Hi Prabhav,
> not sure whether I understand you correctly, but the "flink-config-volume"
> volume definition which you see in the pod configuration for the JobManager
> and the TaskManager refers to the corresponding entries "flink-conf.yaml"
> and "log4j-console.properties" in the ConfigMap called "flink-config". This
> loads the ConfigMap entries into the corresponding files [1].
>
> You have to create the ConfigMap as described in the docs' Appendix
> section "Common cluster resource definitions" [2]. Have you done that?
>
> Best,
> Matthias
>
> [1]
> https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#populate-a-volume-with-data-stored-in-a-configmap
> [2]
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Gyula Fóra <gy...@gmail.com>.
In addition to what Matthias pointed out, you could also try the Flink
Kubernetes Operator [1] to make configuration and cluster management
simpler :)

Gyula

[1]
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/
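
For orientation, a session cluster managed by the operator is described by
a single FlinkDeployment resource rather than separate ConfigMap, Service
and Deployment manifests. The sketch below assumes the operator and its
CRDs are already installed and that a service account named flink exists.
The resource name, image tag and resource sizes are illustrative
assumptions, and the operator docs should be checked for the Flink versions
a given operator release supports.

```yaml
# Sketch of a session cluster (no job section) for the Flink Kubernetes Operator.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: session-cluster-example   # hypothetical name
spec:
  image: flink:1.15               # assumed image tag
  flinkVersion: v1_15
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink           # assumed to exist in the namespace
  jobManager:
    resource:
      memory: "2048m"
      cpu: 1
  taskManager:
    resource:
      memory: "2048m"
      cpu: 1
```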

On Thu, 1 Sep 2022 at 08:07, Matthias Pohl via user <us...@flink.apache.org>
wrote:

> Hi Prabhav,
> not sure whether I understand you correctly, but the "flink-config-volume"
> volume definition which you see in the pod configuration for the JobManager
> and the TaskManager refers to the corresponding entries "flink-conf.yaml"
> and "log4j-console.properties" in the ConfigMap called "flink-config". This
> loads the ConfigMap entries into the corresponding files [1].
>
> You have to create the ConfigMap as described in the docs' Appendix
> section "Common cluster resource definitions" [2]. Have you done that?
>
> Best,
> Matthias
>
> [1]
> https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#populate-a-volume-with-data-stored-in-a-configmap
> [2]
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions

Re: Issue in Kubernetes Session Deployment - Non Native

Posted by Matthias Pohl via user <us...@flink.apache.org>.
Hi Prabhav,
not sure whether I understand you correctly, but the "flink-config-volume"
volume definition which you see in the pod configuration for the JobManager
and the TaskManager refers to the corresponding entries "flink-conf.yaml"
and "log4j-console.properties" in the ConfigMap called "flink-config". This
loads the ConfigMap entries into the corresponding files [1].
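
As a rough illustration of that mapping: the volume is backed by a
ConfigMap rather than a PersistentVolume, so no extra storage has to be
provisioned. The sketch below is an abridged ConfigMap in the shape of the
one in the docs' Appendix. The option values shown are illustrative
assumptions, and the docs remain the authoritative version.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-config          # must match `configMap.name` in the volume definition
  labels:
    app: flink
data:
  # Each key listed in the volume's `items` becomes a file at the given path
  # inside the directory where the volume is mounted.
  flink-conf.yaml: |+
    jobmanager.rpc.address: flink-jobmanager
    taskmanager.numberOfTaskSlots: 2
    parallelism.default: 2
  log4j-console.properties: |+
    rootLogger.level = INFO
    rootLogger.appenderRef.console.ref = ConsoleAppender
    appender.console.name = ConsoleAppender
    appender.console.type = CONSOLE
    appender.console.layout.type = PatternLayout
    appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```

Applying a file like this with kubectl apply -f before the Deployments is
what creates the source for flink-config-volume. ConfigMap-backed volumes
are mounted read-only, so additional files have to be added as ConfigMap
keys rather than written from inside the pod.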

You have to create the ConfigMap as described in the docs' Appendix section
"Common cluster resource definitions" [2]. Have you done that?

Best,
Matthias

[1]
https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#populate-a-volume-with-data-stored-in-a-configmap
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions

On Thu, Sep 1, 2022 at 9:15 AM Prabhav Singh <pr...@gmail.com>
wrote:

> Hi,
>
> I have been trying to deploy Flink on K8s using the Non Native Session
> Deployment as described in the Documentation for Flink - 1.13.5 (
> https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/
> )
>
> I have two questions:
>
> 1. Do we need to create Persistent Volume storage for this deployment? The
> job-manager-deployment and task-manager-deployment YAML files have this
> section:
>
> volumes:
>   - name: flink-config-volume
>     configMap:
>       name: flink-config
>       items:
>         - key: flink-conf.yaml
>           path: flink-conf.yaml
>         - key: log4j-console.properties
>           path: log4j-console.properties
>
> How do we create the flink-config-volume?
>
> 2. If not, why do I keep getting this error when running the example
> jobs? *Note that the job is able to complete, BUT the error causes logging
> issues during the runtime of the job.*
>
> Error - ERROR StatusLogger Reconfiguration failed: No configuration found
> for '677327b6' at 'null' in 'null'
>
> [image: Screenshot 2022-09-01 at 12.43.21 PM.png]
>
> I can't seem to fix this and would appreciate help!
>
> Thanks & Regards,
> Prabhav
>