Posted to user@flink.apache.org by "Geldenhuys, Morgan Karl" <mo...@tu-berlin.de> on 2022/06/23 10:05:46 UTC
Advice needed: Flink Kubernetes Operator with Prometheus Configuration
Greetings all,
I am trying to deploy Flink jobs using the Flink Kubernetes Operator and I would like to have Prometheus scrape metrics from the various pods.
The jobs are created successfully, however, the metrics don't appear to be available.
The following steps were followed based on the documentation: https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.0/docs/operations/metrics-logging/#how-to-enable-prometheus-example
* The Prometheus stack is deployed successfully
* The Pod Monitor is enabled successfully
* The Flink Kubernetes operator is created successfully, with the following configuration appended to the values.yaml:
metrics:
  port: 9999
defaultConfiguration:
  create: true
  append: true
  flink-conf.yaml: |+
    kubernetes.operator.metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    kubernetes.operator.metrics.reporter.prom.port: 9999
* The job is deployed with the following added to the flinkConfiguration
"metrics.reporter.prom.class": "org.apache.flink.metrics.prometheus.PrometheusReporter",
"metrics.reporter.prom.port": "9999"
Now on investigation, it does not appear that metrics port 9999 is opened on the containers. The documentation is not very clear about where to put the port config, but I assumed this is what was meant, as shown above. Is this correct? Is there another way of opening ports if this is not functioning as intended?
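For illustration, one way these two settings end up in the job pods is via the flinkConfiguration map of the FlinkDeployment resource itself. A minimal sketch, assuming the v1beta1 CRD of operator 1.0 (the metadata name is hypothetical):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-job   # hypothetical name
spec:
  flinkConfiguration:
    # reporter class and port for the job/taskmanager pods
    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    metrics.reporter.prom.port: "9999"
```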
Thanks in advance!
Regards,
M.
Re: [External] Re: Advice needed: Flink Kubernetes Operator with Prometheus Configuration
Posted by Őrhidi Mátyás <ma...@gmail.com>.
Your Flink configs are seemingly not indented; I guess it simply appends the metrics configs to that.
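For reference, a correctly indented version of the fragment from the original mail would look like this (a sketch; key names are taken from the original message):

```yaml
metrics:
  port: 9999

defaultConfiguration:
  create: true
  append: true
  flink-conf.yaml: |+
    kubernetes.operator.metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    kubernetes.operator.metrics.reporter.prom.port: 9999
```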
Re: [External] Re: Advice needed: Flink Kubernetes Operator with Prometheus Configuration
Posted by "Geldenhuys, Morgan Karl" <mo...@tu-berlin.de>.
Hey,
Yeah, I don't see that being created in the container.
Config is as such:
Output of kubectl get deploy flink-kubernetes-operator -o yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: flink-kubernetes-operator
    meta.helm.sh/release-namespace: test
  creationTimestamp: "2022-06-23T10:00:40Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: flink-kubernetes-operator
    app.kubernetes.io/version: 1.0.0
    helm.sh/chart: flink-kubernetes-operator-1.0.0
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
        f:labels:
          .: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:app.kubernetes.io/version: {}
          f:helm.sh/chart: {}
      f:spec:
        f:progressDeadlineSeconds: {}
        f:replicas: {}
        f:revisionHistoryLimit: {}
        f:selector: {}
        f:strategy:
          f:rollingUpdate:
            .: {}
            f:maxSurge: {}
            f:maxUnavailable: {}
          f:type: {}
        f:template:
          f:metadata:
            f:annotations:
              .: {}
              f:kubectl.kubernetes.io/default-container: {}
            f:labels:
              .: {}
              f:app.kubernetes.io/name: {}
          f:spec:
            f:containers:
              k:{"name":"flink-kubernetes-operator"}:
                .: {}
                f:command: {}
                f:env:
                  .: {}
                  k:{"name":"FLINK_CONF_DIR"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"FLINK_OPERATOR_WATCH_NAMESPACES"}:
                    .: {}
                    f:name: {}
                  k:{"name":"FLINK_PLUGINS_DIR"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"JVM_ARGS"}:
                    .: {}
                    f:name: {}
                  k:{"name":"LOG_CONFIG"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"OPERATOR_NAME"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"OPERATOR_NAMESPACE"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:name: {}
                f:resources: {}
                f:securityContext: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
                f:volumeMounts:
                  .: {}
                  k:{"mountPath":"/opt/flink/conf"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
            f:dnsPolicy: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext:
              .: {}
              f:runAsGroup: {}
              f:runAsUser: {}
            f:serviceAccount: {}
            f:serviceAccountName: {}
            f:terminationGracePeriodSeconds: {}
            f:volumes:
              .: {}
              k:{"name":"flink-operator-config-volume"}:
                .: {}
                f:configMap:
                  .: {}
                  f:defaultMode: {}
                  f:items: {}
                  f:name: {}
                f:name: {}
    manager: helm
    operation: Update
    time: "2022-06-23T10:00:40Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:deployment.kubernetes.io/revision: {}
      f:status:
        f:availableReplicas: {}
        f:conditions:
          .: {}
          k:{"type":"Available"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"Progressing"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
        f:observedGeneration: {}
        f:readyReplicas: {}
        f:replicas: {}
        f:updatedReplicas: {}
    manager: kube-controller-manager
    operation: Update
    subresource: status
    time: "2022-06-23T10:00:41Z"
  name: flink-kubernetes-operator
  namespace: test
  resourceVersion: "3507106"
  uid: 96955f2a-2397-4074-8982-5cb963cd62eb
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/name: flink-kubernetes-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: flink-kubernetes-operator
      creationTimestamp: null
      labels:
        app.kubernetes.io/name: flink-kubernetes-operator
    spec:
      containers:
      - command:
        - /docker-entrypoint.sh
        - operator
        env:
        - name: OPERATOR_NAMESPACE
          value: test
        - name: OPERATOR_NAME
          value: flink-kubernetes-operator
        - name: FLINK_CONF_DIR
          value: /opt/flink/conf
        - name: FLINK_PLUGINS_DIR
          value: /opt/flink/plugins
        - name: LOG_CONFIG
          value: -Dlog4j.configurationFile=/opt/flink/conf/log4j-operator.properties
        - name: JVM_ARGS
        - name: FLINK_OPERATOR_WATCH_NAMESPACES
        image: ghcr.io/apache/flink-kubernetes-operator:fa2cd14
        imagePullPolicy: IfNotPresent
        name: flink-kubernetes-operator
        resources: {}
        securityContext: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /opt/flink/conf
          name: flink-operator-config-volume
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsGroup: 9999
        runAsUser: 9999
      serviceAccount: flink-operator
      serviceAccountName: flink-operator
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j-operator.properties
            path: log4j-operator.properties
          - key: log4j-console.properties
            path: log4j-console.properties
          name: flink-operator-config
        name: flink-operator-config-volume
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2022-06-23T10:00:41Z"
    lastUpdateTime: "2022-06-23T10:00:41Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2022-06-23T10:00:40Z"
    lastUpdateTime: "2022-06-23T10:00:41Z"
    message: ReplicaSet "flink-kubernetes-operator-64d68cc5d9" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
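Worth noting: the container spec in the dump above declares no ports at all. If the chart's metrics port setting had taken effect, the operator container would be expected to carry an entry along these lines (a sketch of the expected rendered output, not taken from the dump):

```yaml
ports:
- containerPort: 9999   # expected if metrics.port from values.yaml were applied
  name: metrics
```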
Re: Advice needed: Flink Kubernetes Operator with Prometheus Configuration
Posted by Őrhidi Mátyás <ma...@gmail.com>.
Hi Morgan,
There is a placeholder in the values.yaml:
[screenshot: values.yaml metrics placeholder]
This should create an entry on the operator container:
[screenshot: operator container port entry]
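The inline screenshots did not survive the archive, but the placeholder in question is presumably the metrics block of the operator chart's values.yaml, roughly as follows (a sketch based on the operator docs' Prometheus example):

```yaml
metrics:
  port: 9999   # chart placeholder; rendering it should add a containerPort to the operator pod
```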
Can you share the output of this command, please?
k get deploy flink-kubernetes-operator -o yaml
Thanks,
Matyas