Posted to commits@flink.apache.org by wa...@apache.org on 2022/03/04 03:14:31 UTC

[flink-kubernetes-operator] 02/02: [FLINK-26257] Document metrics configuration for Prometheus

This is an automated email from the ASF dual-hosted git repository.

wangyang0918 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git

commit f25960db293c729196c0dd29a13d032cbf2a1890
Author: Matyas Orhidi <ma...@apple.com>
AuthorDate: Mon Feb 28 18:48:58 2022 +0100

    [FLINK-26257] Document metrics configuration for Prometheus
    
    This closes #33.
---
 README.md                                         | 59 +++++++++++++++++++++++
 helm/flink-operator/templates/flink-operator.yaml |  6 +++
 helm/flink-operator/values.yaml                   |  4 ++
 3 files changed, 69 insertions(+)

diff --git a/README.md b/README.md
index 9ab03ae..2baad7e 100644
--- a/README.md
+++ b/README.md
@@ -84,3 +84,62 @@ Considering the cost of running the builds, the stability, and the maintainabili
 All the unit tests, integration tests, and the end-to-end tests will be triggered for each PR.
 
 Note: Please make sure the CI passed before merging.
+
+## Operator Metrics
+
+The operator extends the [Flink Metric System](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/), which allows gathering and exposing metrics to centralized monitoring solutions. The well-known [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters) are shipped in the operator image and are ready to use.
+
+### Slf4j
+The default metrics reporter in the operator is Slf4j. It does not require any external monitoring system, and it is enabled in the operator [Helm chart](helm/flink-operator/templates/flink-operator.yaml) by default, mainly for demonstration purposes.
+```properties
+metrics.reporter.slf4j.factory.class: org.apache.flink.metrics.slf4j.Slf4jReporterFactory
+metrics.reporter.slf4j.interval: 1 MINUTE
+```
+To use a more robust, production-grade monitoring solution, the configuration needs to be changed.
+
+### Prometheus
+The following example shows how to enable the Prometheus metric reporter:
+```properties
+metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
+metrics.reporter.prom.port: 9999
+```
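+As with the Slf4j example, the reporter can probably also be declared through its factory class. This is a sketch that assumes the Flink version bundled in the operator image supports factory-based reporter configuration; it is not taken from the operator documentation:
+```properties
+# Assumed alternative: factory-based declaration of the same reporter
+metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
+metrics.reporter.prom.port: 9999
+```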
+Some metric reporters, including Prometheus, need a port to be exposed on the container. This can be achieved by defining a value for the otherwise empty `metrics.port` variable,
+either in the [values.yaml](helm/flink-operator/values.yaml) file:
+```yaml
+metrics:
+  port: 9999
+```
+or using the option `--set metrics.port=9999` on the command line.
+
+The Prometheus Operator, among other options, provides an elegant, declarative way to specify how groups of pods should be monitored using custom resources.
+
+To install the Prometheus Operator via Helm, run:
+
+```shell
+helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+helm install prometheus prometheus-community/kube-prometheus-stack
+```
+The Grafana dashboard can be accessed through port-forwarding:
+```shell
+kubectl port-forward deployment/prometheus-grafana 3000
+```
+To enable the operator metrics in Prometheus, create a `pod-monitor.yaml` file with the following content:
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: flink-operator
+  labels:
+    release: prometheus
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: flink-operator
+  podMetricsEndpoints:
+    - port: metrics
+```
+and apply it to your Kubernetes environment:
+```shell
+kubectl create -f pod-monitor.yaml
+```
+Once the custom resource is created in the Kubernetes environment, the operator metrics are ready to explore at [http://localhost:3000/explore](http://localhost:3000/explore).
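+
+The `port: metrics` entry in the `PodMonitor` refers to a container port by name, not by number. As a rough sketch, assuming `metrics.port` is set to `9999` as in the example above, the operator Helm template would be expected to render a container port along these lines:
+```yaml
+# Expected rendering of the `ports` section when metrics.port=9999 (assumed value)
+ports:
+  - containerPort: 9999   # should match metrics.reporter.prom.port
+    name: metrics          # the name selected by the PodMonitor's `port: metrics`
+    protocol: TCP
+```
+If `metrics.port` is left empty, no named port is rendered, so the `PodMonitor` would have nothing to scrape.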
diff --git a/helm/flink-operator/templates/flink-operator.yaml b/helm/flink-operator/templates/flink-operator.yaml
index 8c982a4..cc3db68 100644
--- a/helm/flink-operator/templates/flink-operator.yaml
+++ b/helm/flink-operator/templates/flink-operator.yaml
@@ -46,6 +46,12 @@ spec:
           image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
           imagePullPolicy: {{ .Values.image.pullPolicy }}
           command: ["/docker-entrypoint.sh", "operator"]
+          {{- if .Values.metrics.port }}
+          ports:
+            - containerPort: {{ .Values.metrics.port }}
+              name: metrics
+              protocol: TCP
+          {{- end }}
           env:
             - name: OPERATOR_NAMESPACE
               value: {{ .Values.operatorNamespace.name }}
diff --git a/helm/flink-operator/values.yaml b/helm/flink-operator/values.yaml
index 51c059f..854a36f 100644
--- a/helm/flink-operator/values.yaml
+++ b/helm/flink-operator/values.yaml
@@ -52,3 +52,7 @@ webhook:
 imagePullSecrets: []
 nameOverride: ""
 fullnameOverride: ""
+
+# (Optional) Exposes metrics port on the container if defined
+metrics:
+  port: