Posted to commits@flink.apache.org by wa...@apache.org on 2022/03/04 03:14:31 UTC
[flink-kubernetes-operator] 02/02: [FLINK-26257] Document metrics configuration for Prometheus
This is an automated email from the ASF dual-hosted git repository.
wangyang0918 pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/flink-kubernetes-operator.git
commit f25960db293c729196c0dd29a13d032cbf2a1890
Author: Matyas Orhidi <ma...@apple.com>
AuthorDate: Mon Feb 28 18:48:58 2022 +0100
[FLINK-26257] Document metrics configuration for Prometheus
This closes #33.
---
README.md | 59 +++++++++++++++++++++++
helm/flink-operator/templates/flink-operator.yaml | 6 +++
helm/flink-operator/values.yaml | 4 ++
3 files changed, 69 insertions(+)
diff --git a/README.md b/README.md
index 9ab03ae..2baad7e 100644
--- a/README.md
+++ b/README.md
@@ -84,3 +84,62 @@ Considering the cost of running the builds, the stability, and the maintainabili
All the unit tests, integration tests, and the end-to-end tests will be triggered for each PR.
Note: Please make sure the CI passed before merging.
+
+## Operator Metrics
+
+The operator extends the [Flink Metric System](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/), which allows gathering metrics and exposing them to centralized monitoring solutions. The well-known [Metric Reporters](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/metric_reporters) are shipped in the operator image and are ready to use.
+
+### Slf4j
+The default metrics reporter in the operator is Slf4j. It does not require any external monitoring system, and it is enabled in the operator [Helm chart](helm/flink-operator/templates/flink-operator.yaml) by default, mainly for demonstration purposes.
+```properties
+metrics.reporter.slf4j.factory.class: org.apache.flink.metrics.slf4j.Slf4jReporterFactory
+metrics.reporter.slf4j.interval: 1 MINUTE
+```
+To use a more robust, production-grade monitoring solution, the configuration needs to be changed.
+
+### Prometheus
+The following example shows how to enable the Prometheus metric reporter:
+```properties
+metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
+metrics.reporter.prom.port: 9999
+```
+Some metric reporters, including the Prometheus reporter, need a port to be exposed on the container. This can be achieved by defining a value for the otherwise empty `metrics.port` variable,
+either in the [values.yaml](helm/flink-operator/values.yaml) file:
+```yaml
+metrics:
+  port: 9999
+```
+or by passing the option `--set metrics.port=9999` on the command line.
+
+The Prometheus Operator, among other options, provides an elegant, declarative way to specify how groups of pods should be monitored using custom resources.
+
+To install the Prometheus Operator via Helm, run:
+
+```shell
+helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+helm install prometheus prometheus-community/kube-prometheus-stack
+```
+The Grafana dashboard can be accessed through port-forwarding:
+```shell
+kubectl port-forward deployment/prometheus-grafana 3000
+```
+To enable the operator metrics in Prometheus, create a `pod-monitor.yaml` file with the following content:
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: flink-operator
+  labels:
+    release: prometheus
+spec:
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: flink-operator
+  podMetricsEndpoints:
+    - port: metrics
+```
+and apply it to your Kubernetes environment:
+```shell
+kubectl create -f pod-monitor.yaml
+```
+Once the custom resource is created in the Kubernetes environment, the operator metrics are ready to explore at [http://localhost:3000/explore](http://localhost:3000/explore).
diff --git a/helm/flink-operator/templates/flink-operator.yaml b/helm/flink-operator/templates/flink-operator.yaml
index 8c982a4..cc3db68 100644
--- a/helm/flink-operator/templates/flink-operator.yaml
+++ b/helm/flink-operator/templates/flink-operator.yaml
@@ -46,6 +46,12 @@ spec:
           image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
           imagePullPolicy: {{ .Values.image.pullPolicy }}
           command: ["/docker-entrypoint.sh", "operator"]
+          {{- if .Values.metrics.port }}
+          ports:
+            - containerPort: {{ .Values.metrics.port }}
+              name: metrics
+              protocol: TCP
+          {{- end }}
           env:
             - name: OPERATOR_NAMESPACE
               value: {{ .Values.operatorNamespace.name }}
diff --git a/helm/flink-operator/values.yaml b/helm/flink-operator/values.yaml
index 51c059f..854a36f 100644
--- a/helm/flink-operator/values.yaml
+++ b/helm/flink-operator/values.yaml
@@ -52,3 +52,7 @@ webhook:
imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""
+
+# (Optional) Exposes metrics port on the container if defined
+metrics:
+  port:
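
For reference, a sketch of what the `ports` block added to the Deployment template above should render to when `metrics.port` is set. The value 9999 is only the one used in the README example, and the indentation is shown relative to the container entry rather than to the full manifest. The port name `metrics` is what the `podMetricsEndpoints` entry of the PodMonitor refers to, and the port number has to match `metrics.reporter.prom.port` for the scrape to reach the reporter.

```yaml
# Rendered container fragment, assuming metrics.port=9999 (example value).
# With metrics.port left empty, the whole block is omitted by the template.
ports:
  - containerPort: 9999
    name: metrics
    protocol: TCP
```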