Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/07/16 15:55:19 UTC
[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
gyfora opened a new pull request, #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320
## What is the purpose of the change
Fix a few small bugs in the recently added JOSDK metrics reporting logic.
## Brief change log
- *Use the correct class for FlinkSessionJobController to avoid a NullPointerException*
- *Avoid creating metric groups again and again for counters and histograms*
- *Remove unnecessary synchronization for the built-in histogram*
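
The second item can be illustrated with a minimal, self-contained sketch. It is not the operator's actual code: the class `CachedCounters` and its methods are hypothetical, and plain JDK types stand in for the real metric group and counter classes. It only shows the general idea of creating a metric once per name and reusing it on later events, instead of registering it again on every call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a metrics listener; not the operator's real class. */
final class CachedCounters {
    // Tracks how often a counter (and its group) was actually created.
    private final AtomicInteger creations = new AtomicInteger();
    // One counter per fully qualified metric name, created lazily and then reused.
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Assumed to be the expensive step: building the group and registering the metric.
    private LongAdder createCounter(String name) {
        creations.incrementAndGet();
        return new LongAdder();
    }

    void increment(String name) {
        // computeIfAbsent runs createCounter at most once per name.
        counters.computeIfAbsent(name, this::createCounter).increment();
    }

    long count(String name) {
        LongAdder counter = counters.get(name);
        return counter == null ? 0 : counter.sum();
    }

    int creations() {
        return creations.get();
    }

    public static void main(String[] args) {
        CachedCounters metrics = new CachedCounters();
        String name = "resource.default.example.JOSDK.Reconciliation.Count";
        for (int i = 0; i < 5; i++) {
            metrics.increment(name);
        }
        System.out.println("count=" + metrics.count(name)
                + ", creations=" + metrics.creations());
    }
}
```

With this shape, five events on the same name update one cached counter, and the creation step runs only once.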
## Verifying this change
This change is already covered by existing tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the `CustomResourceDescriptors`: no
- Core observer or reconciler logic that is regularly executed: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
gyfora commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1186225738
cc @morhidi @SteNicholas
[GitHub] [flink-kubernetes-operator] gyfora merged pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
gyfora merged PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1186838087
Hi @gyfora, I quickly tested the PR locally and found some seemingly inconsistent counters. For example, these are the JOSDK counters after an operator restart with a deployed session job. I'll dig into this later today.
```
-- Counters -------------------------------------------------------------------
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.ADDED.Count: 3
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.Count: 54
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.Count: 3
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.finished.Count: 54
```
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187151265
Looks much better now:
```
-- Counters -------------------------------------------------------------------
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.Count: 18
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.finished.Count: 18
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.cleanup.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.Count: 4
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.ADDED.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.UPDATED.Count: 2
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.DELETED.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Reconciliation.Count: 15
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Reconciliation.finished.Count: 15
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Reconciliation.cleanup.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.Count: 4
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.DELETED.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.ADDED.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.UPDATED.Count: 2
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.Count: 14
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.finished.Count: 14
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.cleanup.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.Count: 4
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.UPDATED.Count: 2
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.ADDED.Count: 1
localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.DELETED.Count: 1
-- Histograms -----------------------------------------------------------------
localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkDeployment.reconcile.resource.TimeSeconds: count=14, min=0, max=1, mean=0.07142857142857144, stddev=0.2672612419124244, p50=0.0, p75=0.0, p95=1.0, p98=1.0, p99=1.0, p999=1.0
localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkDeployment.cleanup.delete.TimeSeconds: count=1, min=3, max=3, mean=3.0, stddev=0.0, p50=3.0, p75=3.0, p95=3.0, p98=3.0, p99=3.0, p999=3.0
localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkDeployment.cleanup.finalizerNotRemoved.TimeSeconds: count=2, min=0, max=0, mean=0.0, stddev=0.0, p50=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0
localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkSessionJob.reconcile.resource.TimeSeconds: count=25, min=0, max=2, mean=0.15999999999999998, stddev=0.5537749241945382, p50=0.0, p75=0.0, p95=2.0, p98=2.0, p99=2.0, p999=2.0
localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkSessionJob.cleanup.delete.TimeSeconds: count=2, min=0, max=0, mean=0.0, stddev=0.0, p50=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0
=========================== Finished metrics report ===========================
```
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187162316
We could also consider implementing the CR counters as actual counters rather than gauges, and adding an option to turn them on/off. But this might be out of scope for this PR :)
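
To make the counter/gauge distinction concrete, here is a rough sketch in plain Java. Everything in it is an assumption for illustration: `ResourceCountSketch` and its methods are hypothetical, and `Supplier` and `LongAdder` stand in for real gauge and counter types. A gauge computes its value on demand from tracked state, while a counter is adjusted as each add or delete event arrives.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical illustration only; not the operator's metric implementation. */
final class ResourceCountSketch {
    // Names of the custom resources currently known to the sketch.
    private final Set<String> tracked = ConcurrentHashMap.newKeySet();
    // Gauge style: the value is computed when a reporter asks for it.
    private final Supplier<Integer> gauge = tracked::size;
    // Counter style: the value is adjusted as each event arrives.
    private final LongAdder counter = new LongAdder();

    void onAdded(String name) {
        if (tracked.add(name)) {
            counter.increment();
        }
    }

    void onDeleted(String name) {
        if (tracked.remove(name)) {
            counter.decrement();
        }
    }

    int gaugeValue() {
        return gauge.get();
    }

    long counterValue() {
        return counter.sum();
    }

    public static void main(String[] args) {
        ResourceCountSketch sketch = new ResourceCountSketch();
        sketch.onAdded("basic-session-job-example");
        sketch.onAdded("basic-session-job-example2");
        sketch.onDeleted("basic-session-job-example");
        System.out.println("gauge=" + sketch.gaugeValue()
                + ", counter=" + sketch.counterValue());
    }
}
```

Both views agree here because every event passes through the same two methods. The practical difference is when the work happens: at report time for the gauge, at event time for the counter.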
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187156336
Thanks for adding an option to turn the JVM metrics off.
[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time
Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187172679
+1 LGTM