Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/07/16 15:55:19 UTC

[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request, #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

gyfora opened a new pull request, #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320

   ## What is the purpose of the change
   
   Fix a few smaller bugs/issues with the JOSDK metrics reporting logic that was added recently.
   
   ## Brief change log
   
     - *Use the correct class for FlinkSessionJobController to avoid a NullPointerException*
     - *Avoid creating metric groups again and again for counters and histograms*
     - *Remove unnecessary synchronization for the built-in histogram*
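   
   A minimal sketch of the idea behind the second item, assuming the fix amounts to caching each metric by its full name so the registration path runs only once per name. The class `CachedCounters`, its `counter` method, and the `registrations` field are hypothetical stand-ins for illustration, not the operator's actual metric classes or the code in this PR:
   
   ```java
   import java.util.Map;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.atomic.AtomicInteger;
   import java.util.concurrent.atomic.LongAdder;

   // Hypothetical stand-in for a metrics listener; not the operator's real classes.
   public class CachedCounters {
       // Counts how often the (potentially expensive) registration path runs.
       final AtomicInteger registrations = new AtomicInteger();

       // One counter per fully qualified metric name, created lazily.
       private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

       // Returns the cached counter for the given scope, registering it on first use only.
       LongAdder counter(String... scope) {
           String key = String.join(".", scope);
           return counters.computeIfAbsent(key, k -> {
               registrations.incrementAndGet();
               return new LongAdder();
           });
       }

       public static void main(String[] args) {
           CachedCounters metrics = new CachedCounters();
           for (int i = 0; i < 3; i++) {
               metrics.counter("JOSDK", "Reconciliation", "Count").increment();
           }
           // Three increments, but the counter was registered a single time.
           System.out.println(metrics.counter("JOSDK", "Reconciliation", "Count").sum());
           System.out.println(metrics.registrations.get());
       }
   }
   ```
   
   `ConcurrentHashMap.computeIfAbsent` applies the mapping function at most once per key, so the lookup is thread-safe without an explicit lock around the registration.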
   
   ## Verifying this change
   
   This change is already covered by existing tests
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., are there any changes to the `CustomResourceDescriptors`: no
     - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
gyfora commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1186225738

   cc @morhidi @SteNicholas 




[GitHub] [flink-kubernetes-operator] gyfora merged pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
gyfora merged PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320




[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1186838087

   Hi @gyfora, I quickly tested the PR locally and found some seemingly inconsistent counters. For example, these are the JOSDK counters after an operator restart with a deployed session job. I'll dig into this later today.
   ```
   -- Counters -------------------------------------------------------------------
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.ADDED.Count: 3
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.Count: 54
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.Count: 3
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.finished.Count: 54
   ```
   




[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187151265

   Looks much better now:
   ```
   
   -- Counters -------------------------------------------------------------------
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.Count: 18
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.finished.Count: 18
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Reconciliation.cleanup.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.Count: 4
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.ADDED.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.UPDATED.Count: 2
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-deployment-example.JOSDK.Resource.Event.DELETED.Count: 1
   
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Reconciliation.Count: 15
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Reconciliation.finished.Count: 15
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Reconciliation.cleanup.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.Count: 4
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.DELETED.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.ADDED.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example.JOSDK.Resource.Event.UPDATED.Count: 2
   
   
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.Count: 14
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.finished.Count: 14
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Reconciliation.cleanup.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.Count: 4
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.UPDATED.Count: 2
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.ADDED.Count: 1
   localhost.k8soperator.default.flink-kubernetes-operator.resource.default.basic-session-job-example2.JOSDK.Resource.Event.DELETED.Count: 1
   
   -- Histograms -----------------------------------------------------------------
   localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkDeployment.reconcile.resource.TimeSeconds: count=14, min=0, max=1, mean=0.07142857142857144, stddev=0.2672612419124244, p50=0.0, p75=0.0, p95=1.0, p98=1.0, p99=1.0, p999=1.0
   localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkDeployment.cleanup.delete.TimeSeconds: count=1, min=3, max=3, mean=3.0, stddev=0.0, p50=3.0, p75=3.0, p95=3.0, p98=3.0, p99=3.0, p999=3.0
   localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkDeployment.cleanup.finalizerNotRemoved.TimeSeconds: count=2, min=0, max=0, mean=0.0, stddev=0.0, p50=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0
   localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkSessionJob.reconcile.resource.TimeSeconds: count=25, min=0, max=2, mean=0.15999999999999998, stddev=0.5537749241945382, p50=0.0, p75=0.0, p95=2.0, p98=2.0, p99=2.0, p999=2.0
   localhost.k8soperator.default.flink-kubernetes-operator.system.JOSDK.FlinkSessionJob.cleanup.delete.TimeSeconds: count=2, min=0, max=0, mean=0.0, stddev=0.0, p50=0.0, p75=0.0, p95=0.0, p98=0.0, p99=0.0, p999=0.0
   =========================== Finished metrics report ===========================
   ```
   




[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187162316

   We could also consider implementing the CR counters as actual counters rather than gauges, and adding an option to turn them on and off. But this might be out of scope for this PR :)




[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187156336

   Thanks for adding an option to turn the JVM metrics off




[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #320: [FLINK-27914] Fix SessionJobController name + avoid creating new metric groups all the time

Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #320:
URL: https://github.com/apache/flink-kubernetes-operator/pull/320#issuecomment-1187172679

   +1 LGTM

