Posted to commits@flink.apache.org by GitBox <gi...@apache.org> on 2022/03/10 21:01:21 UTC

[GitHub] [flink-kubernetes-operator] gyfora opened a new pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

gyfora opened a new pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51


   Make config names cleaner and reduce reschedule delays while the job is deploying.
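
As a rough illustration of the intent (not the operator's actual reconciler code), the sketch below picks a shorter reschedule delay while a deployment is still in progress and falls back to the regular reconcile interval afterwards. The field names `reconcileIntervalSeconds` and `progressCheckIntervalSeconds` come from the diff in this PR; the class `RescheduleSketch`, the `Phase` enum, and the values 60 and 10 are assumptions made up for this example.

```java
import java.time.Duration;

// Hypothetical sketch; class, enum, and values are illustrative assumptions.
final class RescheduleSketch {

    // Made-up phases, not the operator's real status type.
    enum Phase { DEPLOYING, READY }

    // Field names mirror FlinkOperatorConfiguration in the diff.
    private final int reconcileIntervalSeconds;
    private final int progressCheckIntervalSeconds;

    RescheduleSketch(int reconcileIntervalSeconds, int progressCheckIntervalSeconds) {
        this.reconcileIntervalSeconds = reconcileIntervalSeconds;
        this.progressCheckIntervalSeconds = progressCheckIntervalSeconds;
    }

    // While deploying, check progress more often; otherwise use the normal interval.
    Duration rescheduleDelay(Phase phase) {
        int seconds =
                phase == Phase.DEPLOYING
                        ? progressCheckIntervalSeconds
                        : reconcileIntervalSeconds;
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        RescheduleSketch sketch = new RescheduleSketch(60, 10); // assumed example values
        System.out.println(sketch.rescheduleDelay(Phase.DEPLOYING));
        System.out.println(sketch.rescheduleDelay(Phase.READY));
    }
}
```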


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink-kubernetes-operator] gyfora commented on a change in pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
gyfora commented on a change in pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51#discussion_r824423169



##########
File path: flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkOperatorConfiguration.java
##########
@@ -26,26 +26,33 @@
 @Value
 public class FlinkOperatorConfiguration {
 
-    int reconcileIntervalInSec;
-
-    int portCheckIntervalInSec;
-
-    int savepointTriggerGracePeriodInSec;
+    int reconcileIntervalSeconds;
+    int progressCheckIntervalSeconds;
+    int restApiReadyDelaySeconds;
+    int savepointTriggerGracePeriodSeconds;
 
     public static FlinkOperatorConfiguration fromConfiguration(Configuration operatorConfig) {
-        int reconcileIntervalInSec =
+        int reconcileIntervalSeconds =
                 operatorConfig.getInteger(
                         OperatorConfigOptions.OPERATOR_RECONCILER_RESCHEDULE_INTERVAL_IN_SEC);
-        int portCheckIntervalInSec =
+
+        int restApiReadyDelaySeconds =

Review comment:
       I agree, the current check is not very robust and works only in a very specific scenario. Let's open a ticket to track this effort and see if we can come up with something better :) 







[GitHub] [flink-kubernetes-operator] tweise commented on a change in pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
tweise commented on a change in pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51#discussion_r824373450



##########
File path: flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkOperatorConfiguration.java
##########
@@ -26,26 +26,33 @@
 @Value
 public class FlinkOperatorConfiguration {
 
-    int reconcileIntervalInSec;
-
-    int portCheckIntervalInSec;
-
-    int savepointTriggerGracePeriodInSec;
+    int reconcileIntervalSeconds;
+    int progressCheckIntervalSeconds;
+    int restApiReadyDelaySeconds;
+    int savepointTriggerGracePeriodSeconds;
 
     public static FlinkOperatorConfiguration fromConfiguration(Configuration operatorConfig) {
-        int reconcileIntervalInSec =
+        int reconcileIntervalSeconds =
                 operatorConfig.getInteger(
                         OperatorConfigOptions.OPERATOR_RECONCILER_RESCHEDULE_INTERVAL_IN_SEC);
-        int portCheckIntervalInSec =
+
+        int restApiReadyDelaySeconds =

Review comment:
       Why would this affect session clusters? There is a delay between the port becoming available and the actual REST server being ready. Absent any other reliable technique to check readiness, we cover that gap with the (now configurable!) delay.
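
A minimal sketch of the "port is open, then wait a fixed delay" approach described above, assuming the caller records when the port first became reachable. Only `restApiReadyDelaySeconds` is taken from the diff; the class `PortThenDelaySketch`, its `Status` enum, and the `observe` method are hypothetical names and not the operator's real code.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch; names other than restApiReadyDelaySeconds are assumptions.
final class PortThenDelaySketch {

    // Illustrative stand-in for a deployment status type.
    enum Status { DEPLOYING, DEPLOYED_NOT_READY, READY }

    // portOpenSince is null while the REST port is not reachable yet.
    static Status observe(Instant portOpenSince, Instant now, int restApiReadyDelaySeconds) {
        if (portOpenSince == null) {
            return Status.DEPLOYING;
        }
        Duration waited = Duration.between(portOpenSince, now);
        Duration required = Duration.ofSeconds(restApiReadyDelaySeconds);
        // The port answers, but the REST server may still be starting up.
        if (waited.compareTo(required) < 0) {
            return Status.DEPLOYED_NOT_READY;
        }
        // Note: this is a time-based guess, not proof that the REST API responds.
        return Status.READY;
    }

    public static void main(String[] args) {
        Instant opened = Instant.now().minusSeconds(3);
        System.out.println(observe(opened, Instant.now(), 10));
    }
}
```

The last branch is the weak spot: the status flips to ready purely because time has passed, which is why the delay has to be tuned per environment.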







[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a change in pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
wangyang0918 commented on a change in pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51#discussion_r824388470



##########
File path: flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkOperatorConfiguration.java
##########
@@ -26,26 +26,33 @@
 @Value
 public class FlinkOperatorConfiguration {
 
-    int reconcileIntervalInSec;
-
-    int portCheckIntervalInSec;
-
-    int savepointTriggerGracePeriodInSec;
+    int reconcileIntervalSeconds;
+    int progressCheckIntervalSeconds;
+    int restApiReadyDelaySeconds;
+    int savepointTriggerGracePeriodSeconds;
 
     public static FlinkOperatorConfiguration fromConfiguration(Configuration operatorConfig) {
-        int reconcileIntervalInSec =
+        int reconcileIntervalSeconds =
                 operatorConfig.getInteger(
                         OperatorConfigOptions.OPERATOR_RECONCILER_RESCHEDULE_INTERVAL_IN_SEC);
-        int portCheckIntervalInSec =
+
+        int restApiReadyDelaySeconds =

Review comment:
       For example, if something is wrong or very slow with the leader election, `JobManagerDeploymentStatus` is `READY` but the session cluster is not ready to accept job submissions. What I mean is that we should update `JobManagerDeploymentStatus` to `READY` only when the Flink cluster is actually working.
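
One way to picture this suggestion is to gate the ready state on a real probe instead of elapsed time. The sketch below is only an assumption about how that could look: `ProbeBasedReadySketch`, its `Status` enum, and the `clusterResponds` probe are hypothetical, and what the probe would actually call (for example a REST request that only succeeds after leader election) is deliberately left open.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch; not the operator's real observer logic.
final class ProbeBasedReadySketch {

    // Illustrative stand-in for a deployment status type.
    enum Status { DEPLOYING, DEPLOYED_NOT_READY, READY }

    // clusterResponds is an assumed probe that returns true only when the
    // cluster can really serve requests, e.g. accept a job submission.
    static Status observe(boolean portOpen, BooleanSupplier clusterResponds) {
        if (!portOpen) {
            return Status.DEPLOYING;
        }
        // The probe is only consulted once the port is reachable.
        return clusterResponds.getAsBoolean() ? Status.READY : Status.DEPLOYED_NOT_READY;
    }

    public static void main(String[] args) {
        // Simulates a slow leader election: port is open, probe still fails.
        System.out.println(observe(true, () -> false));
    }
}
```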







[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a change in pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
wangyang0918 commented on a change in pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51#discussion_r824359419



##########
File path: flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkOperatorConfiguration.java
##########
@@ -26,26 +26,33 @@
 @Value
 public class FlinkOperatorConfiguration {
 
-    int reconcileIntervalInSec;
-
-    int portCheckIntervalInSec;
-
-    int savepointTriggerGracePeriodInSec;
+    int reconcileIntervalSeconds;
+    int progressCheckIntervalSeconds;
+    int restApiReadyDelaySeconds;
+    int savepointTriggerGracePeriodSeconds;
 
     public static FlinkOperatorConfiguration fromConfiguration(Configuration operatorConfig) {
-        int reconcileIntervalInSec =
+        int reconcileIntervalSeconds =
                 operatorConfig.getInteger(
                         OperatorConfigOptions.OPERATOR_RECONCILER_RESCHEDULE_INTERVAL_IN_SEC);
-        int portCheckIntervalInSec =
+
+        int restApiReadyDelaySeconds =

Review comment:
       Out of scope for this PR.
   
   To be honest, I do not like the `restApiReadyDelaySeconds` configuration. It means that `JobManagerDeploymentStatus.READY` does not guarantee the cluster is really ready to accept REST API calls. This will be a problem if we are running a session cluster.







[GitHub] [flink-kubernetes-operator] gyfora merged pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
gyfora merged pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51


   





[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
gyfora commented on pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51#issuecomment-1064504062


   cc @tweise @wangyang0918 





[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a change in pull request #51: [FLINK-26572] Improve reconcile reschedule configs and defaults

Posted by GitBox <gi...@apache.org>.
wangyang0918 commented on a change in pull request #51:
URL: https://github.com/apache/flink-kubernetes-operator/pull/51#discussion_r824573440



##########
File path: flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkOperatorConfiguration.java
##########
@@ -26,26 +26,33 @@
 @Value
 public class FlinkOperatorConfiguration {
 
-    int reconcileIntervalInSec;
-
-    int portCheckIntervalInSec;
-
-    int savepointTriggerGracePeriodInSec;
+    int reconcileIntervalSeconds;
+    int progressCheckIntervalSeconds;
+    int restApiReadyDelaySeconds;
+    int savepointTriggerGracePeriodSeconds;
 
     public static FlinkOperatorConfiguration fromConfiguration(Configuration operatorConfig) {
-        int reconcileIntervalInSec =
+        int reconcileIntervalSeconds =
                 operatorConfig.getInteger(
                         OperatorConfigOptions.OPERATOR_RECONCILER_RESCHEDULE_INTERVAL_IN_SEC);
-        int portCheckIntervalInSec =
+
+        int restApiReadyDelaySeconds =

Review comment:
       I have created a ticket FLINK-26605 to track this effort.



