You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/12/02 19:31:11 UTC

[GitHub] [flink-kubernetes-operator] stevenpyzhang opened a new pull request, #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

stevenpyzhang opened a new pull request, #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468

   ## What is the purpose of the change
   
   Delete the Job manager resources first when deleting a standalone Flink cluster as part of an upgrade in order to prevent HA configmaps from getting cleaned up. As described in the ticket, I was seeing job HA data getting deleted between cluster upgrades.
   
   ## Brief change log
   
   Changed the resource clean up order from TM then JM to JM then TM for standalone clusters.
   
   ## Verifying this change
   
   Manually verified when upgrading a standalone session cluster with a SessionJob running on it, configmaps didn't get cleaned up after switching operator to deleting JM resources first.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changes to the `CustomResourceDescriptors`: no
     - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] morhidi merged pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
morhidi merged PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] stevenpyzhang commented on pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
stevenpyzhang commented on PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#issuecomment-1340148204

   Thanks for the clarification, I've opened another pr to add a unit test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
gyfora commented on code in PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#discussion_r1040194289


##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/StandaloneFlinkService.java:
##########
@@ -134,20 +134,20 @@ protected void deleteClusterInternal(ObjectMeta meta, boolean deleteHaConfigmaps
         final String clusterId = meta.getName();
         final String namespace = meta.getNamespace();
 
-        LOG.info("Deleting Flink Standalone cluster TM resources");
+        LOG.info("Deleting Flink Standalone cluster JM resources");
         kubernetesClient
                 .apps()
                 .deployments()
                 .inNamespace(namespace)
-                .withName(StandaloneKubernetesUtils.getTaskManagerDeploymentName(clusterId))
+                .withName(StandaloneKubernetesUtils.getJobManagerDeploymentName(clusterId))
                 .delete();
 
-        LOG.info("Deleting Flink Standalone cluster JM resources");
+        LOG.info("Deleting Flink Standalone cluster TM resources");

Review Comment:
   Do we need to wait until the JM is deleted before starting the TM deletion?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
gyfora commented on PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#issuecomment-1339950211

   Mockito in general is against the main Flink testing guidelines. In this case we could have probably used the Kubernetes Mockwebserver to enforce the calls and test the order .
   
   could you check that @stevenpyzhang ? If we could have a simple test that would be great 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
morhidi commented on PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#issuecomment-1337983383

   cc @usamj WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] stevenpyzhang commented on pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
stevenpyzhang commented on PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#issuecomment-1339915223

   @usamj @morhidi @gyfora thanks for reviewing the pr and helping merge it in. Were there any thoughts on including a more advanced testing framework into Operator such as Mockito? I was thinking about adding a unit test to enforce the new clean up order, but I didn't see a way of doing it with just Junit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] usamj commented on a diff in pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
usamj commented on code in PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#discussion_r1040209873


##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/StandaloneFlinkService.java:
##########
@@ -134,20 +134,20 @@ protected void deleteClusterInternal(ObjectMeta meta, boolean deleteHaConfigmaps
         final String clusterId = meta.getName();
         final String namespace = meta.getNamespace();
 
-        LOG.info("Deleting Flink Standalone cluster TM resources");
+        LOG.info("Deleting Flink Standalone cluster JM resources");
         kubernetesClient
                 .apps()
                 .deployments()
                 .inNamespace(namespace)
-                .withName(StandaloneKubernetesUtils.getTaskManagerDeploymentName(clusterId))
+                .withName(StandaloneKubernetesUtils.getJobManagerDeploymentName(clusterId))
                 .delete();
 
-        LOG.info("Deleting Flink Standalone cluster JM resources");
+        LOG.info("Deleting Flink Standalone cluster TM resources");

Review Comment:
   I don't think that is needed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink-kubernetes-operator] usamj commented on pull request #468: [FLINK-30287] Delete job manager deployment first when cleaning up standalone cluster

Posted by GitBox <gi...@apache.org>.
usamj commented on PR #468:
URL: https://github.com/apache/flink-kubernetes-operator/pull/468#issuecomment-1338337905

   This seems like a reasonable change


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org