You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@helix.apache.org by GitBox <gi...@apache.org> on 2020/01/24 00:03:18 UTC

[GitHub] [helix] alirezazamani opened a new pull request #705: The workflow garbage collection is added

alirezazamani opened a new pull request #705: The workflow garbage collection is added
URL: https://github.com/apache/helix/pull/705
 
 
   ### Issues
   - [x] My PR addresses the following Helix issues and references them in the PR title:
   Fixes #704 
   
   ### Description
   - [x] Here are some details about my PR, including screenshots of any UI changes:
   This PR contains the implementation of workflow context garbage collection which is added to scan all of the workflow context, and delete them if workflow config does not exists.
   Test is added which checks the functionality of workflow garbage collection.
   
   
   ### Tests
   
   - [x] The following is the result of the "mvn test" command on the appropriate module:
   New tests have been added.
   TestWorkflowContextWithoutConfig.TestWorkflowContextGarbageCollection
   
   Test Result 1: helix-core: mvn test
   [INFO] Tests run: 898, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3,572.315 s - in TestSuite
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 898, Failures: 0, Errors: 0, Skipped: 0
   [INFO] 
   [INFO] ------------------------------------------------------------------------
   [INFO] BUILD SUCCESS
   [INFO] ------------------------------------------------------------------------
   [INFO] Total time:  59:37 min
   [INFO] Finished at: 2020-01-23T12:47:31-08:00
   [INFO] ------------------------------------------------------------------------
   
   ### Commits
   
   - [x] My commits all reference appropriate Apache Helix GitHub issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Code Quality
   - [x] My diff has been formatted using helix-style.xml
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r371064701
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 
 Review comment:
   Since controller attempts to delete the workflow context after the successful deletion of the workflow config and IdealState, there is no need to consider cases where IS existed without contexts/configs. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r371063896
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
 
 Review comment:
   Fixed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on issue #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on issue #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#issuecomment-581703091
 
 
   This PR is ready to be merged, approved by @narendly .
   
   Title: 
   Add workflow garbage collector
   
   Body:
   The workflow garbage collection is added which scans all the workflow context,
   and delete them if workflow config does not exists.
   Test is added which checks the functionality of workflow garbage collection.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r373732379
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 
 Review comment:
   Let's consider two scenarios: 
   
   - Case 1 where workflow config has been deleted but workflow context and IS have not been removed. In this case this PR addresses this scenario and deletes both of them. Note that in this PR context won't be removed unless IS is removed. If IS has not been successfully removed, we will try to delete it again in next pipeline.
   
   - Case 2 where workflow config and context have been removed and IS has not been removed. I am arguing that this is not possible since context is always the last one that controller removes. 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r371064824
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
+      final HelixManager manager) {
+    // Garbage collections for conditions where workflow context is existed but config is missing.
 
 Review comment:
   fixed.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r370501499
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
+      final HelixManager manager) {
+    // Garbage collections for conditions where workflow context is existed but config is missing.
+    Map<String, ZNRecord> contexts = dataProvider.getContexts();
+    for (Map.Entry<String, ZNRecord> entry : contexts.entrySet()) {
 
 Review comment:
   
   
   When you do this, you'd want to take a snapshot of the entries because this might cause a ConcurrentModificationException.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r371064935
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
+      final HelixManager manager) {
+    // Garbage collections for conditions where workflow context is existed but config is missing.
+    Map<String, ZNRecord> contexts = dataProvider.getContexts();
+    for (Map.Entry<String, ZNRecord> entry : contexts.entrySet()) {
 
 Review comment:
   Very good point. Fixed. Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly merged pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly merged pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r370501381
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
 
 Review comment:
   
   
   "workflow context of the workflow whose workflow config does not exist"
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r373732379
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 
 Review comment:
   Let's consider two scenarios: 
   
   - Case 1 where workflow config has been deleted but workflow context and IS have not been removed. In this case this PR addresses this scenario and deletes both of them. Note that in this PR context won't be removed unless IS is removed. If IS has not been successfully removed, we will try to delete it again in next pipeline.
   
   - Case 2 where workflow config and context have been removed and IS has not been removed. I am arguing that this is not possible since context is always the last one that controller removes. 
   
   PS: I don't know any way that controller can find all of the IdealStates that are associated to workflows and not the regular resources (we can do that for context based on the Znode ID). If we have such method, we can also loop through the IdealStates in addition to contexts.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r370501464
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
+      final HelixManager manager) {
+    // Garbage collections for conditions where workflow context is existed but config is missing.
 
 Review comment:
   
   
   is existed -> exists
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r373682337
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 
 Review comment:
   How could be sure that both WorkflowConfig and IS were deleted successfully? Is it possible that only config was removed and IS didn't get removed?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] alirezazamani commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
alirezazamani commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r371063962
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
 
 Review comment:
   Done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r370501424
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
+   * Workflow Context if Workflow Config is missing.
+   * @param dataProvider
+   * @param manager
+   */
+  public static void workflowGarbageCollection(WorkflowControllerDataProvider dataProvider,
 
 Review comment:
   
   
   Here we are doing a loop based on workflow contexts. Could there be cases where IdealStates exist but contexts/configs do not?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org


[GitHub] [helix] narendly commented on a change in pull request #705: Add workflow context garbage collection

Posted by GitBox <gi...@apache.org>.
narendly commented on a change in pull request #705: Add workflow context garbage collection 
URL: https://github.com/apache/helix/pull/705#discussion_r370501350
 
 

 ##########
 File path: helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
 ##########
 @@ -1036,6 +1037,46 @@ public static void purgeExpiredJobs(String workflow, WorkflowConfig workflowConf
     setNextJobPurgeTime(workflow, currentTime, purgeInterval, rebalanceScheduler, manager);
   }
 
+  /**
+   * The function that loops through the all existed workflow contexts and removes IdealState and
 
 Review comment:
   
   
   existed -> existing
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@helix.apache.org
For additional commands, e-mail: reviews-help@helix.apache.org