You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by "ZihanLi58 (via GitHub)" <gi...@apache.org> on 2023/02/02 23:02:48 UTC

[GitHub] [gobblin] ZihanLi58 opened a new pull request, #3635: [GOBBLIN-1778] Add house keeping thread in DagManager to periodically sync in memory state with mysql table

ZihanLi58 opened a new pull request, #3635:
URL: https://github.com/apache/gobblin/pull/3635

   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1778
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if applicable):
   Now dag managers have the assumption that it is the only process that can update mysql table and the in-memory state is always in sync with mysql. But we do notice that during the leader transforms period, it's possible that two dag manager can run concurrently and update the mysql db at the same time. 
   
   To address that, we need either add a lock to make sure only one dag manager is working at one time, or we need to have a housekeeping thread to periodically sync the in-memory state with the mysql table. After discussion, we choose to go with the later approach and we do have the assumption that GaaS submit jobs without specifying job.id, so jobs with same flow execution id will not share the staging dir and can be executed concurrently. 
   
   Besides that, during adding test, I figure out there will be NPE if we try to set the dag manager as inactive, add a small fix for that as well. 
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   Just add a single thread for housekeeping so will not affect other functions. Also add unit test to make sure we close the thread when de-active DagManager to avoid memory leak 
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3635: [GOBBLIN-1778] Add house keeping thread in DagManager to periodically sync in memory state with mysql table

Posted by "Will-Lo (via GitHub)" <gi...@apache.org>.
Will-Lo commented on code in PR #3635:
URL: https://github.com/apache/gobblin/pull/3635#discussion_r1096309836


##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -272,6 +275,9 @@ protected void startUp() {
    * @param setStatus if true, set all jobs in the dag to pending
    */
   synchronized void addDag(Dag<JobExecutionPlan> dag, boolean persist, boolean setStatus) throws IOException {
+    if (!this.isActive) {
+      return;
+    }

Review Comment:
   Would this ever cause dags to be missed? If during the process where one new dag is being added and then the leader is changed



##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -405,10 +411,15 @@ public synchronized void setActive(boolean active) {
         }
         FailedDagRetentionThread failedDagRetentionThread = new FailedDagRetentionThread(failedDagStateStore, failedDagIds, failedDagRetentionTime);
         this.scheduledExecutorPool.scheduleAtFixedRate(failedDagRetentionThread, 0, retentionPollingInterval, TimeUnit.MINUTES);
-        List<Dag<JobExecutionPlan>> dags = dagStateStore.getDags();
-        log.info("Loading " + dags.size() + " dags from dag state store");
-        for (Dag<JobExecutionPlan> dag : dags) {
-          addDag(dag, false, false);
+        loadingDagsFromDagStateStore();
+        this.houseKeepingThreadPool = Executors.newSingleThreadScheduledExecutor();
+        for (int delay = houseKeepingThreadInitialDelay; delay < 180; delay *= 2) {

Review Comment:
   make 180 a static variable named: `MAX_HOUSEKEEPING_THREAD_DELAY` ?



##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -441,6 +453,14 @@ public synchronized void setActive(boolean active) {
     }
   }
 
+  private void loadingDagsFromDagStateStore() throws IOException {

Review Comment:
   Tiny grammar nit: rename to loadDagFromDagStateStore? Since when it reads it's more declarative



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] codecov-commenter commented on pull request #3635: [GOBBLIN-1778] Add house keeping thread in DagManager to periodically sync in memory state with mysql table

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #3635:
URL: https://github.com/apache/gobblin/pull/3635#issuecomment-1414501865

   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3635](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9e80276) into [master](https://codecov.io/gh/apache/gobblin/commit/9f8ab24c8cf4bc195d23ebac0448a07ce2f91e2f?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9f8ab24) will **increase** coverage by `3.63%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3635      +/-   ##
   ============================================
   + Coverage     40.16%   43.79%   +3.63%     
   + Complexity     3544     2063    -1481     
   ============================================
     Files           791      409     -382     
     Lines         33285    17639   -15646     
     Branches       3699     2152    -1547     
   ============================================
   - Hits          13368     7725    -5643     
   + Misses        18601     9056    -9545     
   + Partials       1316      858     -458     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../gobblin/cluster/GobblinHelixTaskStateTracker.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4VGFza1N0YXRlVHJhY2tlci5qYXZh) | `62.50% <0.00%> (-6.25%)` | :arrow_down: |
   | [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `44.44% <0.00%> (-1.67%)` | :arrow_down: |
   | [...ion/google/webmaster/UrlTriePostOrderIterator.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvb2dsZS1pbmdlc3Rpb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vaW5nZXN0aW9uL2dvb2dsZS93ZWJtYXN0ZXIvVXJsVHJpZVBvc3RPcmRlckl0ZXJhdG9yLmphdmE=) | | |
   | [...che/gobblin/metrics/influxdb/InfluxDBReporter.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tbWV0cmljcy1pbmZsdXhkYi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9tZXRyaWNzL2luZmx1eGRiL0luZmx1eERCUmVwb3J0ZXIuamF2YQ==) | | |
   | [...apache/gobblin/kafka/writer/KafkaWriterHelper.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2thZmthL3dyaXRlci9LYWZrYVdyaXRlckhlbHBlci5qYXZh) | | |
   | [...org/apache/gobblin/crypto/HexKeyToStringCodec.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY3J5cHRvL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NyeXB0by9IZXhLZXlUb1N0cmluZ0NvZGVjLmphdmE=) | | |
   | [...apache/gobblin/couchbase/common/TupleDocument.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY291Y2hiYXNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvdWNoYmFzZS9jb21tb24vVHVwbGVEb2N1bWVudC5qYXZh) | | |
   | [...ava/org/apache/gobblin/http/ApacheHttpRequest.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4taHR0cC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9odHRwL0FwYWNoZUh0dHBSZXF1ZXN0LmphdmE=) | | |
   | [...ecordToEncryptedSerializedRecordConverterBase.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY3J5cHRvL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbnZlcnRlci9TZXJpYWxpemVkUmVjb3JkVG9FbmNyeXB0ZWRTZXJpYWxpemVkUmVjb3JkQ29udmVydGVyQmFzZS5qYXZh) | | |
   | [...extract/kafka/PreviousOffsetNotFoundException.java](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9rYWZrYS9QcmV2aW91c09mZnNldE5vdEZvdW5kRXhjZXB0aW9uLmphdmE=) | | |
   | ... and [375 more](https://codecov.io/gh/apache/gobblin/pull/3635?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] ZihanLi58 commented on a diff in pull request #3635: [GOBBLIN-1778] Add house keeping thread in DagManager to periodically sync in memory state with mysql table

Posted by "ZihanLi58 (via GitHub)" <gi...@apache.org>.
ZihanLi58 commented on code in PR #3635:
URL: https://github.com/apache/gobblin/pull/3635#discussion_r1096318428


##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -272,6 +275,9 @@ protected void startUp() {
    * @param setStatus if true, set all jobs in the dag to pending
    */
   synchronized void addDag(Dag<JobExecutionPlan> dag, boolean persist, boolean setStatus) throws IOException {
+    if (!this.isActive) {
+      return;
+    }

Review Comment:
   Oh good catch, I add the check in the load dag method to avoid this scenario. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] ZihanLi58 merged pull request #3635: [GOBBLIN-1778] Add house keeping thread in DagManager to periodically sync in memory state with mysql table

Posted by "ZihanLi58 (via GitHub)" <gi...@apache.org>.
ZihanLi58 merged PR #3635:
URL: https://github.com/apache/gobblin/pull/3635


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org