You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by "meethngala (via GitHub)" <gi...@apache.org> on 2023/02/08 02:53:16 UTC

[GitHub] [gobblin] meethngala opened a new pull request, #3639: [GOBBLIN-1782] Fix Merge State for Flow Pending Resume statuses

meethngala opened a new pull request, #3639:
URL: https://github.com/apache/gobblin/pull/3639

   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-XXX
       - https://issues.apache.org/jira/browse/GOBBLIN-1782
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if applicable):
   - [ ] Flow statuses in pending_resume state were not getting executed since we incorrectly update the merge state for such flows
   - [ ] Thus, we now detect such flows and update their merge state so that they can proceed to their next state in the DAG
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   - [ ] Added unit test for the fix
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3639: [GOBBLIN-1782] Fix Merge State for Flow Pending Resume statuses

Posted by "Will-Lo (via GitHub)" <gi...@apache.org>.
Will-Lo commented on code in PR #3639:
URL: https://github.com/apache/gobblin/pull/3639#discussion_r1100616376


##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java:
##########
@@ -251,10 +251,17 @@ static void addJobStatusToStateStore(org.apache.gobblin.configuration.State jobS
         int currentGeneration = jobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_GENERATION_FIELD, previousGeneration);
         int previousAttempts = previousJobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_ATTEMPTS_FIELD, 1);
         int currentAttempts = jobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_ATTEMPTS_FIELD, previousAttempts);
+        // Verify if the current job status is flow status. If yes, we check for its current execution status to be PENDING_RESUME (limiting to just resume flow statuses)
+        // When the above two conditions satisfy, we NEED NOT check for the out-of-order events since GaaS would manage the lifecycle of these events
+        // Hence, we update the merge state so that the flow can proceed with its execution
+        if (jobName != null && jobGroup != null
+            && jobName.equals(JobStatusRetriever.NA_KEY) && jobGroup.equals(JobStatusRetriever.NA_KEY) && currentStatus.equals(ExecutionStatus.PENDING_RESUME.name())) {

Review Comment:
   can we abstract this check into a function similar to how JobStatusRetriever has this check?



##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java:
##########
@@ -251,10 +251,17 @@ static void addJobStatusToStateStore(org.apache.gobblin.configuration.State jobS
         int currentGeneration = jobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_GENERATION_FIELD, previousGeneration);
         int previousAttempts = previousJobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_ATTEMPTS_FIELD, 1);
         int currentAttempts = jobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_ATTEMPTS_FIELD, previousAttempts);
+        // Verify if the current job status is flow status. If yes, we check for its current execution status to be PENDING_RESUME (limiting to just resume flow statuses)
+        // When the above two conditions satisfy, we NEED NOT check for the out-of-order events since GaaS would manage the lifecycle of these events
+        // Hence, we update the merge state so that the flow can proceed with its execution
+        if (jobName != null && jobGroup != null
+            && jobName.equals(JobStatusRetriever.NA_KEY) && jobGroup.equals(JobStatusRetriever.NA_KEY) && currentStatus.equals(ExecutionStatus.PENDING_RESUME.name())) {

Review Comment:
   Also, since the side effect of this change is similar to the else case, would it be simpler for us to append to the previous if statement and make an initial check that the currentStatus != a flow status and pending resume?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] codecov-commenter commented on pull request #3639: [GOBBLIN-1782] Fix Merge State for Flow Pending Resume statuses

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #3639:
URL: https://github.com/apache/gobblin/pull/3639#issuecomment-1421916971

   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3639](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6c258c5) into [master](https://codecov.io/gh/apache/gobblin/commit/69f3f9c33c9679d723c447fbe626db8f1b7afa9e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (69f3f9c) will **decrease** coverage by `2.79%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3639      +/-   ##
   ============================================
   - Coverage     46.58%   43.80%   -2.79%     
   + Complexity    10672     2064    -8608     
   ============================================
     Files          2133      409    -1724     
     Lines         83557    17639   -65918     
     Branches       9290     2152    -7138     
   ============================================
   - Hits          38928     7726   -31202     
   + Misses        41068     9055   -32013     
   + Partials       3561      858    -2703     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `62.22% <0.00%> (-0.31%)` | :arrow_down: |
   | [...blin/service/monitoring/KafkaJobStatusMonitor.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9uaXRvcmluZy9LYWZrYUpvYlN0YXR1c01vbml0b3IuamF2YQ==) | | |
   | [...aset/comparators/URNLexicographicalComparator.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YXNldC9jb21wYXJhdG9ycy9VUk5MZXhpY29ncmFwaGljYWxDb21wYXJhdG9yLmphdmE=) | | |
   | [...iance/retention/HivePartitionRetentionVersion.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tY29tcGxpYW5jZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9jb21wbGlhbmNlL3JldGVudGlvbi9IaXZlUGFydGl0aW9uUmV0ZW50aW9uVmVyc2lvbi5qYXZh) | | |
   | [...mapreduce/orc/GobblinOrcMapreduceRecordWriter.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vbWFwcmVkdWNlL29yYy9Hb2JibGluT3JjTWFwcmVkdWNlUmVjb3JkV3JpdGVyLmphdmE=) | | |
   | [...ent/retention/action/MultiAccessControlAction.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3JldGVudGlvbi9hY3Rpb24vTXVsdGlBY2Nlc3NDb250cm9sQWN0aW9uLmphdmE=) | | |
   | [...r/ProduceRateAndLagBasedWorkUnitSizeEstimator.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9rYWZrYS93b3JrdW5pdC9wYWNrZXIvUHJvZHVjZVJhdGVBbmRMYWdCYXNlZFdvcmtVbml0U2l6ZUVzdGltYXRvci5qYXZh) | | |
   | [...urce/extractor/extract/restapi/RestApiCommand.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NvdXJjZS9leHRyYWN0b3IvZXh0cmFjdC9yZXN0YXBpL1Jlc3RBcGlDb21tYW5kLmphdmE=) | | |
   | [...il/limiter/stressTest/FixedOperationsStressor.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UtY2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvbGltaXRlci9zdHJlc3NUZXN0L0ZpeGVkT3BlcmF0aW9uc1N0cmVzc29yLmphdmE=) | | |
   | [...gobblin/service/modules/spec/JobExecutionPlan.java](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9zcGVjL0pvYkV4ZWN1dGlvblBsYW4uamF2YQ==) | | |
   | ... and [1716 more](https://codecov.io/gh/apache/gobblin/pull/3639?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] meethngala commented on a diff in pull request #3639: [GOBBLIN-1782] Fix Merge State for Flow Pending Resume statuses

Posted by "meethngala (via GitHub)" <gi...@apache.org>.
meethngala commented on code in PR #3639:
URL: https://github.com/apache/gobblin/pull/3639#discussion_r1100689394


##########
gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java:
##########
@@ -251,10 +251,17 @@ static void addJobStatusToStateStore(org.apache.gobblin.configuration.State jobS
         int currentGeneration = jobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_GENERATION_FIELD, previousGeneration);
         int previousAttempts = previousJobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_ATTEMPTS_FIELD, 1);
         int currentAttempts = jobStatus.getPropAsInt(TimingEvent.FlowEventConstants.CURRENT_ATTEMPTS_FIELD, previousAttempts);
+        // Verify if the current job status is flow status. If yes, we check for its current execution status to be PENDING_RESUME (limiting to just resume flow statuses)
+        // When the above two conditions satisfy, we NEED NOT check for the out-of-order events since GaaS would manage the lifecycle of these events
+        // Hence, we update the merge state so that the flow can proceed with its execution
+        if (jobName != null && jobGroup != null
+            && jobName.equals(JobStatusRetriever.NA_KEY) && jobGroup.equals(JobStatusRetriever.NA_KEY) && currentStatus.equals(ExecutionStatus.PENDING_RESUME.name())) {

Review Comment:
   yeah for sure! I have abstracted the logic to check for flow statuses and in pending resume in my latest commit. Regarding the second comment, I have added the logic for currentStatus != pending resume in the abstracted method itself... so that in case we want to extend it to other status types in the future we can just do it in the method and not change the if condition often... wdyt?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo merged pull request #3639: [GOBBLIN-1782] Fix Merge State for Flow Pending Resume statuses

Posted by "Will-Lo (via GitHub)" <gi...@apache.org>.
Will-Lo merged PR #3639:
URL: https://github.com/apache/gobblin/pull/3639


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org