Posted to dev@gobblin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/03/11 23:37:00 UTC

[jira] [Work logged] (GOBBLIN-1621) Make HelixRetriggeringJobCallable emit job skip event when job is dropped due to previous job is running

     [ https://issues.apache.org/jira/browse/GOBBLIN-1621?focusedWorklogId=740331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-740331 ]

ASF GitHub Bot logged work on GOBBLIN-1621:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Mar/22 23:36
            Start Date: 11/Mar/22 23:36
    Worklog Time Spent: 10m 
      Work Description: ZihanLi58 opened a new pull request #3478:
URL: https://github.com/apache/gobblin/pull/3478


   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1621
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if applicable):
   Currently, when concurrency is enabled on Gobblin service but disabled on the Gobblin cluster, the Gobblin cluster manager silently drops a new job if the previous job is still running. Because Gobblin service receives no update, it assumes the job is still waiting to start. Once the job start SLA is exceeded, it cancels the job, which in turn cancels the previous, still-running job. As a result, a long-running job can never finish.
   
    
   
   The solution is to emit a job skip event when a job is dropped. The job status monitoring service listens for this event and marks the job status as cancelled.
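
A minimal sketch of the intended flow, not the code in this PR. Every name here (`RetriggeringLauncher`, `StatusMonitor`, `JobEvent`, `EventType`, `JobStatus`) is a hypothetical stand-in rather than a Gobblin class, and the sketch assumes status is tracked per execution so that cancelling the skipped execution leaves the earlier one untouched.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// All names below are hypothetical stand-ins, not Gobblin's actual classes.
enum EventType { JOB_START, JOB_SKIPPED }

enum JobStatus { PENDING, RUNNING, CANCELLED }

class JobEvent {
    final String jobName;
    final String executionId;
    final EventType type;

    JobEvent(String jobName, String executionId, EventType type) {
        this.jobName = jobName;
        this.executionId = executionId;
        this.type = type;
    }
}

// Cluster side: refuses to run a job concurrently with itself, but reports the drop.
class RetriggeringLauncher {
    private final Set<String> runningJobs = new HashSet<>();
    private final Consumer<JobEvent> emitter;

    RetriggeringLauncher(Consumer<JobEvent> emitter) {
        this.emitter = emitter;
    }

    // Returns true if the execution started, false if it was skipped.
    boolean trigger(String jobName, String executionId) {
        if (!runningJobs.add(jobName)) {
            // A previous execution is still running: emit a skip event
            // instead of dropping the new execution silently.
            emitter.accept(new JobEvent(jobName, executionId, EventType.JOB_SKIPPED));
            return false;
        }
        emitter.accept(new JobEvent(jobName, executionId, EventType.JOB_START));
        return true;
    }
}

// Service side: tracks status per execution and reacts to the skip event.
class StatusMonitor implements Consumer<JobEvent> {
    private final Map<String, JobStatus> statusByExecution = new HashMap<>();

    // Registers an execution that the service has submitted and is waiting on.
    void expect(String executionId) {
        statusByExecution.put(executionId, JobStatus.PENDING);
    }

    @Override
    public void accept(JobEvent event) {
        switch (event.type) {
            case JOB_START:
                statusByExecution.put(event.executionId, JobStatus.RUNNING);
                break;
            case JOB_SKIPPED:
                // Only the skipped execution is cancelled; the running one is untouched.
                statusByExecution.put(event.executionId, JobStatus.CANCELLED);
                break;
        }
    }

    JobStatus statusOf(String executionId) {
        return statusByExecution.get(executionId);
    }
}
```

In this sketch, triggering the same job twice leaves the first execution `RUNNING` and moves the second straight to `CANCELLED`, so the second execution never sits in `PENDING` until a start SLA expires.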
   
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   unit test
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   



Issue Time Tracking
-------------------

            Worklog Id:     (was: 740331)
    Remaining Estimate: 0h
            Time Spent: 10m

> Make HelixRetriggeringJobCallable emit job skip event when job is dropped due to previous job is running
> --------------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1621
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1621
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Zihan Li
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when concurrency is enabled on Gobblin service but disabled on the Gobblin cluster, the Gobblin cluster manager silently drops a new job if the previous job is still running. Because Gobblin service receives no update, it assumes the job is still waiting to start. Once the job start SLA is exceeded, it cancels the job, which in turn cancels the previous, still-running job. As a result, a long-running job can never finish.
>  
> The solution is to emit a job skip event when a job is dropped. The job status monitoring service listens for this event and marks the job status as cancelled.


