Posted to dev@crunch.apache.org by "Chao Shi (JIRA)" <ji...@apache.org> on 2013/03/10 17:05:12 UTC

[jira] [Updated] (CRUNCH-172) Refine synchronization mechanism in CrunchJobControl

     [ https://issues.apache.org/jira/browse/CRUNCH-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Shi updated CRUNCH-172:
----------------------------

    Attachment: crunch-172.patch

Remove the background thread from CrunchJobControl and let the monitor thread in MRExecutor call it instead.
    
Besides this, there are some small changes:
- Use exponential backoff when querying job status. This makes local integration tests run much faster on hadoop2.
- Remove suspend/resume support, because it is currently unused and complicates synchronization.
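
To illustrate the backoff idea, here is a minimal sketch. It is not the code from crunch-172.patch: the class BackoffPoller, its method names, and the delay values are hypothetical. It doubles the wait between status checks up to a cap, so a job that finishes quickly is noticed after a few milliseconds rather than after a fixed long interval.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not part of Crunch: polls a condition with exponential backoff. */
final class BackoffPoller {

  /** Returns the next delay: double the current one, capped at maxMillis. */
  static long nextDelay(long currentMillis, long maxMillis) {
    return Math.min(currentMillis * 2, maxMillis);
  }

  /**
   * Checks done until it returns true, sleeping between checks.
   * Returns the number of status checks performed.
   */
  static int pollUntilDone(BooleanSupplier done, long initialMillis, long maxMillis) {
    long delay = initialMillis;
    int polls = 1; // counts the check made by the loop condition below
    while (!done.getAsBoolean()) {
      try {
        Thread.sleep(delay);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and give up
        break;
      }
      delay = nextDelay(delay, maxMillis);
      polls++;
    }
    return polls;
  }

  public static void main(String[] args) {
    // Stand-in for "is the job complete?": becomes true after about 50 ms.
    long deadline = System.currentTimeMillis() + 50;
    int polls = pollUntilDone(() -> System.currentTimeMillis() >= deadline, 1, 1000);
    System.out.println("finished after " + polls + " status checks");
  }
}
```

With an initial delay of 1 ms the waits grow as 1, 2, 4, 8, ... ms, so short local jobs cost only a handful of checks, while long cluster jobs settle at the capped interval.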

                
> Refine synchronization mechanism in CrunchJobControl
> ----------------------------------------------------
>
>                 Key: CRUNCH-172
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-172
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.0
>            Reporter: Chao Shi
>            Assignee: Josh Wills
>         Attachments: crunch-172.patch
>
>
> Currently CrunchJobControl uses a runnerState field to synchronize its background loop with client calls (e.g. stop). This is not sufficient: Jenkins reported a failure after CRUNCH-156 was checked in.
> MRExecutor does the following in its monitorLoop:
> {code}
>       Thread controlThread = new Thread(control);
>       controlThread.start();
>       while (killSignal.getCount() > 0 && !control.allFinished()) {
>         killSignal.await(1, TimeUnit.SECONDS);
>       }
>       control.stop();
> {code}
> And this is how CrunchJobControl works:
> {code}
>   public void stop() {
>     this.runnerState = ThreadState.STOPPING;
>   }
>   public void run() {
>     this.runnerState = ThreadState.RUNNING;
>     while (true) {
>       ...
>     }
>   }
> {code}
> So stop() can be called before run() is called in the other thread. run() then overwrites the STOPPING state with RUNNING, so the stop request is lost. MRExecutor thinks everything has stopped and starts its cleanup work, while CrunchJobControl continues to submit new jobs. Because the cleanup has already been done, the newly submitted jobs fail with FileNotFound.
> I think a solution is to remove the background thread in CrunchJobControl and let MRExecutor call it periodically.
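
The proposed design can be sketched as follows. This is only an illustration under stated assumptions, not the actual patch: PolledJobControl, pollOnce, and MonitorLoopSketch are hypothetical names, and job submission is reduced to moving items off a queue. The point is that the job control has no thread of its own, so once the monitor loop exits nothing can submit another job.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a job control that has no thread of its own. */
final class PolledJobControl {
  private final Queue<String> pending = new ArrayDeque<>();
  private int submitted = 0;

  PolledJobControl(String... jobs) {
    for (String job : jobs) {
      pending.add(job);
    }
  }

  /** One step of work: "submits" the next pending job, if any. Runs only on the caller's thread. */
  void pollOnce() {
    if (pending.poll() != null) {
      submitted++;
    }
  }

  boolean allFinished() {
    return pending.isEmpty();
  }

  int submittedCount() {
    return submitted;
  }
}

final class MonitorLoopSketch {

  /** Drives the control from the monitor thread and returns how many jobs were submitted. */
  static int monitor(PolledJobControl control, CountDownLatch killSignal, long pollMillis) {
    try {
      while (killSignal.getCount() > 0 && !control.allFinished()) {
        control.pollOnce();
        killSignal.await(pollMillis, TimeUnit.MILLISECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    // The loop above is the only caller of pollOnce(), so no job can be
    // submitted after this point and cleanup can safely start here.
    return control.submittedCount();
  }

  public static void main(String[] args) {
    int n = monitor(new PolledJobControl("job-1", "job-2", "job-3"), new CountDownLatch(1), 10);
    System.out.println("submitted " + n + " jobs before cleanup");
  }
}
```

Because submission and the stop decision happen on the same thread, there is no runnerState to race on: a kill signal received before the first iteration simply means pollOnce() is never called.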
