Posted to dev@crunch.apache.org by "Chao Shi (JIRA)" <ji...@apache.org> on 2013/03/03 11:11:12 UTC

[jira] [Created] (CRUNCH-172) Refine synchronization mechanism in CrunchJobControl

Chao Shi created CRUNCH-172:
-------------------------------

             Summary: Refine synchronization mechanism in CrunchJobControl
                 Key: CRUNCH-172
                 URL: https://issues.apache.org/jira/browse/CRUNCH-172
             Project: Crunch
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.6.0
            Reporter: Chao Shi
            Assignee: Josh Wills


Currently CrunchJobControl uses a runnerState field to synchronize its background loop with client calls (e.g. stop()). This is not sufficient: Jenkins reported a failure after CRUNCH-156 was checked in.


MRExecutor does the following in its monitorLoop:
{code}
      Thread controlThread = new Thread(control);
      controlThread.start();
      while (killSignal.getCount() > 0 && !control.allFinished()) {
        killSignal.await(1, TimeUnit.SECONDS);
      }
      control.stop();
{code}

And this is how CrunchJobControl works:
{code}
  public void stop() {
    this.runnerState = ThreadState.STOPPING;
  }

  public void run() {
    this.runnerState = ThreadState.RUNNING;
    while (true) {
      ...
    }
  }
{code}

So it is possible for stop() to be called before run() has started in the other thread. When run() does start, it overwrites the STOPPING state with RUNNING, and the stop request is lost. MRExecutor then believes everything has been stopped and starts its cleanup work, while CrunchJobControl continues to submit new jobs. Because the cleanup has already been done, the newly submitted jobs fail with FileNotFound.
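
The interleaving described above can be replayed deterministically, without threads, by calling stop() and then run() in that order. The following is only a minimal sketch: RacyControl, its READY state, the bounded loop, and the submitted counter are hypothetical stand-ins and are not taken from the real CrunchJobControl.

```java
// Hypothetical, simplified stand-in for CrunchJobControl; not the real class.
class RacyControl {
  enum ThreadState { READY, RUNNING, STOPPING }

  private volatile ThreadState runnerState = ThreadState.READY;
  private int submitted = 0;

  public void stop() {
    this.runnerState = ThreadState.STOPPING;
  }

  // Same shape as run() in the issue, but bounded so that the demo terminates.
  public void run(int maxIterations) {
    this.runnerState = ThreadState.RUNNING; // overwrites an earlier STOPPING
    for (int i = 0; i < maxIterations && runnerState == ThreadState.RUNNING; i++) {
      submitted++; // stands in for submitting a job
    }
  }

  public int getSubmitted() {
    return submitted;
  }

  public ThreadState getState() {
    return runnerState;
  }

  // Replays the bad interleaving sequentially: stop() first, then run().
  static RacyControl replayBadInterleaving() {
    RacyControl control = new RacyControl();
    control.stop();   // the client believes the control is now stopping
    control.run(3);   // the late-starting loop ignores that and keeps submitting
    return control;
  }
}
```

In this sketch the state ends up as RUNNING and all three simulated submissions happen even though stop() was called first, which is the lost-stop behaviour the report describes.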

I think a solution is to remove the background thread in CrunchJobControl and let MRExecutor call it periodically.
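
One possible shape for that proposal is sketched below. It is an illustration under assumptions, not the actual patch: PolledControl, its poll() method, the string job names, and PollingMonitor are hypothetical. Only the killSignal latch and the allFinished() check mirror the monitorLoop quoted above.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical thread-free control: its state is touched only by the caller's thread.
class PolledControl {
  private final Queue<String> pending = new ArrayDeque<>();
  private int submitted = 0;

  PolledControl(String... jobs) {
    for (String job : jobs) {
      pending.add(job);
    }
  }

  // Submits at most one pending job per call. A real version would also
  // check the status of jobs that are already running.
  public void poll() {
    if (pending.poll() != null) {
      submitted++;
    }
  }

  public boolean allFinished() {
    return pending.isEmpty();
  }

  public int getSubmitted() {
    return submitted;
  }
}

class PollingMonitor {
  // Mirrors the monitorLoop from the issue, with polling in place of the control thread.
  static int monitorLoop(PolledControl control, CountDownLatch killSignal, long pollMillis) {
    try {
      while (killSignal.getCount() > 0 && !control.allFinished()) {
        control.poll();
        killSignal.await(pollMillis, TimeUnit.MILLISECONDS);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
    // Once the loop exits, nothing else can submit jobs, so cleanup is safe here.
    return control.getSubmitted();
  }
}
```

Because submission happens only inside the loop, there is no second thread that could start late and overwrite a stop request. The trade-off in this sketch is that jobs are submitted no faster than the polling interval allows.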
