Posted to commits@oodt.apache.org by "Michael Cayanan (Created) (JIRA)" <ji...@apache.org> on 2012/04/11 17:13:18 UTC

[jira] [Created] (OODT-439) Jobs Are Dropped When Queue Size Is Full

Jobs Are Dropped When Queue Size Is Full
----------------------------------------

                 Key: OODT-439
                 URL: https://issues.apache.org/jira/browse/OODT-439
             Project: OODT
          Issue Type: Bug
          Components: resource manager
    Affects Versions: 0.3, 0.4
            Reporter: Michael Cayanan
            Priority: Minor
             Fix For: 0.5


When the queue reaches its maximum size, the Scheduler logs a Job Queue Exception for the failed attempt to add a job to the queue, and the job is then dropped.

The following case will produce the issue:

1) The queue is full.
2) The scheduler pops a job from the queue and starts looking for a node to run it on.
3) The queue now has 1 open slot.
4) Another job is given to the resource manager and is placed in the queue.
5) The queue is full again.
6) The scheduler fails to schedule the popped job.
7) The scheduler tries to push the job back into the queue.
8) Because the queue is full, an exception is thrown and the job is lost.

In this case the job needs to be re-queued instead of being dropped.
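
The sequence above can be replayed with a small, self-contained Java sketch. This is not the OODT resource manager code: BoundedJobQueue, requeue() and replay() are hypothetical names made up for illustration, and the capacity of 2 is arbitrary. The sketch also shows one possible way to keep the job, assuming it is acceptable for a job that was already accepted to bypass the capacity check when it is put back.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RequeueSketch {

    /** Hypothetical bounded FIFO queue; NOT the OODT JobQueue API. */
    static class BoundedJobQueue {
        private final int capacity;
        private final Deque<String> jobs = new ArrayDeque<>();

        BoundedJobQueue(int capacity) {
            this.capacity = capacity;
        }

        /** Normal submission path: rejects new work when the queue is full. */
        void add(String job) {
            if (jobs.size() >= capacity) {
                throw new IllegalStateException("queue full, cannot add " + job);
            }
            jobs.addLast(job);
        }

        /**
         * Assumed fix: a job that was already accepted goes back to the
         * head of the queue without a capacity check.
         */
        void requeue(String job) {
            jobs.addFirst(job);
        }

        String pop() {
            return jobs.pollFirst();
        }

        boolean contains(String job) {
            return jobs.contains(job);
        }
    }

    /** Replays steps 1-8; returns true if the popped job is still queued at the end. */
    static boolean replay(boolean useRequeue) {
        BoundedJobQueue queue = new BoundedJobQueue(2);
        queue.add("job-A");
        queue.add("job-B");              // 1) queue is full
        String popped = queue.pop();     // 2) + 3) job-A popped, one open slot
        queue.add("job-C");              // 4) + 5) new job arrives, queue full again
        // 6) assume no node could be found for the popped job
        try {
            if (useRequeue) {
                queue.requeue(popped);   // assumed fix
            } else {
                queue.add(popped);       // 7) plain push back into the queue
            }
        } catch (IllegalStateException e) {
            // 8) the exception is only reported; the job is gone
            System.out.println("dropped: " + e.getMessage());
        }
        return queue.contains(popped);
    }

    public static void main(String[] args) {
        System.out.println("plain add keeps job: " + replay(false));
        System.out.println("requeue keeps job:   " + replay(true));
    }
}
```

With the plain add() the popped job is no longer in the queue at the end; with requeue() it is back at the head. Letting the queue temporarily exceed its limit is only one option; blocking or retrying the push until a slot opens would be another.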

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Created] (OODT-439) Jobs Are Dropped When Queue Size Is Full

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks Mike!



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


[jira] [Commented] (OODT-439) Jobs Are Dropped When Queue Size Is Full

Posted by "Michael Cayanan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OODT-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251667#comment-13251667 ] 

Michael Cayanan commented on OODT-439:
--------------------------------------

The current workaround is to set the resource manager's queue size to at least the maximum number of jobs that the Workflow/Wengine can send at one time.
                
