Posted to user@ofbiz.apache.org by Mike Baschky <mb...@go-integral.com> on 2008/01/18 18:09:51 UTC

Stalled jobs

Hey All,
    I've got an odd issue I'm trying to figure out. My service engine does
not appear to be running jobs anymore. It appears to be just sitting
there doing nothing (even for the standard OFBiz jobs like
purgeOldJobs). When I look into the JobSandbox table I see several jobs
in running status but nothing is happening. These jobs are several
days old so I'm guessing they are not really running. I've shut the
system down and restarted but still nothing seems to happen. I'm not
really seeing any error messages that help me out here. Thinking that
maybe I've hit some sort of job limit, I blanked out the statusId on
several jobs and then cancelled them in WebTools - again no luck on
kicking off the remaining jobs. One other item I noted in the thread
list is there are 5 sleeping default-invoker-thread-xxx threads at the
top of the page but I only see of these threads in the Java threads
listed below (not sure if this means anything).
 
    I'm not sure where to look next. Can anyone point me in the right
direction on how to track this issue down? Thanks.
 
-Mike

Re: Stalled jobs

Posted by BJ Freeman <bj...@free-man.net>.
In September of last year Andy did some work on the service engine.
http://svn.apache.org/viewvc?view=rev&revision=575074
http://svn.apache.org/viewvc?view=rev&revision=577097
Not sure if this is related.




RE: Stalled jobs

Posted by Mike Baschky <mb...@go-integral.com>.
The version is approximately version 4 (or pre 4). I don't remember the
actual svn number because we pulled it into our own svn system. It was
pulled around the middle of March of last year (2007). 



Re: Stalled jobs

Posted by BJ Freeman <bj...@free-man.net>.
Please state your version of OFBiz and the SVN revision number.



RE: Stalled jobs

Posted by Mike Baschky <mb...@go-integral.com>.
Thanks David. 



Re: Stalled jobs

Posted by David E Jones <jo...@hotwaxmedia.com>.
Right now the service engine will look at running jobs on startup and
do some resetting to put them back into the persisted job queue.

For long-term scheduled jobs it is important to watch them around
these events, and restart them manually if needed (which can be done
from WebTools unless you have an older version). When manually killing
them it's good to look at the thread pool (in WebTools too) of all
servers running against the database to make sure the job isn't running.
For some services it won't do any harm to have multiple copies
running, but it will affect performance and server load.

-David
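
As a rough illustration of the reset David describes, here is a minimal sketch that finds jobs stuck in a running status and puts them back to pending. It is not OFBiz's own code. It runs against an in-memory SQLite database, and the table name JOB_SANDBOX, the column names, the status values SERVICE_RUNNING and SERVICE_PENDING, and the job names exampleSync and exampleReport are all assumptions made for the example, so check them against your own schema first.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumed names: JOB_SANDBOX table with JOB_ID, JOB_NAME, STATUS_ID and
# START_DATE_TIME columns, and SERVICE_RUNNING / SERVICE_PENDING status
# values. Verify them against your own database before adapting this.

def find_stale_running(conn, now, max_age):
    """Return (job_id, job_name) for jobs marked running longer than max_age."""
    cutoff = (now - max_age).isoformat(sep=" ")
    rows = conn.execute(
        "SELECT JOB_ID, JOB_NAME FROM JOB_SANDBOX "
        "WHERE STATUS_ID = 'SERVICE_RUNNING' AND START_DATE_TIME < ? "
        "ORDER BY JOB_ID",
        (cutoff,),
    )
    return rows.fetchall()

def requeue(conn, job_ids):
    """Put the given jobs back into the pending state so they can be picked up."""
    conn.executemany(
        "UPDATE JOB_SANDBOX SET STATUS_ID = 'SERVICE_PENDING', "
        "START_DATE_TIME = NULL WHERE JOB_ID = ?",
        [(job_id,) for job_id in job_ids],
    )
    conn.commit()

# Demo on an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE JOB_SANDBOX (JOB_ID TEXT PRIMARY KEY, JOB_NAME TEXT, "
    "STATUS_ID TEXT, START_DATE_TIME TEXT)"
)
now = datetime(2008, 1, 18, 12, 0, 0)
conn.executemany(
    "INSERT INTO JOB_SANDBOX VALUES (?, ?, ?, ?)",
    [
        ("10000", "purgeOldJobs", "SERVICE_RUNNING", "2008-01-14 03:00:00"),
        ("10001", "exampleSync", "SERVICE_RUNNING", "2008-01-18 11:55:00"),
        ("10002", "exampleReport", "SERVICE_PENDING", None),
    ],
)
stale = find_stale_running(conn, now, timedelta(days=1))
requeue(conn, [job_id for job_id, _ in stale])
print(stale)  # [('10000', 'purgeOldJobs')]
```

Only the job that has been marked running for more than a day is touched; the one started five minutes earlier is left alone. Before requeueing anything on a real system, the thread pool check described above still applies, since a job that is genuinely running on another server would otherwise be started twice.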




Re: Stalled jobs

Posted by "Vince M. Clark" <vc...@globalera.com>.
Yes, I had to delete the record showing "running" from the sandbox and reschedule. It has been a while since this happened so my memory is a bit vague. I may have also deleted some records from the entity sync tables that tracked when the last sync occurred. But that would be specific to entity sync jobs only. 
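
The delete-and-reschedule step Vince mentions could look roughly like the sketch below, done in one transaction so a failure cannot leave the job both deleted and unscheduled. This is an illustration on an in-memory SQLite database, not a procedure taken from OFBiz. The table and column names, the status values, the job and service names, and the caller-supplied new job ID are all assumptions for the example.

```python
import sqlite3

# Assumed names: JOB_SANDBOX table, JOB_ID / JOB_NAME / SERVICE_NAME /
# STATUS_ID / RUN_TIME columns, SERVICE_RUNNING and SERVICE_PENDING statuses.
# new_job_id is supplied by the caller here; a real system would generate it.

def delete_and_reschedule(conn, stuck_job_id, new_job_id, run_time):
    """Replace a stuck 'running' record with a fresh pending one, atomically."""
    with conn:  # commits on success, rolls back if anything raises
        row = conn.execute(
            "SELECT JOB_NAME, SERVICE_NAME FROM JOB_SANDBOX "
            "WHERE JOB_ID = ? AND STATUS_ID = 'SERVICE_RUNNING'",
            (stuck_job_id,),
        ).fetchone()
        if row is None:
            return False  # nothing to do: job missing or not marked running
        conn.execute("DELETE FROM JOB_SANDBOX WHERE JOB_ID = ?", (stuck_job_id,))
        conn.execute(
            "INSERT INTO JOB_SANDBOX "
            "(JOB_ID, JOB_NAME, SERVICE_NAME, STATUS_ID, RUN_TIME) "
            "VALUES (?, ?, ?, 'SERVICE_PENDING', ?)",
            (new_job_id, row[0], row[1], run_time),
        )
    return True

# Demo on an in-memory database with one stuck job (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE JOB_SANDBOX (JOB_ID TEXT PRIMARY KEY, JOB_NAME TEXT, "
    "SERVICE_NAME TEXT, STATUS_ID TEXT, RUN_TIME TEXT)"
)
conn.execute(
    "INSERT INTO JOB_SANDBOX VALUES (?, ?, ?, ?, ?)",
    ("20000", "Example POS sync", "exampleEntitySync",
     "SERVICE_RUNNING", "2008-01-14 03:00:00"),
)
conn.commit()
replaced = delete_and_reschedule(conn, "20000", "20001", "2008-01-19 03:00:00")
rows = conn.execute("SELECT JOB_ID, STATUS_ID FROM JOB_SANDBOX").fetchall()
print(replaced, rows)  # True [('20001', 'SERVICE_PENDING')]
```

The status check in the SELECT means the function does nothing if the job has already been cleared by someone else, which makes it safe to run twice.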


RE: Stalled jobs

Posted by Mike Baschky <mb...@go-integral.com>.
Thanks Vince. My guess is my problem is in line with your second point -
server stopped while the jobs are running. In this case did you delete
the jobs from the JobSandbox entity and manually re-schedule? 


Re: Stalled jobs

Posted by "Vince M. Clark" <vc...@globalera.com>.
Mike 

This won't give you an answer, but maybe some helpful insight. I have not had a problem with standard jobs but have had a problem with jobs we added to do synchronization with POS terminals. I've seen two situations where the system seems to leave a job in a "running" status and not clear it out and reschedule on a restart. 
1) Server encounters a heap space (out of memory) error. - Solved by increasing max memory in startup.sh 
2) Server (or POS client in our case) is stopped while a job is running. - Haven't implemented a solution but we are planning on a "graceful shutdown" on our POS terminals to make sure all entity sync jobs are finished before shutting down. 
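
The "graceful shutdown" idea in point 2 can be sketched in general terms: stop accepting new jobs, then wait for the ones already started to finish before exiting. The example below is a toy in plain Python, not how OFBiz or the POS client implements shutdown. The JobRunner class and the job names are hypothetical, and the sleep stands in for real work such as an entity sync.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class JobRunner:
    """Toy job pool that refuses new work once shutdown starts."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._accepting = True
        self.finished = []  # names of jobs that ran to completion

    def submit(self, name, seconds):
        with self._lock:
            if not self._accepting:
                return False  # shutting down: leave the job for the next start
            self._pool.submit(self._run, name, seconds)
            return True

    def _run(self, name, seconds):
        time.sleep(seconds)  # stands in for the real work, e.g. a sync
        with self._lock:
            self.finished.append(name)

    def shutdown(self):
        with self._lock:
            self._accepting = False
        # Blocks until every job already submitted has finished.
        self._pool.shutdown(wait=True)

runner = JobRunner()
runner.submit("syncA", 0.05)
runner.submit("syncB", 0.05)
runner.shutdown()
accepted_late = runner.submit("syncC", 0.05)
print(sorted(runner.finished), accepted_late)  # ['syncA', 'syncB'] False
```

Because shutdown waits for in-flight jobs, no job is left half done with a "running" status behind it; the late job is simply refused and can be scheduled again after the restart.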

