You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ofbiz.apache.org by Scott Gray <sc...@hotwaxmedia.com> on 2009/08/29 14:37:15 UTC

JobManager.reloadCrashedJobs() fails to complete

As the JobSandbox table grows, the time taken to query the database  
for crashed jobs also grows and eventually causes timeouts meaning any  
crashed jobs are no longer loaded.

The query looks like this:
SELECT * FROM JOB_SANDBOX WHERE (FINISH_DATE_TIME IS NULL AND  
CANCEL_DATE_TIME IS NULL AND RUN_BY_INSTANCE_ID = ?) ORDER BY  
START_DATE_TIME ASC
which doesn't give the database much to work with and results in most  
of the table's contents being scanned.

If I add statusId into the mix the query returns almost instantly.   
Does anyone see any potential problems if I remove the finishDateTime  
and cancelDateTime fields from the query and instead check for any of  
the following statuses:
SERVICE_PENDING
SERVICE_QUEUED
SERVICE_RUNNING

Thanks
Scott

HotWax Media
http://www.hotwaxmedia.com

Re: JobManager.reloadCrashedJobs() fails to complete

Posted by Scott Gray <sc...@hotwaxmedia.com>.
Thanks David, I'll go ahead and make the change.  My main doubt was  
that aside from UI queries the statusId fields doesn't seem to be used  
for the actual processing of any jobs, reloadCrashedJobs, poll and  
purgeOldJobs all use the date fields only.  The poller is fine because  
it's querying on runByInstanceId which is indexed but purgeOldJobs  
looks like it could potentially suffer from the same problem.

Do you think it would be okay to switch everything over to querying on  
statusId?

Thanks
Scott

On 30/08/2009, at 10:10 AM, David E Jones wrote:

>
> That would make sense to do. I guess when the statudId field was  
> added the query was never adjusted. On a side note, my guess is that  
> the performance difference is due to indexed versus non-indexed  
> fields, and that statudId is indexed automatically because the  
> reverse-foreign-key indexing that the EE does.
>
> -David
>
>
> On Aug 29, 2009, at 6:37 AM, Scott Gray wrote:
>
>> As the JobSandbox table grows, the time taken to query the database  
>> for crashed jobs also grows and eventually causes timeouts meaning  
>> any crashed jobs are no longer loaded.
>>
>> The query looks like this:
>> SELECT * FROM JOB_SANDBOX WHERE (FINISH_DATE_TIME IS NULL AND  
>> CANCEL_DATE_TIME IS NULL AND RUN_BY_INSTANCE_ID = ?) ORDER BY  
>> START_DATE_TIME ASC
>> which doesn't give the database much to work with and results in  
>> most of the table's contents being scanned.
>>
>> If I add statusId into the mix the query returns almost instantly.   
>> Does anyone see any potential problems if I remove the  
>> finishDateTime and cancelDateTime fields from the query and instead  
>> check for any of the following statuses:
>> SERVICE_PENDING
>> SERVICE_QUEUED
>> SERVICE_RUNNING
>>
>> Thanks
>> Scott
>>
>> HotWax Media
>> http://www.hotwaxmedia.com
>


Re: JobManager.reloadCrashedJobs() fails to complete

Posted by David E Jones <de...@me.com>.
That would make sense to do. I guess when the statudId field was added  
the query was never adjusted. On a side note, my guess is that the  
performance difference is due to indexed versus non-indexed fields,  
and that statudId is indexed automatically because the reverse-foreign- 
key indexing that the EE does.

-David


On Aug 29, 2009, at 6:37 AM, Scott Gray wrote:

> As the JobSandbox table grows, the time taken to query the database  
> for crashed jobs also grows and eventually causes timeouts meaning  
> any crashed jobs are no longer loaded.
>
> The query looks like this:
> SELECT * FROM JOB_SANDBOX WHERE (FINISH_DATE_TIME IS NULL AND  
> CANCEL_DATE_TIME IS NULL AND RUN_BY_INSTANCE_ID = ?) ORDER BY  
> START_DATE_TIME ASC
> which doesn't give the database much to work with and results in  
> most of the table's contents being scanned.
>
> If I add statusId into the mix the query returns almost instantly.   
> Does anyone see any potential problems if I remove the  
> finishDateTime and cancelDateTime fields from the query and instead  
> check for any of the following statuses:
> SERVICE_PENDING
> SERVICE_QUEUED
> SERVICE_RUNNING
>
> Thanks
> Scott
>
> HotWax Media
> http://www.hotwaxmedia.com