Posted to dev@ofbiz.apache.org by Brett Palmer <br...@gmail.com> on 2012/08/22 20:04:15 UTC

Questions on new service engine and job poller changes

*Adrian,

I’ve updated to the latest ofbiz code (revision 1374598) and am trying to
set up our code to use the new changes to the service engine and job poller.


Here are a few questions:


1. Instantiating a new dispatcher to run a service.

We used to instantiate a LocalDispatcher to run a service with the following
code:

   LocalDispatcher olapDispatcher =
GenericDispatcher.getLocalDispatcher("some dispatcher Name", olapDelegator);


Now it looks like we have a Factory object that creates the dispatcher if
one is not already created with that name.  The method is
createLocalDispatcher, but it’s not a static method, so a
GenericDispatcherFactory needs to be instantiated first.

  LocalDispatcher olapDispatcher =
GenericDispatcherFactory.createLocalDispatcher(dbConfig, olapDelegator);


How should I be instantiating the GenericDispatcherFactory, or is there a
preferred way to run a service from code?


2. Is the “wait-millis” attribute still required?  The service-config.xsd
still lists it as a required attribute for thread-pool, but I don’t see it
referenced anywhere in the code.  If it is needed, how does it work?


3. If I understand the service configuration file, it looks like I can
configure the service engine to work against multiple pools (see example
config below).  If I wanted to run some services in specific pools, can I
use the LocalDispatcher.schedule() method with an immediate run time but
specify the pool I want them to use?

We need this functionality for our data warehouse processing.  We try to
provide real time reports but our database cannot handle a high number of
data warehouse updates during heavy loads.   By configuring only one server
to service a particular pool we can limit the number of concurrent
processes running those services.


       <thread-pool send-to-pool="pool"
                    purge-job-days="4"
                    failed-retry-min="3"
                    ttl="120000"
                    jobs="100"
                    min-threads="2"
                    max-threads="5"
                    wait-millis="1000"
                    poll-enabled="true"
                    poll-db-millis="30000">
           <run-from-pool name="pool"/>
           <run-from-pool name="dwPool"/>
       </thread-pool>

Thanks in advance for your help.  I’ll continue to test the new
configuration as soon as I can get these answers.


Brett*

Re: Questions on new service engine and job poller changes

Posted by Adrian Crum <ad...@sandglass-software.com>.
On 8/23/2012 4:42 PM, Brett Palmer wrote:
> *Adrian,
>
> Thanks for the information.  Please see my questions inline:*
>
> On Thu, Aug 23, 2012 at 6:24 AM, Adrian Crum <
> adrian.crum@sandglass-software.com> wrote:
>
>> On 8/23/2012 8:46 AM, Adrian Crum wrote:
>>
>>> On 8/22/2012 7:04 PM, Brett Palmer wrote:
>>>> We need this functionality for our data warehouse processing.  We try to
>>>> provide real time reports but our database cannot handle a high number of
>>>> data warehouse updates during heavy loads.   By configuring only one
>>>> server
>>>> to service a particular pool we can limit the number of concurrent
>>>> processes running those services.
>>>>
>>>>
>>>>          <thread-pool send-to-pool="pool"
>>>>                       purge-job-days="4"
>>>>                       failed-retry-min="3"
>>>>                       ttl="120000"
>>>>                       jobs="100"
>>>>                       min-threads="2"
>>>>                       max-threads="5"
>>>>                       wait-millis="1000"
>>>>                       poll-enabled="true"
>>>>                       poll-db-millis="30000">
>>>>              <run-from-pool name="pool"/>
>>>>              <run-from-pool name="dwPool"/>
>>>>          </thread-pool>
>>>>
>>>
>>> That configuration will work. That server will service the two pools.
>>>
>> I forgot to mention, if you're running lots of jobs, then you will want to
>> increase the jobs (queue size) value. You mentioned in another thread that
>> your application will run up to 10,000 jobs - in that case you should
>> increase the jobs value to 1000 or more. The queue size affects memory, so
>> there is an interaction between responsiveness and memory use.
>>
>>
> *Thanks for the information that is very helpful.*
>
>
>
>> The potential problem with the Job Poller (before and after the overhaul)
>> is with asynchronous service calls (not scheduled jobs). When you run an
>> async service, the service engine converts the service call to a job and
>> places it in the queue. It is not persisted like scheduled jobs. If the Job
>> Poller has just filled the queue with scheduled jobs, then there is no room
>> for async services, and any attempt to queue an async service will fail
>> (throws an exception "Unable to queue job").
>>
>>
> *I assume the “queue” is a memory queue and not the same as the JobSandBox
> pool that is stored in the database which is why there is a limit to the
> queue.  Let me know if that assumption is not correct.

That is correct. The queue size limit was put there to prevent the Job 
Scheduler from saturating or crashing the server.

During a polling interval, the Job Manager will fill the queue with jobs 
scheduled to run. Any jobs that don't fit in the queue will be queued 
during the next polling interval. Queue service threads will run the 
queued jobs. Creating too many queue service threads will slow down 
queue throughput because of Thread maintenance overhead. So, there are 
some parameters for users to tweak and they interact with each other, 
but the overall objective is to configure the Job Scheduler so that it 
has good throughput but doesn't run out of control and swamp the server.
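
If it helps to picture it, the mechanism is just a bounded queue feeding a
small pool of worker threads. Here is a purely illustrative sketch of that
pattern in plain java.util.concurrent - it is not the actual JobPoller code,
and the class name is made up:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.RejectedExecutionException;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BoundedJobQueueSketch {
        public static void main(String[] args) {
            // Bounded queue - plays the role of the "jobs" attribute (queue capacity).
            ArrayBlockingQueue<Runnable> queue = new ArrayBlockingQueue<Runnable>(100);

            // Worker threads - play the role of min-threads / max-threads / ttl.
            ThreadPoolExecutor executor = new ThreadPoolExecutor(
                    2, 5, 120000, TimeUnit.MILLISECONDS, queue,
                    new ThreadPoolExecutor.AbortPolicy()); // refuse work when saturated

            try {
                executor.execute(new Runnable() {
                    public void run() {
                        System.out.println("job ran");
                    }
                });
            } catch (RejectedExecutionException e) {
                // Comparable to the "Unable to queue job" case discussed in this
                // thread: the submission is refused rather than queued.
                System.err.println("Queue full - job rejected: " + e);
            } finally {
                executor.shutdown();
            }
        }
    }

Adding more worker threads than the machine can usefully run only adds
context-switching and thread maintenance overhead, which is why max-threads
should stay modest.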

>
> If you run an async service and set the “persist” option to true will you
> still hit the Job Poller limit or will the job be persisted and run when
> the Job Poller has sufficient resources?*

The async service will be persisted as a job scheduled to run now. The 
job will be in the pool specified in the <thread-pool> send-to-pool 
attribute.
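
In application code it would look something like this - an untested sketch,
where "updateDataWarehouse" and its parameter are placeholders; check the
runAsync overloads in the LocalDispatcher interface of your revision:

    import java.util.HashMap;
    import java.util.Map;

    import org.ofbiz.service.LocalDispatcher;

    // Given an existing LocalDispatcher "dispatcher":
    Map<String, Object> context = new HashMap<String, Object>();
    context.put("testId", testId); // testId: placeholder value from the application

    // persist = true: the call is stored as a JobSandbox record scheduled to run
    // now, and it ends up in the pool named by the <thread-pool> send-to-pool
    // attribute, as described above.
    dispatcher.runAsync("updateDataWarehouse", context, true);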

>
>
>> I designed the new code so the service engine can check for that
>> possibility, but I didn't change the service engine behavior. Instead,
>> users should configure their <thread-pool> element(s) and applications
>> carefully. For example, if your application schedules lots of jobs, then
>> design it in a way that it schedules no more than (queue size - n) jobs at
>> a time - to leave room for async services. Another option would be to have
>> a server dedicated to servicing scheduled jobs - that way the potential
>> clash with async services is not an issue.
>>
>>
> *I wasn’t aware that the same queue was shared between async jobs and
> scheduled jobs - thanks again for the update.
>
> We like the idea of dedicating an app server to service specific scheduled
> jobs as it controls the number of concurrent processes we run in
> production.
>
> I’m still curious why the service engine dispatcher does not have an API to
> run an async service to a specified “pool”.  This seems like a simple
> addition since there is an API to schedule a job to run in a specific pool.
>   I understand that there is potential this could fail if the queue is full
> (unless my question above about the persisted job is a possible
> workaround).


If persist is true, then the async service will be assigned to the pool
specified in the <thread-pool> send-to-pool attribute. If persist is
false, then specifying a job pool would have no effect.

We could create an "async-service-only" queue that would be unaffected 
by persisted jobs, but it can still be overrun. That's why I changed the 
code to allow the service engine to check for that possibility. I don't 
know what OFBiz should do by default in those scenarios, so I thought it 
best to leave the async service behavior the same (an exception is 
thrown). In other words, we could create the extra queue to give users a 
warm fuzzy feeling, but the same basic problem will still exist. I 
believe it is best to make it clear that, because of their nature, 
non-persisted async services are not guaranteed to run.
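
So a caller that uses persist = false should be written to expect that
outcome, along these lines (again only a sketch - the service name is a
placeholder, runAsync declares the checked GenericServiceException, and
"module" is the usual per-class module name constant):

    try {
        // persist = false: the job is held in memory only, so this fails
        // if the job queue is already full.
        dispatcher.runAsync("someReportingService", context, false);
    } catch (GenericServiceException e) {
        // e.g. the "Unable to queue job" case - decide whether to retry later,
        // fall back to a persisted call, or log and move on.
        Debug.logWarning(e, "Could not queue async service", module);
    }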


>
>  From your provided information here is how we would likely use the new
> changes with the service engine and job poller:
>
> Background: Our application is an online testing application with multiple
> ofbiz servers and a single ofbiz data warehouse.  Tests are taken on the
> dedicated app servers and when the test is done a data warehouse process
> picks up the tests and processes them for the data warehouse reports.  The
> reports are near real time but during heavy testing periods we want to
> limit how many concurrent warehouse processes are running.  Here are the
> steps in the process:
>
> 1. Configure a limited number of ofbiz servers to process scheduled data
> warehouse jobs that are submitted to a specific job pool (i.e. dwPool).
>
> 2. When a person has completed a test the application creates a scheduled
> job with a current timestamp for when the service should be run.  The
> scheduled job would be assigned to the “dwPool”.  The servers configured in
> item 1 above would then process these jobs.

That sounds like a good strategy. An improvement would be to have the 
job servers service all pools. In that configuration the online testing 
application servers would have the <thread-pool> poll-enabled attribute 
set to "false" - so they will not run any jobs themselves. The only 
bottleneck would be the data source - and that bottleneck can be fixed 
by putting the JobSandbox entity on a separate data source and using a
"jobs only" delegator.

>
> The above steps allow us to scale our solution horizontally by adding more
> ofbiz servers to handle online testing as needed.  We are still able to
> handle near real time reporting as we have dedicated servers assigned to
> process data warehouse requests.  During light testing days the warehouse
> scheduled jobs process almost immediately and during heavy testing days
> they lag slightly depending on the service request rate.
>
> Question:
>
> If a scheduled job is set with a current timestamp for the “startTime”, but
> the JobPoller is behind because of a large number of scheduled service
> requests, will the JobPoller still pick up the scheduled job according to
> the order of startTime?
>
> Here is a specific example:
>
> Current time:  Aug. 23, 10:00AM
>
> - A scheduled job is created with a start time of Aug. 23, 10:00AM
> - JobPoller finishes processing current queue of jobs at timestamp:  Aug.
> 23, 10:05AM
> - JobPoller queries data for the next list of jobs to process.
>
> Question: Will it pick up the jobs scheduled for Aug. 23, 10:00AM even
> though the current time is past that time?

Yes, the Job Manager will retrieve all jobs scheduled to start prior to now.

-Adrian



Re: Questions on new service engine and job poller changes

Posted by Brett Palmer <br...@gmail.com>.
*Adrian,

Thanks for the information.  Please see my questions inline:*

On Thu, Aug 23, 2012 at 6:24 AM, Adrian Crum <
adrian.crum@sandglass-software.com> wrote:

> On 8/23/2012 8:46 AM, Adrian Crum wrote:
>
>>
>> On 8/22/2012 7:04 PM, Brett Palmer wrote:
>>> We need this functionality for our data warehouse processing.  We try to
>>> provide real time reports but our database cannot handle a high number of
>>> data warehouse updates during heavy loads.   By configuring only one
>>> server
>>> to service a particular pool we can limit the number of concurrent
>>> processes running those services.
>>>
>>>
>>>         <thread-pool send-to-pool="pool"
>>>                      purge-job-days="4"
>>>                      failed-retry-min="3"
>>>                      ttl="120000"
>>>                      jobs="100"
>>>                      min-threads="2"
>>>                      max-threads="5"
>>>                      wait-millis="1000"
>>>                      poll-enabled="true"
>>>                      poll-db-millis="30000">
>>>             <run-from-pool name="pool"/>
>>>             <run-from-pool name="dwPool"/>
>>>         </thread-pool>
>>>
>>
>>
>> That configuration will work. That server will service the two pools.
>>
>
> I forgot to mention, if you're running lots of jobs, then you will want to
> increase the jobs (queue size) value. You mentioned in another thread that
> your application will run up to 10,000 jobs - in that case you should
> increase the jobs value to 1000 or more. The queue size affects memory, so
> there is an interaction between responsiveness and memory use.
>
>
*Thanks for the information that is very helpful.*



> The potential problem with the Job Poller (before and after the overhaul)
> is with asynchronous service calls (not scheduled jobs). When you run an
> async service, the service engine converts the service call to a job and
> places it in the queue. It is not persisted like scheduled jobs. If the Job
> Poller has just filled the queue with scheduled jobs, then there is no room
> for async services, and any attempt to queue an async service will fail
> (throws an exception "Unable to queue job").
>
>
*I assume the “queue” is a memory queue and not the same as the JobSandBox
pool that is stored in the database which is why there is a limit to the
queue.  Let me know if that assumption is not correct.

If you run an async service and set the “persist” option to true will you
still hit the Job Poller limit or will the job be persisted and run when
the Job Poller has sufficient resources?*


> I designed the new code so the service engine can check for that
> possibility, but I didn't change the service engine behavior. Instead,
> users should configure their <thread-pool> element(s) and applications
> carefully. For example, if your application schedules lots of jobs, then
> design it in a way that it schedules no more than (queue size - n) jobs at
> a time - to leave room for async services. Another option would be to have
> a server dedicated to servicing scheduled jobs - that way the potential
> clash with async services is not an issue.
>
>
*I wasn’t aware that the same queue was shared between async jobs and
scheduled jobs - thanks again for the update.

We like the idea of dedicating an app server to service specific scheduled
jobs as it controls the number of concurrent processes we run in
production.

I’m still curious why the service engine dispatcher does not have an API to
run an async service to a specified “pool”.  This seems like a simple
addition since there is an API to schedule a job to run in a specific pool.
 I understand that there is potential this could fail if the queue is full
(unless my question above about the persisted job is a possible
workaround).

From your provided information here is how we would likely use the new
changes with the service engine and job poller:

Background: Our application is an online testing application with multiple
ofbiz servers and a single ofbiz data warehouse.  Tests are taken on the
dedicated app servers and when the test is done a data warehouse process
picks up the tests and processes them for the data warehouse reports.  The
reports are near real time but during heavy testing periods we want to
limit how many concurrent warehouse processes are running.  Here are the
steps in the process:

1. Configure a limited number of ofbiz servers to process scheduled data
warehouse jobs that are submitted to a specific job pool (i.e. dwPool).

2. When a person has completed a test the application creates a scheduled
job with a current timestamp for when the service should be run.  The
scheduled job would be assigned to the “dwPool”.  The servers configured in
item 1 above would then process these jobs.

The above steps allow us to scale our solution horizontally by adding more
ofbiz servers to handle online testing as needed.  We are still able to
handle near real time reporting as we have dedicated servers assigned to
process data warehouse requests.  During light testing days the warehouse
scheduled jobs process almost immediately and during heavy testing days
they lag slightly depending on the service request rate.

Question:

If a scheduled job is set with a current timestamp for the “startTime”, but
the JobPoller is behind because of a large number of scheduled service
requests, will the JobPoller still pick up the scheduled job according to
the order of startTime?

Here is a specific example:

Current time:  Aug. 23, 10:00AM

- A scheduled job is created with a start time of Aug. 23, 10:00AM
- JobPoller finishes processing current queue of jobs at timestamp:  Aug.
23, 10:05AM
- JobPoller queries data for the next list of jobs to process.

Question: Will it pick up the jobs scheduled for Aug. 23, 10:00AM even
though the current time is past that time?

Thanks in advance for your response.


Brett*

Re: Questions on new service engine and job poller changes

Posted by Adrian Crum <ad...@sandglass-software.com>.
On 8/23/2012 8:46 AM, Adrian Crum wrote:
>
> On 8/22/2012 7:04 PM, Brett Palmer wrote:
>> We need this functionality for our data warehouse processing.  We try to
>> provide real time reports but our database cannot handle a high number of
>> data warehouse updates during heavy loads.   By configuring only one 
>> server
>> to service a particular pool we can limit the number of concurrent
>> processes running those services.
>>
>>
>>         <thread-pool send-to-pool="pool"
>>                      purge-job-days="4"
>>                      failed-retry-min="3"
>>                      ttl="120000"
>>                      jobs="100"
>>                      min-threads="2"
>>                      max-threads="5"
>>                      wait-millis="1000"
>>                      poll-enabled="true"
>>                      poll-db-millis="30000">
>>             <run-from-pool name="pool"/>
>>             <run-from-pool name="dwPool"/>
>>         </thread-pool>
>
>
> That configuration will work. That server will service the two pools.

I forgot to mention, if you're running lots of jobs, then you will want 
to increase the jobs (queue size) value. You mentioned in another thread 
that your application will run up to 10,000 jobs - in that case you 
should increase the jobs value to 1000 or more. The queue size affects 
memory, so there is an interaction between responsiveness and memory use.

The potential problem with the Job Poller (before and after the 
overhaul) is with asynchronous service calls (not scheduled jobs). When 
you run an async service, the service engine converts the service call 
to a job and places it in the queue. It is not persisted like scheduled 
jobs. If the Job Poller has just filled the queue with scheduled jobs, 
then there is no room for async services, and any attempt to queue an 
async service will fail (throws an exception "Unable to queue job").

I designed the new code so the service engine can check for that 
possibility, but I didn't change the service engine behavior. Instead, 
users should configure their <thread-pool> element(s) and applications 
carefully. For example, if your application schedules lots of jobs, then 
design it in a way that it schedules no more than (queue size - n) jobs 
at a time - to leave room for async services. Another option would be to 
have a server dedicated to servicing scheduled jobs - that way the 
potential clash with async services is not an issue.

-Adrian


Re: Questions on new service engine and job poller changes

Posted by Adrian Crum <ad...@sandglass-software.com>.
On 8/22/2012 7:04 PM, Brett Palmer wrote:
> *Adrian,
>
> I’ve updated to the latest ofbiz code (revision 1374598) and am trying to
> set up our code to use the new changes to the service engine and job poller.
>
>
> Here are a few questions:
>
>
> 1. Instantiating a new dispatcher to run a service.
>
> We used to instantiate a LocalDispatcher to run a service with the following
> code:
>
>     LocalDispatcher olapDispatcher =
> GenericDispatcher.getLocalDispatcher("some dispatcher Name", olapDelegator);
>
>
> Now it looks like we have a Factory object that creates the dispatcher if
> one is not already created with that name.  The method is
> createLocalDispatcher, but it’s not a static method, so a
> GenericDispatcherFactory needs to be instantiated first.
>
>    LocalDispatcher olapDispatcher =
> GenericDispatcherFactory.createLocalDispatcher(dbConfig, olapDelegator);
>
>
> How should I be instantiating the GenericDispatcherFactory, or is there a
> preferred way to run a service from code?


ServiceContainer.getLocalDispatcher(...)
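
Something along these lines - an untested sketch where the delegator and
dispatcher names are placeholders; check the ServiceContainer class in your
copy for the exact signature:

    import org.ofbiz.entity.Delegator;
    import org.ofbiz.entity.DelegatorFactory;
    import org.ofbiz.service.LocalDispatcher;
    import org.ofbiz.service.ServiceContainer;

    Delegator olapDelegator = DelegatorFactory.getDelegator("olap"); // placeholder name
    LocalDispatcher olapDispatcher =
            ServiceContainer.getLocalDispatcher("olap-dispatcher", olapDelegator);

    // Then call services as before, e.g.:
    // Map<String, Object> result = olapDispatcher.runSync("someService", context);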



>
>
> 2. Is the “wait-millis” attribute still required?  The service-config.xsd
> still lists it as a required attribute for thread-pool, but I don’t see it
> referenced anywhere in the code.  If it is needed, how does it work?


It is not used. The schema has been updated to reflect that - make sure 
you are looking at the service-config.xsd file in your local copy.


>
> 3. If I understand the service configuration file, it looks like I can
> configure the service engine to work against multiple pools (see example
> config below).  If I wanted to run some services in specific pools, can I
> use the LocalDispatcher.schedule() method with an immediate run time but
> specify the pool I want them to use?


Correct. Just remember the multiple pools share a delegator, so they are 
all in the same data source.
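
For example - an untested sketch where the pool and service names are
placeholders, "context" is the usual Map of service parameters, and the exact
overload (parameter order for poolName, frequency, interval, count, endTime,
maxRetry) should be checked against the LocalDispatcher interface in your
revision:

    import org.ofbiz.service.calendar.RecurrenceRule;

    long startTime = System.currentTimeMillis(); // run as soon as a poller picks it up

    // Assumed overload: schedule(poolName, serviceName, context, startTime,
    //                            frequency, interval, count, endTime, maxRetry)
    // count = 1 for a single run; endTime 0 = no end time; maxRetry -1 = engine default.
    dispatcher.schedule("dwPool", "processCompletedTest", context,
            startTime, RecurrenceRule.DAILY, 1, 1, 0, -1);

The job is persisted in JobSandbox with its pool set to "dwPool", so only
servers whose <thread-pool> lists <run-from-pool name="dwPool"/> will pick it
up.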


>
> We need this functionality for our data warehouse processing.  We try to
> provide real time reports but our database cannot handle a high number of
> data warehouse updates during heavy loads.   By configuring only one server
> to service a particular pool we can limit the number of concurrent
> processes running those services.
>
>
>         <thread-pool send-to-pool="pool"
>                      purge-job-days="4"
>                      failed-retry-min="3"
>                      ttl="120000"
>                      jobs="100"
>                      min-threads="2"
>                      max-threads="5"
>                      wait-millis="1000"
>                      poll-enabled="true"
>                      poll-db-millis="30000">
>             <run-from-pool name="pool"/>
>             <run-from-pool name="dwPool"/>
>         </thread-pool>


That configuration will work. That server will service the two pools.


>
> Thanks in advance for your help.  I’ll continue to test the new
> configuration as soon as I can get these answers.


Thank you for taking the time to test this. I have a client requirement
similar to yours, but on a smaller scale - so I am very interested in 
how it all works out.

-Adrian