Posted to dev@felix.apache.org by Bob Paulin <bo...@bobpaulin.com> on 2014/09/15 16:02:48 UTC

Event Admin: Sync Event Blacklist timing

The locking done for the blacklist timing seems to degrade performance
significantly when Felix is under stress with multiple handler callbacks
firing for each event.  I'd like to discuss an alternative approach with
less locking that still guarantees proper event ordering per the OSGi
spec.  Basically, instead of using the CyclicBarriers (Rendezvous) on a
per-handler basis, we could use a CountDownLatch and only await once,
after all handlers are complete.  Then, instead of a stopwatch-based
timer, we could use the JMX current thread CPU time, which counts CPU
time for the application code and any IO performed on its behalf while
filtering out time spent context switching between threads, to provide
proper blacklisting.  I've created FELIX-4638 with a patch.

Here are my test results.

Baseline (Event Admin 1.4.2):
15 Threads
100000 Async Events per Thread
7 Active Handlers per Event

For a total of 10500000 Handler Events Executed in 40000 - 45000 ms

With the same parameters but a CountDownLatch, I see the execution
time drop to around 25000 ms.  The improvement is noticeable because the
stress test includes 7 active handlers per event.  It is less noticeable
with applications that register only one or two handlers for an active
event, such as in the PerformanceTestIT.  Thoughts on changing how this
locking occurs?  Concerns with using the JMX timings?
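
For reference, the arithmetic behind these figures can be checked with a
few lines.  The class and method names below are made up for this
example; only the input numbers come from the test run described above.

```java
/** Hypothetical helper for checking the stress-test arithmetic. */
final class StressTestMath {

    /** Handler invocations = threads x events per thread x handlers per event. */
    static long totalHandlerEvents(int threads, int eventsPerThread, int handlersPerEvent) {
        return (long) threads * eventsPerThread * handlersPerEvent;
    }

    /** Throughput in handler events per second for a run of the given length. */
    static double eventsPerSecond(long handlerEvents, long elapsedMillis) {
        return handlerEvents * 1000.0 / elapsedMillis;
    }
}
```

With 15 threads, 100000 events per thread and 7 handlers,
totalHandlerEvents gives 10500000.  At 25000 ms that is 420000 handler
events per second, compared with 262500 at 40000 ms.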

- Bob

Re: Event Admin: Sync Event Blacklist timing

Posted by Bob Paulin <bo...@bobpaulin.com>.

Carsten,

I paired the CountDownLatch with the JMX timings when I saw some
events getting blacklisted due to thread scheduling rather than the
actual time spent executing code in the handler.

For example, it appeared that in some cases a thread starts executing a
handler event, far enough to record the start time, and then, through no
fault of the handler, goes to sleep.  The thread wakes up after some time
and executes the handler task, but the handler appears to have taken
more time than it really has, so it is blacklisted incorrectly.  The JMX
timings do not include the time that the thread is unscheduled, since
they are based on the CPU running time and IO time (assuming this JMX
feature is supported).  I have it coded to fall back to
System.currentTimeMillis() when this is not available.
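
A minimal sketch of such a timer with a fallback, using the standard
java.lang.management.ThreadMXBean API.  The class HandlerTimer and its
methods are hypothetical and are not the code in the patch.  Note that
getCurrentThreadCpuTime() is documented as returning the CPU time of the
current thread in nanoseconds, so time the thread spends sleeping or
descheduled is not charged to it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical sketch; not the timer used in the FELIX-4638 patch. */
final class HandlerTimer {
    private static final ThreadMXBean THREADS = ManagementFactory.getThreadMXBean();

    /** True when the JVM supports, and has enabled, CPU timing for the current thread. */
    static boolean cpuTimeAvailable() {
        return THREADS.isCurrentThreadCpuTimeSupported() && THREADS.isThreadCpuTimeEnabled();
    }

    /** Current reading in nanoseconds: thread CPU time if available, else wall clock. */
    static long nowNanos() {
        if (cpuTimeAvailable()) {
            return THREADS.getCurrentThreadCpuTime();
        }
        // Fallback: wall-clock time, which also counts time spent descheduled.
        return System.currentTimeMillis() * 1_000_000L;
    }

    /** Milliseconds charged to the task; must be called on the thread that runs it. */
    static long measureMillis(Runnable task) {
        long start = nowNanos();
        task.run();
        return (nowNanos() - start) / 1_000_000L;
    }
}
```

Because the reading is per thread, both samples have to be taken on the
thread that executes the handler.  A task that only sleeps should be
charged close to zero when CPU timing is available and roughly its full
sleep time when the fallback is used.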

Blacklisting only after handler completion is certainly a problem.
I've attached a new patch to the issue that creates a smarter
CountDownLatch (BlacklistLatch) to deal with it.  It does make the
change more involved, as the blacklist checking occurs on the calling
thread at a specific timeout interval.  Please let me know what your
thoughts are on this approach.  Thanks!

- Bob

On 9/16/2014 10:30 AM, Carsten Ziegeler wrote:
> Hi Bob,
>
> Yes, I agree using a CountDownLatch seems to be the better option. I'm not
> sure about the JMX timings though.
> However, with your patch in place, there is a difference in blacklisting.
> Right now, a handler is blacklisted immediately if the timeout is reached;
> this avoids sending new events to that handler while the current event is
> still processed by other handlers. With your patch, the handler is only
> blacklisted once it's finished (at least I think this is the case).
>
> Regards
> Carsten

Re: Event Admin: Sync Event Blacklist timing

Posted by Carsten Ziegeler <cz...@apache.org>.

Hi Bob,

Yes, I agree using a CountDownLatch seems to be the better option. I'm not
sure about the JMX timings though.
However, with your patch in place, there is a difference in blacklisting.
Right now, a handler is blacklisted immediately if the timeout is reached;
this avoids sending new events to that handler while the current event is
still processed by other handlers. With your patch, the handler is only
blacklisted once it's finished (at least I think this is the case).

Regards
Carsten


-- 
Carsten Ziegeler
Adobe Research Switzerland
cziegeler@apache.org