Posted to dev@felix.apache.org by David Bosschaert <da...@gmail.com> on 2015/05/13 15:02:01 UTC

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

I have implemented the performance improvements that I was thinking of
using Java 5 concurrency tools; they can be viewed at [1].
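
As a rough illustration of the general idea (a simplified sketch only,
not the code in the commit; TinyRegistry and its methods are hypothetical
names invented for this example), a map guarded by synchronized blocks
can be replaced by a ConcurrentHashMap holding CopyOnWriteArrayList
values, so that lookups take no registry-wide lock:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; this is not the Felix ServiceRegistry.
class TinyRegistry {
    // Class name -> services registered under that name.
    private final ConcurrentMap<String, List<Object>> services =
        new ConcurrentHashMap<String, List<Object>>();

    // Registers a service without taking a registry-wide lock.
    void register(String className, Object service) {
        List<Object> list = services.get(className);
        if (list == null) {
            List<Object> fresh = new CopyOnWriteArrayList<Object>();
            // putIfAbsent returns the existing list if another thread won the race.
            list = services.putIfAbsent(className, fresh);
            if (list == null) {
                list = fresh;
            }
        }
        list.add(service);
    }

    // Iterating the returned list works on a snapshot of the underlying
    // CopyOnWriteArrayList, so a concurrent register() cannot make the
    // iteration throw ConcurrentModificationException.
    List<Object> lookup(String className) {
        List<Object> list = services.get(className);
        if (list == null) {
            return Collections.<Object>emptyList();
        }
        return Collections.unmodifiableList(list);
    }
}
```

The trade-off in this sketch is that every write copies the list, which
only pays off when lookups are much more frequent than registrations.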

I wrote a little performance test suite [2] that tests multithreaded
service registry performance (10 threads) from single / multiple
bundles with either singleton services or Prototype Service Factory
services, and the results are quite impressive. I'm getting performance
improvements compared to the current trunk from 8 times better than
the original (800%) to more than 30 times better (3000%).

Carsten has already reviewed the code (thanks Carsten!) and I'm
planning to commit it to Felix tomorrow if nobody objects.

Cheers,

David

[1] https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
[2] https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf

On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org> wrote:
> On 3/23/15 10:17 , David Bosschaert wrote:
>>
>> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org> wrote:
>>>
>>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>>>>
>>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure if
>>>> it
>>>> can be the culprit though.
>>>> Interrupts could also be caused by a bundle being shutdown while one of
>>>> its
>>>> thread is waiting for a service, which should is a valid use case imho.
>>>> Anyway, I think sanely reacting to a thread being interrupted would be
>>>> good.
>>>
>>>
>>> Yes, threads can be interrupted if they are holding a bundle lock and the
>>> global lock holder needs the bundle lock.
>>>
>>> I admit that I do not recall why we ignore the interrupt here, but didn't
>>> we
>>> implement service lookup so that a bundle lock wasn't necessary? I
>>> thought
>>> we just checked for the validity of the bundle context before returning
>>> or
>>> something. Perhaps we felt there was no reason to be interrupted in that
>>> case. I really don't know.
>>
>> I think that the Service Registry could be rewritten to be completely
>> free of synchronized blocks using the Java 5 concurrency libraries,
>
>
> Well, that just moves the sync blocks to the library, but yeah sure.
>
>> which I think would really be a better approach. There is too much
>> locking going on in the current SR implementation IMHO.
>
>
> I don't really think there is too much, but it is complicated.
> Unfortunately, it is complicated to make sure that locks aren't held while
> do service lookups and this is complicated because you can run into cycles,
> etc.
>
> But feel free to try to simplify it.
>
>>
>> This brings the question: can we move to Java 5 (or Java 6) for the
>> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
>> I would be surprised if there is anyone who still needs a JDK that
>> went end-of-life 7 years ago.
>
>
> At this point, it doesn't really matter to me.
>
> -> richard
>
>>
>> Best regards,
>>
>> David
>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
Oops, I just realised that it doesn't make sense to include the SCR test
bundle in the benchmark tool, since it's not a concurrent test.

So for now, I have clues about where the problem may come from.


cheers;
/Pierre

On Thu, May 14, 2015 at 7:54 PM, Pierre De Rop <pi...@gmail.com>
wrote:

> Thanks David; I just gave it a try, and indeed the parallel test passed. I
> observed a gain of around 7/10%. The tool is described in [1].
>
> But I only have 4 cores on my laptop and I will run more tests in my lab
> at work (next week), where we have some servers with 32 or even 128
> processors. This will give a better idea of the gain because the more
> processors you have, the more costly synchronization is, so I could possibly
> observe a better performance gain.
>
> Now, I'm sorry but I think that there is still a problem (I don't know
> where): when using more threads, the parallel test does not complete and
> stops with a timeout message, indicating that the expected number of
> components was not created within the timeout delay of 1 minute.
>
> So, I just committed a modified version of the tool in the sandbox which
> can now take a -Dthreads option in order to configure the number of
> threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
> not complete and ends with a timeout:
>
> $ java -Dthreads=10 -server -jar bin/felix.jar
>
> g! Starting benchmarks (each tested bundle will add/remove 630 components
> during bundle activation).
>
>         [Starting benchmarks with no processing done in components start
> methods]
>
> Benchmarking bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .................................................Could not start components
> timely: current start latch=2, stop latch=630
>
> My current understanding of this is that some components are still
> waiting for unsatisfied service dependencies, as if a service
> tracker had missed a service registration.
>
> I ran the same test for two hours with the previous framework version,
> and did not observe any problems.
>
> I wonder if someone else has another tool to perform another
> kind of load test, just to see if some problems are also observed.
>
> -> From my side, I will do the following: in the past, the benchmark tool
> supported not only dependencymanager, but also Felix SCR and iPojo. So, I
> will reintroduce Felix SCR in the benchmark and will check if I also
> observe the problem (with -Dthreads=10).
>
> I will let you know.
>
> cheers;
> /Pierre
>
> [1]
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
>
> On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> david.bosschaert@gmail.com> wrote:
>
>> I've fixed this now in
>> svn.apache.org/viewvc?view=revision&revision=1679367
>>
>> Pierre, your loadtest now runs to completion - thanks for reporting
>> this issue! I can see that the results for the parallel tests are a
>> little bit different than before, but I'm not sure how to read them so
>> I'll leave the interpretation of that to you :)
>>
>> Cheers,
>>
>> David
>>
>> On 14 May 2015 at 14:38, David Bosschaert <da...@gmail.com>
>> wrote:
>> > I think I know what this is. I had some additional changes exactly in
>> > this area that I simply forgot to apply this morning. I should have it
>> > fixed sometime today.
>> >
>> > Cheers,
>> >
>> > David
>> >
>> > On 14 May 2015 at 14:03, David Bosschaert <da...@gmail.com>
>> wrote:
>> >> Hi Pierre,
>> >>
>> >> I'll take a look today.
>> >>
>> >> Cheers,
>> >>
>> >> David
>> >>
>> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com> wrote:
>> >>> I just committed the benchmark tool in
>> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you
>> can
>> >>> take a look.
>> >>>
>> >>> To run the scenario:
>> >>>
>> >>> - install jdk8:
>> >>>
>> >>> [nxuser@nx0012 pderop]$ java -version
>> >>> java version "1.8.0_40"
>> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>> >>>
>> >>> - checkout the loadtest from
>> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>> >>>
>> >>> - go the the "loadtest" directory and start the test, just like this:
>> >>>
>> >>> $ java -server -jar bin/felix.jar
>> >>> Welcome to Apache Felix Gogo
>> >>>
>> >>> g! Starting benchmarks (each tested bundle will add/remove 630
>> components
>> >>> during bundle activation).
>> >>>
>> >>>         [Starting benchmarks with no processing done in components
>> start
>> >>> methods]
>> >>>
>> >>> Benchmarking bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>> >>> ..................................................
>> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 |
>> 319,631,722
>> >>> | 919,838,078]
>> >>>
>> >>> Benchmarking bundle:
>> >>>
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >>>
>> >>>
>> >>> Here, the first
>> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
>> >>> (single-threaded) passes OK. But the next one hangs
>> >>>
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
>> >>> it uses a fork join pool with size=4.
>> >>>
>> >>> and when typing "log warn", we see:
>> >>>
>> >>> "log warn"
>> >>>
>> >>> 2015.05.14 13:56:10 ERROR - Bundle:
>> >>>
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >>> java.util.ConcurrentModificationException
>> >>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >>>         at
>> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >>>         at
>> >>>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >>>         at
>> >>>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >>>         at
>> >>>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >>>         at
>> >>>
>> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >>>         at
>> >>> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >>>         at
>> >>>
>> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >>>         at
>> >>>
>> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >>>         at
>> >>>
>> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >>>         at
>> >>>
>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >>>         at
>> >>>
>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >>>         at
>> >>>
>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >>>         at
>> >>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >>>         at
>> >>>
>> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >>>         at
>> >>>
>> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >>>         at
>> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >>>         at
>> >>>
>> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >>>         at
>> >>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >>>         at
>> >>>
>> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >>>
>> >>>
>> >>> (I will investigate also in my code to check if the problem does not
>> come
>> >>> from me ?)
>> >>>
>> >>> cheers;
>> >>> /Pierre
>> >>>
>> >>>
>> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <
>> pierre.derop@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Hi David,
>> >>>>
>> >>>> I don't know if it's me (a bug in my benchmark tool) or if if there
>> is a
>> >>>> regression somewhere in the framework, by my parallel test does not
>> pass
>> >>>> anymore.
>> >>>>
>> >>>> The test first starts with a single-threaded scenario, which passes
>> OK
>> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager),
>> then when
>> >>>> the parallel test starts
>> >>>>
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>> >>>> it suddenly hangs, and when I type "log warn" under the gogo shell,
>> I see
>> >>>> the following exception:
>> >>>>
>> >>>> (I'm using java8):
>> >>>>
>> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>> >>>> ____________________________
>> >>>> Welcome to Apache Felix Gogo
>> >>>>
>> >>>> Benchmarking bundle:
>> >>>>
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >>>>
>> >>>> (here, the dependencymanager.parallel test hangs and when I type "log
>> >>>> warn", I see this:)
>> >>>>
>> >>>> g! log warn
>> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
>> >>>>
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >>>> java.util.ConcurrentModificationException
>> >>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >>>>         at
>> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >>>>         at
>> >>>>
>> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >>>>         at
>> >>>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >>>>         at
>> >>>>
>> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >>>>         at
>> >>>>
>> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >>>>         at
>> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >>>>         at
>> >>>>
>> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >>>>         at
>> >>>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >>>>         at
>> >>>>
>> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >>>>
>> >>>> (If I configure my threadpool to 1, I have no problems, but with
>> >>>> threadpool=4, then I have the problem)
>> >>>>
>> >>>> I will investigate, but Ideally, may be it would be helpful if you
>> could
>> >>>> also run the test by yourself; so I will commit soon something to
>> reproduce
>> >>>> the problem in my sandbox.
>> >>>>
>> >>>> cheers;
>> >>>> /Pierre
>> >>>>
>> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>> >>>> david.bosschaert@gmail.com> wrote:
>> >>>>
>> >>>>> I've committed this now in
>> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>> >>>>>
>> >>>>> Curious to see what others are measuring. My tests were focused on
>> >>>>> multiple bundles/threads obtaining the same service, as that's were
>> I
>> >>>>> saw a bit of contention.
>> >>>>>
>> >>>>> Cheers,
>> >>>>>
>> >>>>> David
>> >>>>>
>> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com>
>> wrote:
>> >>>>> > Hi David,
>> >>>>> >
>> >>>>> > I'm looking forward to test your improvements using the
>> >>>>> dependencymanager
>> >>>>> > benchmark tool ([1]).
>> >>>>> >
>> >>>>> >
>> >>>>> > [1]
>> >>>>> >
>> >>>>>
>> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>> >>>>> >
>> >>>>> > /Pierre
>> >>>>> >
>> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>> >>>>> > david.bosschaert@gmail.com> wrote:
>> >>>>> >
>> >>>>> >> I have implemented the performance improvements that I was
>> thinking of
>> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
>> >>>>> >>
>> >>>>> >> I wrote a little performance test suite [2] that tests
>> multithreaded
>> >>>>> >> service registry performance (10 threads) from single / multiple
>> >>>>> >> bundles with either singleton services and Prototype Service
>> Factory
>> >>>>> >> services and the results are quite impressive. I'm getting
>> performance
>> >>>>> >> improvements compared to the current trunk from 8 times better
>> than
>> >>>>> >> the original (800%) to more than 30 times better (3000%).
>> >>>>> >>
>> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
>> >>>>> >>
>> >>>>> >> Cheers,
>> >>>>> >>
>> >>>>> >> David
>> >>>>> >>
>> >>>>> >> [1]
>> >>>>> >>
>> >>>>>
>> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> >>>>> >> [2]
>> >>>>> >>
>> >>>>>
>> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>> >>>>> >>
>> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <heavy@ungoverned.org
>> >
>> >>>>> wrote:
>> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >>>>> >> >>
>> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
>> heavy@ungoverned.org>
>> >>>>> >> wrote:
>> >>>>> >> >>>
>> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >>>>> >> >>>>
>> >>>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(),
>> not
>> >>>>> sure
>> >>>>> >> if
>> >>>>> >> >>>> it
>> >>>>> >> >>>> can be the culprit though.
>> >>>>> >> >>>> Interrupts could also be caused by a bundle being shutdown
>> while
>> >>>>> one
>> >>>>> >> of
>> >>>>> >> >>>> its
>> >>>>> >> >>>> thread is waiting for a service, which should is a valid
>> use case
>> >>>>> >> imho.
>> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being
>> interrupted
>> >>>>> would be
>> >>>>> >> >>>> good.
>> >>>>> >> >>>
>> >>>>> >> >>>
>> >>>>> >> >>> Yes, threads can be interrupted if they are holding a bundle
>> lock
>> >>>>> and
>> >>>>> >> the
>> >>>>> >> >>> global lock holder needs the bundle lock.
>> >>>>> >> >>>
>> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
>> here, but
>> >>>>> >> didn't
>> >>>>> >> >>> we
>> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
>> necessary? I
>> >>>>> >> >>> thought
>> >>>>> >> >>> we just checked for the validity of the bundle context before
>> >>>>> returning
>> >>>>> >> >>> or
>> >>>>> >> >>> something. Perhaps we felt there was no reason to be
>> interrupted in
>> >>>>> >> that
>> >>>>> >> >>> case. I really don't know.
>> >>>>> >> >>
>> >>>>> >> >> I think that the Service Registry could be rewritten to be
>> >>>>> completely
>> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
>> libraries,
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > Well, that just moves the sync blocks to the library, but yeah
>> sure.
>> >>>>> >> >
>> >>>>> >> >> which I think would really be a better approach. There is too
>> much
>> >>>>> >> >> locking going on in the current SR implementation IMHO.
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > I don't really think there is too much, but it is complicated.
>> >>>>> >> > Unfortunately, it is complicated to make sure that locks
>> aren't held
>> >>>>> >> while
>> >>>>> >> > do service lookups and this is complicated because you can run
>> into
>> >>>>> >> cycles,
>> >>>>> >> > etc.
>> >>>>> >> >
>> >>>>> >> > But feel free to try to simplify it.
>> >>>>> >> >
>> >>>>> >> >>
>> >>>>> >> >> This brings the question: can we move to Java 5 (or Java 6)
>> for the
>> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
>> compatible
>> >>>>> but
>> >>>>> >> >> I would be surprised if there is anyone who still needs a JDK
>> that
>> >>>>> >> >> went end-of-life 7 years ago.
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > At this point, it doesn't really matter to me.
>> >>>>> >> >
>> >>>>> >> > -> richard
>> >>>>> >> >
>> >>>>> >> >>
>> >>>>> >> >> Best regards,
>> >>>>> >> >>
>> >>>>> >> >> David
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >>
>> >>>>>
>> >>>>
>> >>>>
>>
>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
ok, it's a bit late, I will continue tomorrow.

What I just found is that when the test fails, we are in the following
situation:
A DM component C1 that is part of the test remains inactive because it is
waiting for a service dependency on C2.
But C2 is actually registered in the OSGi service registry (I verified it
using the "inspect capability service" gogo command).

And it looks like the service tracker used by C1 to track C2 has never had
its addingService callback invoked for C2.
That is why C1 remains inactive, and this makes the test fail (I added
some debug code in dependency manager in order to verify this).
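
For illustration, one classic way a tracker can miss a registration is
reading the initial set of services before its listener is in place. I do
not know whether that is what happens here; the sketch below only shows
the listener-first ordering that avoids this kind of gap. MiniRegistry,
MiniTracker and Listener are hypothetical stand-ins, not the OSGi or
Felix APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-ins; these are not the OSGi or Felix APIs.
interface Listener {
    void registered(String service);
}

class MiniRegistry {
    private final List<String> services = new CopyOnWriteArrayList<String>();
    private final List<Listener> listeners = new CopyOnWriteArrayList<Listener>();

    void addListener(Listener l) {
        listeners.add(l);
    }

    // Copy of the services registered so far.
    List<String> snapshot() {
        return new ArrayList<String>(services);
    }

    // Publishes the service first, then notifies the listeners.
    void register(String service) {
        services.add(service);
        for (Listener l : listeners) {
            l.registered(service);
        }
    }
}

class MiniTracker implements Listener {
    // Services already reported to adding(); guards against double delivery.
    private final Set<String> tracked = ConcurrentHashMap.newKeySet();
    final List<String> added = new CopyOnWriteArrayList<String>();

    void open(MiniRegistry registry) {
        // Listen first, then read the initial set: a registration that lands
        // between the two steps is seen by at least one of them.
        registry.addListener(this);
        for (String s : registry.snapshot()) {
            adding(s);
        }
    }

    public void registered(String service) {
        adding(service);
    }

    private void adding(String service) {
        // Set.add returns false for a service that was already delivered.
        if (tracked.add(service)) {
            added.add(service);
        }
    }
}
```

With the opposite order (snapshot first, listener second), a service
registered between the two steps would appear in neither, which would
look exactly like a component waiting forever on a registered service.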

So, it will be difficult to write an integration test, but I think there is
still a problem somewhere in the framework.
I also ran the DM tests and I now have 4 failing tests.

I will continue to investigate tomorrow if I can.

cheers;
/Pierre



On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> Hi Pierre,
>
> It would indeed be useful to find out more about why your test is
> hanging. Maybe analysing a thread dump might give some more
> information?
>
> Cheers,
>
> David
>
> On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com> wrote:
> > Thanks David; I just gave it a try, and indeed the parallel test passed. I
> > observed a gain of around 7/10%. The tool is described in [1].
> >
> > But I only have 4 cores on my laptop and I will run more tests in my lab
> > at work (next week), where we have some servers with 32 or even 128
> > processors. This will give a better idea of the gain because the more
> > processors you have, the more costly synchronization is, so I could
> > possibly observe a better performance gain.
> >
> > Now, I'm sorry but I think that there is still a problem (I don't know
> > where): when using more threads, the parallel test does not complete and
> > stops with a timeout message, indicating that the expected number of
> > components was not created within the timeout delay of 1 minute.
> >
> > So, I just committed a modified version of the tool in the sandbox which
> > can now take a -Dthreads option in order to configure the number of
> > threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
> > not complete and ends with a timeout:
> >
> > $ java -Dthreads=10 -server -jar bin/felix.jar
> >
> > g! Starting benchmarks (each tested bundle will add/remove 630 components
> > during bundle activation).
> >
> >         [Starting benchmarks with no processing done in components start
> > methods]
> >
> > Benchmarking bundle:
> > org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> > .................................................Could not start
> components
> > timely: current start latch=2, stop latch=630
> >
> > My current understanding of this is that some components are still
> > waiting for unsatisfied service dependencies, as if a service tracker
> > had missed a service registration.
> >
> > I ran the same test for two hours with the previous framework version,
> > and did not observe any problems.
> >
> > I wonder if someone else has another tool to perform another
> > kind of load test, just to see if some problems are also observed.
> >
> > -> From my side, I will do the following: in the past, the benchmark
> > tool supported not only dependencymanager, but also Felix SCR and iPojo.
> > So, I will reintroduce Felix SCR in the benchmark and will check if I
> > also observe the problem (with -Dthreads=10).
> >
> > I will let you know.
> >
> > cheers;
> > /Pierre
> >
> > [1]
> >
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
> >
> > On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> > david.bosschaert@gmail.com> wrote:
> >
> >> I've fixed this now in
> >> svn.apache.org/viewvc?view=revision&revision=1679367
> >>
> >> Pierre, your loadtest now runs to completion - thanks for reporting
> >> this issue! I can see that the results for the parallel tests are a
> >> little bit different than before, but I'm not sure how to read them so
> >> I'll leave the interpretation of that to you :)
> >>
> >> Cheers,
> >>
> >> David
> >>
> >> On 14 May 2015 at 14:38, David Bosschaert <da...@gmail.com>
> >> wrote:
> >> > I think I know what this is. I had some additional changes exactly in
> >> > this area that I simply forgot to apply this morning. I should have it
> >> > fixed sometime today.
> >> >
> >> > Cheers,
> >> >
> >> > David
> >> >
> >> > On 14 May 2015 at 14:03, David Bosschaert <david.bosschaert@gmail.com
> >
> >> wrote:
> >> >> Hi Pierre,
> >> >>
> >> >> I'll take a look today.
> >> >>
> >> >> Cheers,
> >> >>
> >> >> David
> >> >>
> >> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com>
> wrote:
> >> >>> I just committed the benchmark tool in
> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you
> >> can
> >> >>> take a look.
> >> >>>
> >> >>> To run the scenario:
> >> >>>
> >> >>> - install jdk8:
> >> >>>
> >> >>> [nxuser@nx0012 pderop]$ java -version
> >> >>> java version "1.8.0_40"
> >> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> >> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
> >> >>>
> >> >>> - checkout the loadtest from
> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
> >> >>>
> >> >>> - go the the "loadtest" directory and start the test, just like
> this:
> >> >>>
> >> >>> $ java -server -jar bin/felix.jar
> >> >>> Welcome to Apache Felix Gogo
> >> >>>
> >> >>> g! Starting benchmarks (each tested bundle will add/remove 630
> >> components
> >> >>> during bundle activation).
> >> >>>
> >> >>>         [Starting benchmarks with no processing done in components
> >> start
> >> >>> methods]
> >> >>>
> >> >>> Benchmarking bundle:
> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
> >> >>> ..................................................
> >> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 |
> >> 319,631,722
> >> >>> | 919,838,078]
> >> >>>
> >> >>> Benchmarking bundle:
> >> >>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .
> >> >>>
> >> >>>
> >> >>> Here, the first
> >> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager"
> test
> >> >>> (single-threaded) passes OK. But the next one hangs
> >> >>>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
> >> >>> it uses a fork join pool with size=4.
> >> >>>
> >> >>> and when typing "log warn", we see:
> >> >>>
> >> >>> "log warn"
> >> >>>
> >> >>> 2015.05.14 13:56:10 ERROR - Bundle:
> >> >>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> -
> >> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >>> java.util.ConcurrentModificationException
> >> >>>         at
> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >>>         at
> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >>>         at
> >> >>>
> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >>>         at
> >> >>>
> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >>>         at
> >> >>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >>>         at
> >> >>>
> >>
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >>>         at
> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >>>         at
> >> >>>
> >>
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >>>         at
> >> >>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >>>         at
> >> >>>
> >>
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >>>
> >> >>>
> >> >>> (I will also investigate my own code, to check whether the problem
> >> >>> comes from me.)
> >> >>>
> >> >>> cheers;
> >> >>> /Pierre
> >> >>>
> >> >>>
> >> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <
> pierre.derop@gmail.com
> >> >
> >> >>> wrote:
> >> >>>
> >> >>>> Hi David,
> >> >>>>
> >> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
> >> >>>> regression somewhere in the framework, but my parallel test does not pass
> >> >>>> anymore.
> >> >>>>
> >> >>>> The test first starts with a single-threaded scenario, which passes OK
> >> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager); then, when
> >> >>>> the parallel test starts
> >> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel),
> >> >>>> it suddenly hangs, and when I type "log warn" in the Gogo shell, I see
> >> >>>> the following exception:
> >> >>>>
> >> >>>> (I'm using java8):
> >> >>>>
> >> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
> >> >>>> ____________________________
> >> >>>> Welcome to Apache Felix Gogo
> >> >>>>
> >> >>>> Benchmarking bundle:
> >> >>>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .
> >> >>>>
> >> >>>> (here, the dependencymanager.parallel test hangs and when I type
> "log
> >> >>>> warn", I see this:)
> >> >>>>
> >> >>>> g! log warn
> >> >>>> 2015.05.14 13:31:03 ERROR - Bundle: org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> >> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >>>> java.util.ConcurrentModificationException
> >> >>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >>>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >>>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >>>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >>>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >>>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >>>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >>>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >>>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >>>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >>>>
> >> >>>> (If I configure my threadpool to 1, I have no problems, but with
> >> >>>> threadpool=4, then I have the problem)
> >> >>>>
> >> >>>> I will investigate, but ideally it would be helpful if you could
> >> >>>> also run the test yourself, so I will soon commit something to reproduce
> >> >>>> the problem in my sandbox.
> >> >>>>
> >> >>>> cheers;
> >> >>>> /Pierre
> >> >>>>
> >> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
> >> >>>> david.bosschaert@gmail.com> wrote:
> >> >>>>
> >> >>>>> I've committed this now in
> >> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
> >> >>>>>
> >> >>>>> Curious to see what others are measuring. My tests were focused on
> >> >>>>> multiple bundles/threads obtaining the same service, as that's
> >> >>>>> where I saw a bit of contention.
> >> >>>>>
> >> >>>>> Cheers,
> >> >>>>>
> >> >>>>> David
> >> >>>>>
> >> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com>
> >> wrote:
> >> >>>>> > Hi David,
> >> >>>>> >
> >> >>>>> > I'm looking forward to test your improvements using the
> >> >>>>> dependencymanager
> >> >>>>> > benchmark tool ([1]).
> >> >>>>> >
> >> >>>>> >
> >> >>>>> > [1]
> >> >>>>> >
> >> >>>>>
> >>
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
> >> >>>>> >
> >> >>>>> > /Pierre
> >> >>>>> >
> >> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> >> >>>>> > david.bosschaert@gmail.com> wrote:
> >> >>>>> >
> >> >>>>> >> I have implemented the performance improvements that I was
> >> thinking of
> >> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
> >> >>>>> >>
> >> >>>>> >> I wrote a little performance test suite [2] that tests multithreaded
> >> >>>>> >> service registry performance (10 threads) from single / multiple
> >> >>>>> >> bundles with either singleton services or Prototype Service Factory
> >> >>>>> >> services and the results are quite impressive. I'm getting performance
> >> >>>>> >> improvements compared to the current trunk from 8 times better than
> >> >>>>> >> the original (800%) to more than 30 times better (3000%).
> >> >>>>> >>
> >> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
> >> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
> >> >>>>> >>
> >> >>>>> >> Cheers,
> >> >>>>> >>
> >> >>>>> >> David
> >> >>>>> >>
> >> >>>>> >> [1]
> >> >>>>> >>
> >> >>>>>
> >>
> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> >> >>>>> >> [2]
> >> >>>>> >>
> >> >>>>>
> >>
> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
> >> >>>>> >>
> >> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <
> heavy@ungoverned.org>
> >> >>>>> wrote:
> >> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
> >> >>>>> >> >>
> >> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
> >> heavy@ungoverned.org>
> >> >>>>> >> wrote:
> >> >>>>> >> >>>
> >> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >> >>>>> >> >>>>
> >> >>>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure if
> >> >>>>> >> >>>> it can be the culprit though.
> >> >>>>> >> >>>> Interrupts could also be caused by a bundle being shut down while one of
> >> >>>>> >> >>>> its threads is waiting for a service, which is a valid use case imho.
> >> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
> >> >>>>> >> >>>> good.
> >> >>>>> >> >>>
> >> >>>>> >> >>>
> >> >>>>> >> >>> Yes, threads can be interrupted if they are holding a
> bundle
> >> lock
> >> >>>>> and
> >> >>>>> >> the
> >> >>>>> >> >>> global lock holder needs the bundle lock.
> >> >>>>> >> >>>
> >> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
> >> here, but
> >> >>>>> >> didn't
> >> >>>>> >> >>> we
> >> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
> >> necessary? I
> >> >>>>> >> >>> thought
> >> >>>>> >> >>> we just checked for the validity of the bundle context
> before
> >> >>>>> returning
> >> >>>>> >> >>> or
> >> >>>>> >> >>> something. Perhaps we felt there was no reason to be
> >> interrupted in
> >> >>>>> >> that
> >> >>>>> >> >>> case. I really don't know.
> >> >>>>> >> >>
> >> >>>>> >> >> I think that the Service Registry could be rewritten to be
> >> >>>>> completely
> >> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
> >> libraries,
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >> > Well, that just moves the sync blocks to the library, but
> yeah
> >> sure.
> >> >>>>> >> >
> >> >>>>> >> >> which I think would really be a better approach. There is
> too
> >> much
> >> >>>>> >> >> locking going on in the current SR implementation IMHO.
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >> > I don't really think there is too much, but it is complicated.
> >> >>>>> >> > Unfortunately, it is complicated to make sure that locks aren't held
> >> >>>>> >> > while doing service lookups, and this is complicated because you can run
> >> >>>>> >> > into cycles, etc.
> >> >>>>> >> >
> >> >>>>> >> > But feel free to try to simplify it.
> >> >>>>> >> >
> >> >>>>> >> >>
> >> >>>>> >> >> This brings the question: can we move to Java 5 (or Java 6)
> >> for the
> >> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
> >> compatible
> >> >>>>> but
> >> >>>>> >> >> I would be surprised if there is anyone who still needs a
> JDK
> >> that
> >> >>>>> >> >> went end-of-life 7 years ago.
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >> > At this point, it doesn't really matter to me.
> >> >>>>> >> >
> >> >>>>> >> > -> richard
> >> >>>>> >> >
> >> >>>>> >> >>
> >> >>>>> >> >> Best regards,
> >> >>>>> >> >>
> >> >>>>> >> >> David
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >>
> >> >>>>>
> >> >>>>
> >> >>>>
> >>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
Yes David, indeed our scenarios are different. In mine, I'm measuring the
assembly of components using the parallel Dependency Manager, where components
are concurrently activated, registered, and bound to each other using a
shared thread pool.
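
To make the scenario above concrete, here is a minimal Java sketch of the
pattern: many components registered and looked up concurrently from a shared
thread pool, with a start latch and a one-minute timeout. It is only an
illustration under stated assumptions: the class ParallelRegistrySketch and
its register/lookupCount/run helpers are hypothetical, are not taken from
Felix, Dependency Manager, or the benchmark tool, and use a plain
ConcurrentHashMap in place of a real service registry.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRegistrySketch {
    // Hypothetical stand-in for a service registry: service name -> registered instances.
    // Both collections tolerate concurrent reads and writes without external locking.
    private static final Map<String, List<Object>> REGISTRY = new ConcurrentHashMap<>();

    static void register(String name, Object service) {
        // computeIfAbsent is atomic on ConcurrentHashMap, so only one list is created per name.
        REGISTRY.computeIfAbsent(name, k -> new CopyOnWriteArrayList<>()).add(service);
    }

    static int lookupCount(String name) {
        List<Object> services = REGISTRY.get(name);
        return services == null ? 0 : services.size();
    }

    // Registers `components` entries from a pool of `threads` workers and
    // returns how many registrations are visible once all tasks have finished.
    static int run(int threads, int components) {
        REGISTRY.clear();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch started = new CountDownLatch(components);
        try {
            for (int i = 0; i < components; i++) {
                final int id = i;
                pool.execute(() -> {
                    register("svc", id);   // concurrent write
                    lookupCount("svc");    // concurrent read while other tasks write
                    started.countDown();
                });
            }
            // Similar in spirit to the timeout described in the thread: give up after one minute.
            if (!started.await(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException(
                        "Could not start components in time: start latch=" + started.getCount());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for components", e);
        } finally {
            pool.shutdown();
        }
        return lookupCount("svc");
    }

    public static void main(String[] args) {
        System.out.println(run(4, 630)); // prints 630
    }
}
```

With a plain HashMap and ArrayList in place of the concurrent collections,
the same run could lose registrations or fail with a
ConcurrentModificationException once more than one worker thread is used.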

cheers;
/Pierre




On Tue, May 19, 2015 at 3:40 PM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> Hi Pierre,
>
> Good to hear that the problem is now gone.
> I guess the performance improvement measured hugely depends on what
> you are testing. My test focuses on multiple clients/threads/bundles
> accessing the same service (either singleton or PSF) in a very raw
> manner (via ctx.getServiceReference()).
> Good to hear that you're still seeing perf improvements but I guess
> your test exercises a number of other components as well (e.g.
> Dependency Manager) possibly using multiple service registrations, so
> that could very well explain some of the differences in our results...
>
> Cheers,
>
> David
>
> On 19 May 2015 at 13:32, Pierre De Rop <pi...@gmail.com> wrote:
> > Hi David,
> >
> > Excellent.
> >
> > I'm glad to confirm that the issue is resolved, and my DM loader is now
> > running seamlessly.
> > I'm observing an overall gain of 16% compared to the previous 5.0.0.
> > (but this has to be taken with care, because I only made a quick test).
> >
> > I did not have time, but I guess I could observe a better performance gain
> > on a bigger host with more CPUs (I only have four), since synchronization
> > cost is usually proportional to the number of available cores and, as I
> > understand it, your fix is now based on the java.util.concurrent JDK tools.
> >
> >
> > many thanks
> > /Pierre
> >
> > On Tue, May 19, 2015 at 1:57 PM, David Bosschaert <
> > david.bosschaert@gmail.com> wrote:
> >
> >> Thanks Pierre for submitting a unit test to FELIX-4866 that helped me
> >> enormously in identifying the issue.
> >>
> >> I have fixed the bug in my code (without degrading performance) and at
> >> least your concurrency test, my concurrency tests and all the
> >> framework unit tests now consistently pass. I would be very interested
> >> in hearing whether your bigger test suite also still behaves as
> >> expected.
> >>
> >> Best regards,
> >>
> >> David
> >>
> >> On 14 May 2015 at 22:53, Pierre De Rop <pi...@gmail.com> wrote:
> >> > The thread dump did not help.
> >> > I will investigate (maybe there is a bug somewhere on my side; if this is
> >> > the case, I am sorry for making all this noise).
> >> >
> >> > hope to let you know soon.
> >> >
> >> > By the way, do you know how to run the SCR integration tests with the
> >> > framework from the trunk? I know that there are some SCR integration tests
> >> > that do some load testing, and I would be interested to know whether they
> >> > are also OK with the framework from the trunk.
> >> >
> >> > cheers;
> >> > /Pierre
> >> >
> >> >
> >> > On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
> >> > david.bosschaert@gmail.com> wrote:
> >> >
> >> >> Hi Pierre,
> >> >>
> >> >> It would indeed be useful to find out more about why your test is
> >> >> hanging. Maybe analysing a threaddump might give some more
> >> >> information?
> >> >>
> >> >> Cheers,
> >> >>
> >> >> David
> >> >>
> >> >> On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com>
> wrote:
> >> >> > Thanks David; I just gave it a try, and indeed the parallel test passed. I
> >> >> > observed a gain of around 7/10%. The tool is described in [1].
> >> >> >
> >> >> > But I only have 4 cores on my laptop, and I will run more tests in my lab
> >> >> > at work (next week), where we have some servers with 32 or even 128
> >> >> > processors. This will give a better idea of the gain, because the more
> >> >> > processors you have, the more costly synchronization is, so I could
> >> >> > possibly observe a better performance gain.
> >> >> >
> >> >> > Now, I'm sorry, but I think that there is still a problem (I don't know
> >> >> > where): when using more threads, the parallel test does not complete and
> >> >> > stops with a timeout message, indicating that the expected number of
> >> >> > components was not created within the timeout delay of 1 minute.
> >> >> >
> >> >> > So, I just committed a modified version of the tool in the sandbox, which
> >> >> > can now take a -Dthreads option in order to configure the number of
> >> >> > threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
> >> >> > not complete and ends with a timeout:
> >> >> >
> >> >> > $ java -Dthreads=10 -server -jar bin/felix.jar
> >> >> >
> >> >> > g! Starting benchmarks (each tested bundle will add/remove 630 components
> >> >> > during bundle activation).
> >> >> >
> >> >> >         [Starting benchmarks with no processing done in components start methods]
> >> >> >
> >> >> > Benchmarking bundle: org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> >> > .................................................Could not start components
> >> >> > timely: current start latch=2, stop latch=630
> >> >> >
> >> >> > My current understanding of this is that some components are still
> >> >> > waiting for unsatisfied service dependencies, just as if a service tracker
> >> >> > had missed a service registration.
> >> >> >
> >> >> > I ran the same test for two hours with the previous framework version,
> >> >> > and did not observe any problems.
> >> >> >
> >> >> > I wonder if someone else has another tool to perform another
> >> >> > kind of load test, just to see if some problems are also observed.
> >> >> >
> >> >> > -> From my side, I will do the following: in the past, the benchmark tool
> >> >> > supported not only dependencymanager, but also Felix SCR and iPojo. So, I
> >> >> > will reintroduce Felix SCR in the benchmark and will check if I also
> >> >> > observe the problem (with -Dthreads=10).
> >> >> >
> >> >> > I will let you know.
> >> >> >
> >> >> > cheers;
> >> >> > /Pierre
> >> >> >
> >> >> > [1] http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
> >> >> >
> >> >> > On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> >> >> > david.bosschaert@gmail.com> wrote:
> >> >> >
> >> >> >> I've fixed this now in
> >> >> >> svn.apache.org/viewvc?view=revision&revision=1679367
> >> >> >>
> >> >> >> Pierre, your loadtest now runs to completion - thanks for reporting
> >> >> >> this issue! I can see that the results for the parallel tests are a
> >> >> >> little bit different than before, but I'm not sure how to read them, so
> >> >> >> I'll leave the interpretation of that to you :)
> >> >> >>
> >> >> >> Cheers,
> >> >> >>
> >> >> >> David
> >> >> >>
> >> >> >> On 14 May 2015 at 14:38, David Bosschaert <
> >> david.bosschaert@gmail.com>
> >> >> >> wrote:
> >> >> >> > I think I know what this is. I had some additional changes
> exactly
> >> in
> >> >> >> > this area that I simply forgot to apply this morning. I should
> >> have it
> >> >> >> > fixed sometime today.
> >> >> >> >
> >> >> >> > Cheers,
> >> >> >> >
> >> >> >> > David
> >> >> >> >
> >> >> >> > On 14 May 2015 at 14:03, David Bosschaert <
> >> david.bosschaert@gmail.com
> >> >> >
> >> >> >> wrote:
> >> >> >> >> Hi Pierre,
> >> >> >> >>
> >> >> >> >> I'll take a look today.
> >> >> >> >>
> >> >> >> >> Cheers,
> >> >> >> >>
> >> >> >> >> David
> >> >> >> >>
> >> >> >> >> On 14 May 2015 at 14:00, Pierre De Rop <pierre.derop@gmail.com
> >
> >> >> wrote:
> >> >> >> >>> I just committed the benchmark tool in
> >> >> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/,
> if
> >> you
> >> >> >> can
> >> >> >> >>> take a look.
> >> >> >> >>>
> >> >> >> >>> To run the scenario:
> >> >> >> >>>
> >> >> >> >>> - install jdk8:
> >> >> >> >>>
> >> >> >> >>> [nxuser@nx0012 pderop]$ java -version
> >> >> >> >>> java version "1.8.0_40"
> >> >> >> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> >> >> >> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed
> mode)
> >> >> >> >>>
> >> >> >> >>> - checkout the loadtest from
> >> >> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
> >> >> >> >>>
> >> >> >> >>> - go the the "loadtest" directory and start the test, just
> like
> >> >> this:
> >> >> >> >>>
> >> >> >> >>> $ java -server -jar bin/felix.jar
> >> >> >> >>> Welcome to Apache Felix Gogo
> >> >> >> >>>
> >> >> >> >>> g! Starting benchmarks (each tested bundle will add/remove 630
> >> >> >> components
> >> >> >> >>> during bundle activation).
> >> >> >> >>>
> >> >> >> >>>         [Starting benchmarks with no processing done in
> >> components
> >> >> >> start
> >> >> >> >>> methods]
> >> >> >> >>>
> >> >> >> >>> Benchmarking bundle:
> >> >> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
> >> >> >> >>> ..................................................
> >> >> >> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581
> |
> >> >> >> 319,631,722
> >> >> >> >>> | 919,838,078]
> >> >> >> >>>
> >> >> >> >>> Benchmarking bundle:
> >> >> >> >>>
> >> >> >>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> >> .
> >> >> >> >>>
> >> >> >> >>>
> >> >> >> >>> Here, the first
> >> >> >> >>>
> "org.apache.felix.dependencymanager.benchmark.dependencymanager"
> >> >> test
> >> >> >> >>> (single-threaded) passes OK. But the next one hangs
> >> >> >> >>>
> >> >> >>
> >> >>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
> >> >> >> >>> it uses a fork join pool with size=4.
> >> >> >> >>>
> >> >> >> >>> and when typing "log warn", we see:
> >> >> >> >>>
> >> >> >> >>> "log warn"
> >> >> >> >>>
> >> >> >> >>> 2015.05.14 13:56:10 ERROR - Bundle: org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> >> >> >> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >> >> >>> java.util.ConcurrentModificationException
> >> >> >> >>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >> >> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >> >> >>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >> >> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >> >> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >> >> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >> >> >>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >> >> >>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >> >> >>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >> >> >>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >> >> >>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >> >> >>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >> >> >>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >> >> >>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >> >> >>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >> >> >>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >> >> >>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >> >> >>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >> >> >>>
> >> >> >> >>>
> >> >> >> >>> (I will investigate also in my code to check if the problem
> does
> >> not
> >> >> >> come
> >> >> >> >>> from me ?)
> >> >> >> >>>
> >> >> >> >>> cheers;
> >> >> >> >>> /Pierre
> >> >> >> >>>
> >> >> >> >>>
> >> >> >> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <
> >> >> pierre.derop@gmail.com
> >> >> >> >
> >> >> >> >>> wrote:
> >> >> >> >>>
> >> >> >> >>>> Hi David,
> >> >> >> >>>>
> >> >> >> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
> >> >> >> >>>> regression somewhere in the framework, but my parallel test does not pass
> >> >> >> >>>> anymore.
> >> >> >> >>>>
> >> >> >> >>>> The test first starts with a single-threaded scenario, which
> >> >> passes OK
> >> >> >> >>>>
> >> (org.apache.felix.dependencymanager.benchmark.dependencymanager),
> >> >> >> then when
> >> >> >> >>>> the parallel test starts
> >> >> >> >>>>
> >> >> >>
> >> >>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
> >> >> >> >>>> it suddenly hangs, and when I type "log warn" under the gogo
> >> >> shell, I
> >> >> >> see
> >> >> >> >>>> the following exception:
> >> >> >> >>>>
> >> >> >> >>>> (I'm using java8):
> >> >> >> >>>>
> >> >> >> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
> >> >> >> >>>> ____________________________
> >> >> >> >>>> Welcome to Apache Felix Gogo
> >> >> >> >>>>
> >> >> >> >>>> Benchmarking bundle:
> >> >> >> >>>>
> >> >> >>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> >> .
> >> >> >> >>>>
> >> >> >> >>>> (here, the dependencymanager.parallel test hangs and when I
> type
> >> >> "log
> >> >> >> >>>> warn", I see this:)
> >> >> >> >>>>
> >> >> >> >>>> g! log warn
> >> >> >> >>>> 2015.05.14 13:31:03 ERROR - Bundle: org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> >> >> >> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >> >> >>>> java.util.ConcurrentModificationException
> >> >> >> >>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >> >> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >> >> >>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >> >> >>>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >> >> >>>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >> >> >>>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >> >> >>>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >> >> >>>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >> >> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >> >> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >> >> >>>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >> >> >>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >> >> >>>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >> >> >>>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >> >> >>>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >> >> >>>>
> >> >> >> >>>> (If I configure my threadpool to 1, I have no problems, but
> with
> >> >> >> >>>> threadpool=4, then I have the problem)
> >> >> >> >>>>
> >> >> >> >>>> I will investigate, but Ideally, may be it would be helpful
> if
> >> you
> >> >> >> could
> >> >> >> >>>> also run the test by yourself; so I will commit soon
> something
> >> to
> >> >> >> reproduce
> >> >> >> >>>> the problem in my sandbox.
> >> >> >> >>>>
> >> >> >> >>>> cheers;
> >> >> >> >>>> /Pierre
> >> >> >> >>>>
> >> >> >> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
> >> >> >> >>>> david.bosschaert@gmail.com> wrote:
> >> >> >> >>>>
> >> >> >> >>>>> I've committed this now in
> >> >> >> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
> >> >> >> >>>>>
> >> >> >> >>>>> Curious to see what others are measuring. My tests were
> >> focused on
> >> >> >> >>>>> multiple bundles/threads obtaining the same service, as
> that's
> >> >> were I
> >> >> >> >>>>> saw a bit of contention.
> >> >> >> >>>>>
> >> >> >> >>>>> Cheers,
> >> >> >> >>>>>
> >> >> >> >>>>> David
> >> >> >> >>>>>
> >> >> >> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <
> pierre.derop@gmail.com
> >> >
> >> >> >> wrote:
> >> >> >> >>>>> > Hi David,
> >> >> >> >>>>> >
> >> >> >> >>>>> > I'm looking forward to test your improvements using the
> >> >> >> >>>>> dependencymanager
> >> >> >> >>>>> > benchmark tool ([1]).
> >> >> >> >>>>> >
> >> >> >> >>>>> >
> >> >> >> >>>>> > [1]
> >> >> >> >>>>> >
> >> >> >> >>>>>
> >> >> >>
> >> >>
> >>
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
> >> >> >> >>>>> >
> >> >> >> >>>>> > /Pierre
> >> >> >> >>>>> >
> >> >> >> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> >> >> >> >>>>> > david.bosschaert@gmail.com> wrote:
> >> >> >> >>>>> >
> >> >> >> >>>>> >> I have implemented the performance improvements that I
> was
> >> >> >> thinking of
> >> >> >> >>>>> >> using Java 5 concurrency tools, they can be viewed at
> [1].
> >> >> >> >>>>> >>
> >> >> >> >>>>> >> I wrote a little performance test suite [2] that tests
> >> >> >> multithreaded
> >> >> >> >>>>> >> service registry performance (10 threads) from single /
> >> >> multiple
> >> >> >> >>>>> >> bundles with either singleton services and Prototype
> Service
> >> >> >> Factory
> >> >> >> >>>>> >> services and the results are quite impressive. I'm
> getting
> >> >> >> performance
> >> >> >> >>>>> >> improvements compared to the current trunk from 8 times
> >> better
> >> >> >> than
> >> >> >> >>>>> >> the original (800%) to more than 30 times better (3000%).
> >> >> >> >>>>> >>
> >> >> >> >>>>> >> Carsten has already reviewed the code (thanks Carsten!)
> and
> >> I'm
> >> >> >> >>>>> >> planning to commit it to Felix tomorrow if nobody
> objects.
> >> >> >> >>>>> >>
> >> >> >> >>>>> >> Cheers,
> >> >> >> >>>>> >>
> >> >> >> >>>>> >> David
> >> >> >> >>>>> >>
> >> >> >> >>>>> >> [1]
> >> >> >> >>>>> >>
> >> >> >> >>>>>
> >> >> >>
> >> >>
> >>
> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> >> >> >> >>>>> >> [2]
> >> >> >> >>>>> >>
> >> >> >> >>>>>
> >> >> >>
> >> >>
> >>
> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
> >> >> >> >>>>> >>
> >> >> >> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <
> >> >> heavy@ungoverned.org>
> >> >> >> >>>>> wrote:
> >> >> >> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
> >> >> >> >>>>> >> >>
> >> >> >> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
> >> >> >> heavy@ungoverned.org>
> >> >> >> >>>>> >> wrote:
> >> >> >> >>>>> >> >>>
> >> >> >> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >> >> >> >>>>> >> >>>>
> >> >> >> >>>>> >> >>>> There's a call to interrupt() in
> >> >> Felix#acquireBundleLock(),
> >> >> >> not
> >> >> >> >>>>> sure
> >> >> >> >>>>> >> if
> >> >> >> >>>>> >> >>>> it
> >> >> >> >>>>> >> >>>> can be the culprit though.
> >> >> >> >>>>> >> >>>> Interrupts could also be caused by a bundle being
> >> shutdown
> >> >> >> while
> >> >> >> >>>>> one
> >> >> >> >>>>> >> of
> >> >> >> >>>>> >> >>>> its
> >> >> >> >>>>> >> >>>> thread is waiting for a service, which should is a
> >> valid
> >> >> use
> >> >> >> case
> >> >> >> >>>>> >> imho.
> >> >> >> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being
> >> >> interrupted
> >> >> >> >>>>> would be
> >> >> >> >>>>> >> >>>> good.
> >> >> >> >>>>> >> >>>
> >> >> >> >>>>> >> >>>
> >> >> >> >>>>> >> >>> Yes, threads can be interrupted if they are holding a
> >> >> bundle
> >> >> >> lock
> >> >> >> >>>>> and
> >> >> >> >>>>> >> the
> >> >> >> >>>>> >> >>> global lock holder needs the bundle lock.
> >> >> >> >>>>> >> >>>
> >> >> >> >>>>> >> >>> I admit that I do not recall why we ignore the
> interrupt
> >> >> >> here, but
> >> >> >> >>>>> >> didn't
> >> >> >> >>>>> >> >>> we
> >> >> >> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
> >> >> >> necessary? I
> >> >> >> >>>>> >> >>> thought
> >> >> >> >>>>> >> >>> we just checked for the validity of the bundle
> context
> >> >> before
> >> >> >> >>>>> returning
> >> >> >> >>>>> >> >>> or
> >> >> >> >>>>> >> >>> something. Perhaps we felt there was no reason to be
> >> >> >> interrupted in
> >> >> >> >>>>> >> that
> >> >> >> >>>>> >> >>> case. I really don't know.
> >> >> >> >>>>> >> >>
> >> >> >> >>>>> >> >> I think that the Service Registry could be rewritten
> to
> >> be
> >> >> >> >>>>> completely
> >> >> >> >>>>> >> >> free of synchronized blocks using the Java 5
> concurrency
> >> >> >> libraries,
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> > Well, that just moves the sync blocks to the library,
> but
> >> >> yeah
> >> >> >> sure.
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >> which I think would really be a better approach.
> There is
> >> >> too
> >> >> >> much
> >> >> >> >>>>> >> >> locking going on in the current SR implementation
> IMHO.
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> > I don't really think there is too much, but it is
> >> >> complicated.
> >> >> >> >>>>> >> > Unfortunately, it is complicated to make sure that
> locks
> >> >> aren't
> >> >> >> held
> >> >> >> >>>>> >> while
> >> >> >> >>>>> >> > do service lookups and this is complicated because you
> can
> >> >> run
> >> >> >> into
> >> >> >> >>>>> >> cycles,
> >> >> >> >>>>> >> > etc.
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> > But feel free to try to simplify it.
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >>
> >> >> >> >>>>> >> >> This brings the question: can we move to Java 5 (or
> Java
> >> 6)
> >> >> >> for the
> >> >> >> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK
> 1.4
> >> >> >> compatible
> >> >> >> >>>>> but
> >> >> >> >>>>> >> >> I would be surprised if there is anyone who still
> needs a
> >> >> JDK
> >> >> >> that
> >> >> >> >>>>> >> >> went end-of-life 7 years ago.
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> > At this point, it doesn't really matter to me.
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> > -> richard
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >>
> >> >> >> >>>>> >> >> Best regards,
> >> >> >> >>>>> >> >>
> >> >> >> >>>>> >> >> David
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >> >
> >> >> >> >>>>> >>
> >> >> >> >>>>>
> >> >> >> >>>>
> >> >> >> >>>>
> >> >> >>
> >> >>
> >>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
Hi Pierre,

Good to hear that the problem is now gone.
I guess the performance improvement measured hugely depends on what
you are testing. My test focuses on multiple clients/threads/bundles
accessing the same service (either singleton or PSF) in a very raw
manner (via ctx.getServiceReference()).
Good to hear that you're still seeing perf improvements but I guess
your test exercises a number of other components as well (e.g.
Dependency Manager) possibly using multiple service registrations, so
that could very well explain some of the differences in our results...
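
As a rough illustration of the kind of test described above (several
threads repeatedly looking up the same service), here is a minimal,
self-contained sketch. It does not use the OSGi API or the srperf suite;
LockedRegistry, ConcurrentRegistry and timeLookups are hypothetical names
chosen for this example, and whatever timings it prints are not comparable
to the results quoted in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class RegistryLookupSketch {

    // Hypothetical stand-in for a registry guarded by one coarse lock.
    static final class LockedRegistry {
        private final Map<String, Object> services = new HashMap<>();
        synchronized void register(String name, Object svc) { services.put(name, svc); }
        synchronized Object lookup(String name) { return services.get(name); }
    }

    // Hypothetical stand-in for a registry built on java.util.concurrent.
    static final class ConcurrentRegistry {
        private final Map<String, Object> services = new ConcurrentHashMap<>();
        void register(String name, Object svc) { services.put(name, svc); }
        Object lookup(String name) { return services.get(name); }
    }

    // Runs `threads` workers that each perform `lookups` lookups of "svc"
    // and returns the elapsed wall-clock time in nanoseconds.
    static long timeLookups(int threads, int lookups, Function<String, Object> lookup) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await(); // all workers begin together
                    for (int i = 0; i < lookups; i++) {
                        lookup.apply("svc");
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        LockedRegistry locked = new LockedRegistry();
        ConcurrentRegistry concurrent = new ConcurrentRegistry();
        locked.register("svc", new Object());
        concurrent.register("svc", new Object());

        // 10 threads, mirroring the thread count mentioned in this thread.
        long lockedNanos = timeLookups(10, 1_000_000, locked::lookup);
        long concurrentNanos = timeLookups(10, 1_000_000, concurrent::lookup);
        System.out.println("synchronized HashMap: " + lockedNanos + " ns");
        System.out.println("ConcurrentHashMap:    " + concurrentNanos + " ns");
    }
}
```

A single unwarmed run like this is only indicative; a serious measurement
would repeat the runs and discard the first ones.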

Cheers,

David

On 19 May 2015 at 13:32, Pierre De Rop <pi...@gmail.com> wrote:
> Hi David,
>
> Excellent.
>
> I'm glad to confirm that the issue is resolved, and my DM loader is now
> running seamlessly.
> I'm observing an overall gain of 16% compared to the previous 5.0.0
> (but this has to be taken with care, because I only ran a quick test).
>
> I did not have time, but I guess I could observe a better performance gain
> on a bigger host with more CPUs (I only have four), since synchronization
> cost is usually proportional to the number of available cores and, as I
> understand it, your fix is now based on the java.util.concurrent JDK tools.
>
>
> many thanks
> /Pierre
>
> On Tue, May 19, 2015 at 1:57 PM, David Bosschaert <
> david.bosschaert@gmail.com> wrote:
>
>> Thanks Pierre for submitting a unit test to FELIX-4866 that helped me
>> enormously in identifying the issue.
>>
>> I have fixed the bug in my code (without degrading performance) and at
>> least your concurrency test, my concurrency tests and all the
>> framework unit tests now consistently pass. I would be very interested
>> in hearing whether your bigger test suite also still behaves as
>> expected.
>>
>> Best regards,
>>
>> David
>>
>> On 14 May 2015 at 22:53, Pierre De Rop <pi...@gmail.com> wrote:
>> > The thread dump did not help.
>> > I will investigate (maybe it is a bug somewhere on my side; if this is the
>> > case, I am sorry for all this noise).
>> >
>> > hope to let you know soon.
>> >
>> > By the way, do you know how to run the SCR integration tests with the
>> > framework from the trunk? I know that some SCR integration tests
>> > do load testing, and I would be interested to know whether they
>> > are also OK with the framework from the trunk.
>> >
>> > cheers;
>> > /Pierre
>> >
>> >
>> > On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
>> > david.bosschaert@gmail.com> wrote:
>> >
>> >> Hi Pierre,
>> >>
>> >> It would indeed be useful to find out more about why your test is
>> >> hanging. Maybe analysing a threaddump might give some more
>> >> information?
>> >>
>> >> Cheers,
>> >>
>> >> David
>> >>
>> >> On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com> wrote:
>> >> > Thanks David; I just gave it a try, and indeed the parallel test passed.
>> >> > I observed a gain of around 7/10%. The tool is described in [1].
>> >> >
>> >> > But I only have 4 cores on my laptop, and I will run more tests in my lab
>> >> > at work (next week), where we have some servers with 32 or even 128
>> >> > processors. This will give a better idea of the gain, because the more
>> >> > processors you have, the more costly synchronization is, so I could
>> >> > possibly observe a better performance gain.
>> >> >
>> >> > Now, I'm sorry, but I think that there is still a problem (I don't know
>> >> > where): when using more threads, the parallel test does not complete and
>> >> > stops with a timeout message, indicating that the expected number of
>> >> > components was not created within the timeout delay of 1 minute.
>> >> >
>> >> > So, I just committed a modified version of the tool in the sandbox, which
>> >> > can now take a -Dthreads option in order to configure the number of
>> >> > threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
>> >> > not complete and ends with a timeout:
>> >> >
>> >> > $ java -Dthreads=10 -server -jar bin/felix.jar
>> >> >
>> >> > g! Starting benchmarks (each tested bundle will add/remove 630 components
>> >> > during bundle activation).
>> >> >
>> >> >         [Starting benchmarks with no processing done in components start methods]
>> >> >
>> >> > Benchmarking bundle:
>> >> > org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
>> >> > .................................................Could not start components
>> >> > timely: current start latch=2, stop latch=630
>> >> >
>> >> > My current understanding of this is that some components are still
>> >> > waiting for unsatisfied service dependencies, as if a service tracker
>> >> > had missed a service registration.
>> >> >
>> >> > I ran the same test for two hours with the previous framework version,
>> >> > and did not observe any problems.
>> >> >
>> >> > I wonder if someone else has another tool for performing another
>> >> > kind of load test, just to see if some problems are also observed.
>> >> >
>> >> > -> On my side, I will do the following: in the past, the benchmark tool
>> >> > supported not only dependencymanager, but also Felix SCR and iPojo. So I
>> >> > will reintroduce Felix SCR in the benchmark and check whether I also
>> >> > observe the problem (with -Dthreads=10).
>> >> >
>> >> > I will let you know.
>> >> >
>> >> > cheers;
>> >> > /Pierre
>> >> >
>> >> > [1] http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
>> >> >
>> >> > On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
>> >> > david.bosschaert@gmail.com> wrote:
>> >> >
>> >> >> I've fixed this now in
>> >> >> svn.apache.org/viewvc?view=revision&revision=1679367
>> >> >>
>> >> >> Pierre, your loadtest now runs to completion - thanks for reporting
>> >> >> this issue! I can see that the results for the parallel tests are a
>> >> >> little bit different than before, but I'm not sure how to read them, so
>> >> >> I'll leave the interpretation of that to you :)
>> >> >>
>> >> >> Cheers,
>> >> >>
>> >> >> David
>> >> >>
>> >> >> On 14 May 2015 at 14:38, David Bosschaert <david.bosschaert@gmail.com> wrote:
>> >> >> > I think I know what this is. I had some additional changes in exactly
>> >> >> > this area that I simply forgot to apply this morning. I should have it
>> >> >> > fixed sometime today.
>> >> >> >
>> >> >> > Cheers,
>> >> >> >
>> >> >> > David
>> >> >> >
>> >> >> > On 14 May 2015 at 14:03, David Bosschaert <david.bosschaert@gmail.com> wrote:
>> >> >> >> Hi Pierre,
>> >> >> >>
>> >> >> >> I'll take a look today.
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >>
>> >> >> >> David
>> >> >> >>
>> >> >> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com> wrote:
>> >> >> >>> I just committed the benchmark tool in
>> >> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
>> >> >> >>> take a look.
>> >> >> >>>
>> >> >> >>> To run the scenario:
>> >> >> >>>
>> >> >> >>> - install jdk8:
>> >> >> >>>
>> >> >> >>> [nxuser@nx0012 pderop]$ java -version
>> >> >> >>> java version "1.8.0_40"
>> >> >> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>> >> >> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>> >> >> >>>
>> >> >> >>> - checkout the loadtest from
>> >> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>> >> >> >>>
>> >> >> >>> - go to the "loadtest" directory and start the test, like this:
>> >> >> >>>
>> >> >> >>> $ java -server -jar bin/felix.jar
>> >> >> >>> Welcome to Apache Felix Gogo
>> >> >> >>>
>> >> >> >>> g! Starting benchmarks (each tested bundle will add/remove 630 components
>> >> >> >>> during bundle activation).
>> >> >> >>>
>> >> >> >>>         [Starting benchmarks with no processing done in components start methods]
>> >> >> >>>
>> >> >> >>> Benchmarking bundle:
>> >> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>> >> >> >>> ..................................................
>> >> >> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722 | 919,838,078]
>> >> >> >>>
>> >> >> >>> Benchmarking bundle:
>> >> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> Here, the first
>> >> >> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
>> >> >> >>> (single-threaded) passes OK. But the next one hangs
>> >> >> >>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
>> >> >> >>> It uses a fork join pool with size=4.
>> >> >> >>>
>> >> >> >>> and when typing "log warn", we see:
>> >> >> >>>
>> >> >> >>> "log warn"
>> >> >> >>>
>> >> >> >>> 2015.05.14 13:56:10 ERROR - Bundle:
>> >> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >> >> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >> >> >>> java.util.ConcurrentModificationException
>> >> >> >>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >> >> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >> >> >>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >> >> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >> >> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >> >> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >> >> >>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >> >> >>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >> >> >>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >> >> >>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >> >> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >> >> >>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >> >> >>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >> >> >>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >> >> >>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >> >> >>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >> >> >>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >> >> >>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >> >> >>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >> >> >>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> (I will also investigate my code to check whether the problem comes
>> >> >> >>> from me.)
>> >> >> >>>
>> >> >> >>> cheers;
>> >> >> >>> /Pierre
>> >> >> >>>
>> >> >> >>>
>> >> >> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pierre.derop@gmail.com>
>> >> >> >>> wrote:
>> >> >> >>>
>> >> >> >>>> Hi David,
>> >> >> >>>>
>> >> >> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
>> >> >> >>>> regression somewhere in the framework, but my parallel test does not
>> >> >> >>>> pass anymore.
>> >> >> >>>>
>> >> >> >>>> The test first starts with a single-threaded scenario, which passes OK
>> >> >> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
>> >> >> >>>> the parallel test starts
>> >> >> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>> >> >> >>>> it suddenly hangs, and when I type "log warn" in the gogo shell, I see
>> >> >> >>>> the following exception:
>> >> >> >>>>
>> >> >> >>>> (I'm using java8):
>> >> >> >>>>
>> >> >> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>> >> >> >>>> ____________________________
>> >> >> >>>> Welcome to Apache Felix Gogo
>> >> >> >>>>
>> >> >> >>>> Benchmarking bundle:
>> >> >> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >> >> >>>>
>> >> >> >>>> (here, the dependencymanager.parallel test hangs and when I type "log
>> >> >> >>>> warn", I see this:)
>> >> >> >>>>
>> >> >> >>>> g! log warn
>> >> >> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
>> >> >> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >> >> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >> >> >>>> java.util.ConcurrentModificationException
>> >> >> >>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >> >> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >> >> >>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >> >> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >> >> >>>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >> >> >>>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >> >> >>>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >> >> >>>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >> >> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >> >> >>>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >> >> >>>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >> >> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >> >> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >> >> >>>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >> >> >>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >> >> >>>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >> >> >>>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >> >> >>>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >> >> >>>>
>> >> >> >>>> (If I configure my thread pool to 1, I have no problems, but with
>> >> >> >>>> threadpool=4, I have the problem.)
>> >> >> >>>>
>> >> >> >>>> I will investigate, but ideally it would be helpful if you could
>> >> >> >>>> also run the test yourself, so I will soon commit something to reproduce
>> >> >> >>>> the problem in my sandbox.
>> >> >> >>>>
>> >> >> >>>> cheers;
>> >> >> >>>> /Pierre
>> >> >> >>>>
>> >> >> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>> >> >> >>>> david.bosschaert@gmail.com> wrote:
>> >> >> >>>>
>> >> >> >>>>> I've committed this now in
>> >> >> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>> >> >> >>>>>
>> >> >> >>>>> Curious to see what others are measuring. My tests were focused on
>> >> >> >>>>> multiple bundles/threads obtaining the same service, as that's where I
>> >> >> >>>>> saw a bit of contention.
>> >> >> >>>>>
>> >> >> >>>>> Cheers,
>> >> >> >>>>>
>> >> >> >>>>> David
>> >> >> >>>>>
>> >> >> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pierre.derop@gmail.com
>> >
>> >> >> wrote:
>> >> >> >>>>> > Hi David,
>> >> >> >>>>> >
>> >> >> >>>>> > I'm looking forward to test your improvements using the
>> >> >> >>>>> dependencymanager
>> >> >> >>>>> > benchmark tool ([1]).
>> >> >> >>>>> >
>> >> >> >>>>> >
>> >> >> >>>>> > [1]
>> >> >> >>>>> >
>> >> >> >>>>>
>> >> >>
>> >>
>> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>> >> >> >>>>> >
>> >> >> >>>>> > /Pierre
>> >> >> >>>>> >
>> >> >> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>> >> >> >>>>> > david.bosschaert@gmail.com> wrote:
>> >> >> >>>>> >
>> >> >> >>>>> >> I have implemented the performance improvements that I was
>> >> >> thinking of
>> >> >> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
>> >> >> >>>>> >>
>> >> >> >>>>> >> I wrote a little performance test suite [2] that tests
>> >> >> multithreaded
>> >> >> >>>>> >> service registry performance (10 threads) from single /
>> >> multiple
>> >> >> >>>>> >> bundles with either singleton services and Prototype Service
>> >> >> Factory
>> >> >> >>>>> >> services and the results are quite impressive. I'm getting
>> >> >> performance
>> >> >> >>>>> >> improvements compared to the current trunk from 8 times
>> better
>> >> >> than
>> >> >> >>>>> >> the original (800%) to more than 30 times better (3000%).
>> >> >> >>>>> >>
>> >> >> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and
>> I'm
>> >> >> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
>> >> >> >>>>> >>
>> >> >> >>>>> >> Cheers,
>> >> >> >>>>> >>
>> >> >> >>>>> >> David
>> >> >> >>>>> >>
>> >> >> >>>>> >> [1]
>> >> >> >>>>> >>
>> >> >> >>>>>
>> >> >>
>> >>
>> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> >> >> >>>>> >> [2]
>> >> >> >>>>> >>
>> >> >> >>>>>
>> >> >>
>> >>
>> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>> >> >> >>>>> >>
>> >> >> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <
>> >> heavy@ungoverned.org>
>> >> >> >>>>> wrote:
>> >> >> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >> >> >>>>> >> >>
>> >> >> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
>> >> >> heavy@ungoverned.org>
>> >> >> >>>>> >> wrote:
>> >> >> >>>>> >> >>>
>> >> >> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >> >> >>>>> >> >>>>
>> >> >> >>>>> >> >>>> There's a call to interrupt() in
>> >> Felix#acquireBundleLock(),
>> >> >> not
>> >> >> >>>>> sure
>> >> >> >>>>> >> if
>> >> >> >>>>> >> >>>> it
>> >> >> >>>>> >> >>>> can be the culprit though.
>> >> >> >>>>> >> >>>> Interrupts could also be caused by a bundle being
>> shutdown
>> >> >> while
>> >> >> >>>>> one
>> >> >> >>>>> >> of
>> >> >> >>>>> >> >>>> its
>> >> >> >>>>> >> >>>> thread is waiting for a service, which should is a
>> valid
>> >> use
>> >> >> case
>> >> >> >>>>> >> imho.
>> >> >> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being
>> >> interrupted
>> >> >> >>>>> would be
>> >> >> >>>>> >> >>>> good.
>> >> >> >>>>> >> >>>
>> >> >> >>>>> >> >>>
>> >> >> >>>>> >> >>> Yes, threads can be interrupted if they are holding a
>> >> bundle
>> >> >> lock
>> >> >> >>>>> and
>> >> >> >>>>> >> the
>> >> >> >>>>> >> >>> global lock holder needs the bundle lock.
>> >> >> >>>>> >> >>>
>> >> >> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
>> >> >> here, but
>> >> >> >>>>> >> didn't
>> >> >> >>>>> >> >>> we
>> >> >> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
>> >> >> necessary? I
>> >> >> >>>>> >> >>> thought
>> >> >> >>>>> >> >>> we just checked for the validity of the bundle context
>> >> before
>> >> >> >>>>> returning
>> >> >> >>>>> >> >>> or
>> >> >> >>>>> >> >>> something. Perhaps we felt there was no reason to be
>> >> >> interrupted in
>> >> >> >>>>> >> that
>> >> >> >>>>> >> >>> case. I really don't know.
>> >> >> >>>>> >> >>
>> >> >> >>>>> >> >> I think that the Service Registry could be rewritten to
>> be
>> >> >> >>>>> completely
>> >> >> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
>> >> >> libraries,
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >
>> >> >> >>>>> >> > Well, that just moves the sync blocks to the library, but
>> >> yeah
>> >> >> sure.
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >> which I think would really be a better approach. There is
>> >> too
>> >> >> much
>> >> >> >>>>> >> >> locking going on in the current SR implementation IMHO.
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >
>> >> >> >>>>> >> > I don't really think there is too much, but it is
>> >> complicated.
>> >> >> >>>>> >> > Unfortunately, it is complicated to make sure that locks
>> >> aren't
>> >> >> held
>> >> >> >>>>> >> while
>> >> >> >>>>> >> > do service lookups and this is complicated because you can
>> >> run
>> >> >> into
>> >> >> >>>>> >> cycles,
>> >> >> >>>>> >> > etc.
>> >> >> >>>>> >> >
>> >> >> >>>>> >> > But feel free to try to simplify it.
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >>
>> >> >> >>>>> >> >> This brings the question: can we move to Java 5 (or Java
>> 6)
>> >> >> for the
>> >> >> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
>> >> >> compatible
>> >> >> >>>>> but
>> >> >> >>>>> >> >> I would be surprised if there is anyone who still needs a
>> >> JDK
>> >> >> that
>> >> >> >>>>> >> >> went end-of-life 7 years ago.
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >
>> >> >> >>>>> >> > At this point, it doesn't really matter to me.
>> >> >> >>>>> >> >
>> >> >> >>>>> >> > -> richard
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >>
>> >> >> >>>>> >> >> Best regards,
>> >> >> >>>>> >> >>
>> >> >> >>>>> >> >> David
>> >> >> >>>>> >> >
>> >> >> >>>>> >> >
>> >> >> >>>>> >>
>> >> >> >>>>>
>> >> >> >>>>
>> >> >> >>>>
>> >> >>
>> >>
>>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
Hi David,

Excellent.

I'm glad to confirm that the issue is resolved, and my DM loader is now
running seamlessly.
I'm observing an overall gain of 16% compared to the previous 5.0.0
(but this has to be taken with care, because I only ran a quick test).

I did not have time, but I guess I could observe a better performance gain
on a bigger host with more CPUs (I only have four), since synchronization
cost is usually proportional to the number of available cores and, as I
understand it, your fix is now based on the java.util.concurrent JDK tools.
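
For context on the move to java.util.concurrent: the
ConcurrentModificationException reported earlier in this thread came from
copying the key set of a HashMap while the map was being modified. The
sketch below reproduces that failure mode deterministically in a single
thread and shows that a ConcurrentHashMap does not fail fast in the same
situation. It is only an illustration, not the Felix code; IterationSketch
and copyWhileModifying are hypothetical names.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IterationSketch {

    // Fills the map, then copies its key set while adding a new key after
    // the first element has been copied. Returns true if the iteration
    // failed fast with a ConcurrentModificationException.
    static boolean copyWhileModifying(Map<String, Integer> map) {
        for (int i = 0; i < 100; i++) {
            map.put("k" + i, i);
        }
        List<String> copy = new ArrayList<>();
        try {
            for (String key : map.keySet()) {
                copy.add(key);
                if (copy.size() == 1) {
                    // Structural modification in the middle of the iteration,
                    // standing in for a registration done by another thread.
                    map.put("new", -1);
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "HashMap fails fast: true"
        System.out.println("HashMap fails fast: "
                + copyWhileModifying(new HashMap<>()));
        // prints "ConcurrentHashMap fails fast: false"
        System.out.println("ConcurrentHashMap fails fast: "
                + copyWhileModifying(new ConcurrentHashMap<>()));
    }
}
```

The ConcurrentHashMap iterator is weakly consistent: it never throws
ConcurrentModificationException, but it may or may not reflect entries
added after the iteration started.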


many thanks
/Pierre

On Tue, May 19, 2015 at 1:57 PM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> Thanks Pierre for submitting a unit test to FELIX-4866 that helped me
> enormously in identifying the issue.
>
> I have fixed the bug in my code (without degrading performance) and at
> least your concurrency test, my concurrency tests and all the
> framework unit tests now consistently pass. I would be very interested
> in hearing whether your bigger test suite also still behaves as
> expected.
>
> Best regards,
>
> David
>
> On 14 May 2015 at 22:53, Pierre De Rop <pi...@gmail.com> wrote:
> > the threadump did not help.
> > I will  investigate (may be a bug somewhere in my part; if this is the
> > case, I would be sorry to make all this noise).
> >
> > hope to let you know soon.
> >
> > by the way, do you know how to run the SCR integration tests with the
> > framework from the trunk ? I know that there are some SCR integration
> > tests that are doing some load tests, and I would be interested to know
> > if they are also ok with the framework from the trunk ?
> >
> > cheers;
> > /Pierre
> >
> >
> > On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
> > david.bosschaert@gmail.com> wrote:
> >
> >> Hi Pierre,
> >>
> >> It would indeed be useful to find out more about why your test is
> >> hanging. Maybe analysing a threaddump might give some more
> >> information?
> >>
> >> Cheers,
> >>
> >> David
> >>
> >> On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com> wrote:
> >> > Thanks David; I just gave a try, and indeed the parallel test passed.
> I
> >> > observed a gain of around 7/10%. The tool is described in [1].
> >> >
> >> > But I only have 4 cores on my laptop and I will make more tests in my
> lab
> >> > at work (next week) where we have some servers having 32 or even 128
> >> > processors. This will give a better idea of the gain because the more
> >> > processor you have, the more synchronization is costly, so I could
> >> possibly
> >> > observe a better performance gain.
> >> >
> >> > Now, I'm sorry but I think that there is still a problem (I don't know
> >> > where): when using more threads, the parallel test does not complete
> and
> >> > stops with a timeout message, indicating that the number of expected
> >> > components are not created after a timeout delay of 1 minute.
> >> >
> >> > So, I just committed a modified version of the tool in the sandbox
> which
> >> > can now take a -Dthreads option in order to configure the number of
> >> > threads. With -Dthreads=4, its OK. But with -Dthreads=10, then test
> does
> >> > not complete and ends with a timeout:
> >> >
> >> > $ java -Dthreads=10 -server -jar bin/felix.jar
> >> >
> >> > g! Starting benchmarks (each tested bundle will add/remove 630
> components
> >> > during bundle activation).
> >> >
> >> >         [Starting benchmarks with no processing done in components
> start
> >> > methods]
> >> >
> >> > Benchmarking bundle:
> >> >
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> > .................................................Could not start
> >> components
> >> > timely: current start latch=2, stop latch=630
> >> >
> >> > My current understanding of this is that some components are still
> >> awaiting
> >> > for unsatisfied service dependencies, just like if a service tracker
> >> would
> >> > have missed a service registration.
> >> >
> >> > I ran the same test during two hours with the previous framework
> version,
> >> > and did not observe any problems.
> >> >
> >> > I wonder if someone else do have another tool in order to perform
> another
> >> > kind of load test, just to see if some problems are also observed.
> >> >
> >> > -> from  my side, I will do the following: in the past, the benchmark
> >> tool
> >> > supported not only dependencymanager, but also Felix SCR and iPojo.
> So, I
> >> > will reintroduce Felix SCR in the benchmark and will check if I also
> >> > observe the problem (with -Dthreads=10).
> >> >
> >> > I will let you know.
> >> >
> >> > cheers;
> >> > /Pierre
> >> >
> >> > [1]
> >> >
> >>
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
> >> >
> >> > On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> >> > david.bosschaert@gmail.com> wrote:
> >> >
> >> >> I've fixed this now in
> >> >> svn.apache.org/viewvc?view=revision&revision=1679367
> >> >>
> >> >> Pierre, your loadtest now runs to completion - thanks for reporting
> >> >> this issue! I can see that the results for the parallel tests are a
> >> >> little bit different than before, but I'm not sure how to read them
> so
> >> >> I'll leave the interpretation of that to you :)
> >> >>
> >> >> Cheers,
> >> >>
> >> >> David
> >> >>
> >> >> On 14 May 2015 at 14:38, David Bosschaert <
> david.bosschaert@gmail.com>
> >> >> wrote:
> >> >> > I think I know what this is. I had some additional changes exactly
> in
> >> >> > this area that I simply forgot to apply this morning. I should
> have it
> >> >> > fixed sometime today.
> >> >> >
> >> >> > Cheers,
> >> >> >
> >> >> > David
> >> >> >
> >> >> > On 14 May 2015 at 14:03, David Bosschaert <
> david.bosschaert@gmail.com
> >> >
> >> >> wrote:
> >> >> >> Hi Pierre,
> >> >> >>
> >> >> >> I'll take a look today.
> >> >> >>
> >> >> >> Cheers,
> >> >> >>
> >> >> >> David
> >> >> >>
> >> >> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com>
> >> wrote:
> >> >> >>> I just committed the benchmark tool in
> >> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if
> you
> >> >> can
> >> >> >>> take a look.
> >> >> >>>
> >> >> >>> To run the scenario:
> >> >> >>>
> >> >> >>> - install jdk8:
> >> >> >>>
> >> >> >>> [nxuser@nx0012 pderop]$ java -version
> >> >> >>> java version "1.8.0_40"
> >> >> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> >> >> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
> >> >> >>>
> >> >> >>> - checkout the loadtest from
> >> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
> >> >> >>>
> >> >> >>> - go the the "loadtest" directory and start the test, just like
> >> this:
> >> >> >>>
> >> >> >>> $ java -server -jar bin/felix.jar
> >> >> >>> Welcome to Apache Felix Gogo
> >> >> >>>
> >> >> >>> g! Starting benchmarks (each tested bundle will add/remove 630
> >> >> components
> >> >> >>> during bundle activation).
> >> >> >>>
> >> >> >>>         [Starting benchmarks with no processing done in
> components
> >> >> start
> >> >> >>> methods]
> >> >> >>>
> >> >> >>> Benchmarking bundle:
> >> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
> >> >> >>> ..................................................
> >> >> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 |
> >> >> 319,631,722
> >> >> >>> | 919,838,078]
> >> >> >>>
> >> >> >>> Benchmarking bundle:
> >> >> >>>
> >> >>
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> .
> >> >> >>>
> >> >> >>>
> >> >> >>> Here, the first
> >> >> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager"
> >> test
> >> >> >>> (single-threaded) passes OK. But the next one hangs
> >> >> >>>
> >> >>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
> >> >> >>> it uses a fork join pool with size=4.
> >> >> >>>
> >> >> >>> and when typing "log warn", we see:
> >> >> >>>
> >> >> >>> "log warn"
> >> >> >>>
> >> >> >>> 2015.05.14 13:56:10 ERROR - Bundle:
> >> >> >>>
> >> >>
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> -
> >> >> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >> >>> java.util.ConcurrentModificationException
> >> >> >>>         at
> >> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >> >>>         at
> >> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >> >>>         at
> >> >> >>>
> >> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >> >>>         at
> >> >> >>>
> >> >>
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >> >>>         at
> >> >> >>>
> >> >>
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >> >>>         at
> >> >> >>>
> >> >>
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >> >>>         at
> >> >> >>>
> >> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >> >>>         at
> >> >> >>>
> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >> >>>         at
> >> >> >>>
> >> >>
> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >> >>>         at
> >> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >> >>>         at
> >> >> >>>
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >> >>>         at
> >> >> >>>
> >> >>
> >>
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >> >>>
> >> >> >>>
> >> >> >>> (I will investigate also in my code to check if the problem does
> not
> >> >> come
> >> >> >>> from me ?)
> >> >> >>>
> >> >> >>> cheers;
> >> >> >>> /Pierre
> >> >> >>>
> >> >> >>>
> >> >> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <
> >> pierre.derop@gmail.com
> >> >> >
> >> >> >>> wrote:
> >> >> >>>
> >> >> >>>> Hi David,
> >> >> >>>>
> >> >> >>>> I don't know if it's me (a bug in my benchmark tool) or if if
> there
> >> >> is a
> >> >> >>>> regression somewhere in the framework, by my parallel test does
> not
> >> >> pass
> >> >> >>>> anymore.
> >> >> >>>>
> >> >> >>>> The test first starts with a single-threaded scenario, which
> >> passes OK
> >> >> >>>>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager),
> >> >> then when
> >> >> >>>> the parallel test starts
> >> >> >>>>
> >> >>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
> >> >> >>>> it suddenly hangs, and when I type "log warn" under the gogo
> >> shell, I
> >> >> see
> >> >> >>>> the following exception:
> >> >> >>>>
> >> >> >>>> (I'm using java8):
> >> >> >>>>
> >> >> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
> >> >> >>>> ____________________________
> >> >> >>>> Welcome to Apache Felix Gogo
> >> >> >>>>
> >> >> >>>> Benchmarking bundle:
> >> >> >>>>
> >> >>
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> .
> >> >> >>>>
> >> >> >>>> (here, the dependencymanager.parallel test hangs and when I type
> >> "log
> >> >> >>>> warn", I see this:)
> >> >> >>>>
> >> >> >>>> g! log warn
> >> >> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
> >> >> >>>>
> >> >>
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> >> -
> >> >> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >> >>>> java.util.ConcurrentModificationException
> >> >> >>>>         at
> >> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >> >>>>         at
> >> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >> >>>>         at
> >> >> >>>>
> >> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >> >>>>         at
> >> >> >>>>
> >> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >> >>>>         at
> >> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >> >>>>         at
> >> >> >>>>
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >> >>>>         at
> >> >> >>>>
> >> >>
> >>
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >> >>>>
> >> >> >>>> (If I configure my threadpool to 1, I have no problems, but with
> >> >> >>>> threadpool=4, then I have the problem)
> >> >> >>>>
> >> >> >>>> I will investigate, but Ideally, may be it would be helpful if
> you
> >> >> could
> >> >> >>>> also run the test by yourself; so I will commit soon something
> to
> >> >> reproduce
> >> >> >>>> the problem in my sandbox.
> >> >> >>>>
> >> >> >>>> cheers;
> >> >> >>>> /Pierre
> >> >> >>>>
> >> >> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
> >> >> >>>> david.bosschaert@gmail.com> wrote:
> >> >> >>>>
> >> >> >>>>> I've committed this now in
> >> >> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
> >> >> >>>>>
> >> >> >>>>> Curious to see what others are measuring. My tests were
> focused on
> >> >> >>>>> multiple bundles/threads obtaining the same service, as that's
> >> were I
> >> >> >>>>> saw a bit of contention.
> >> >> >>>>>
> >> >> >>>>> Cheers,
> >> >> >>>>>
> >> >> >>>>> David
> >> >> >>>>>
> >> >> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pierre.derop@gmail.com
> >
> >> >> wrote:
> >> >> >>>>> > Hi David,
> >> >> >>>>> >
> >> >> >>>>> > I'm looking forward to test your improvements using the
> >> >> >>>>> dependencymanager
> >> >> >>>>> > benchmark tool ([1]).
> >> >> >>>>> >
> >> >> >>>>> >
> >> >> >>>>> > [1]
> >> >> >>>>> >
> >> >> >>>>>
> >> >>
> >>
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
> >> >> >>>>> >
> >> >> >>>>> > /Pierre
> >> >> >>>>> >
> >> >> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> >> >> >>>>> > david.bosschaert@gmail.com> wrote:
> >> >> >>>>> >
> >> >> >>>>> >> I have implemented the performance improvements that I was
> >> >> thinking of
> >> >> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
> >> >> >>>>> >>
> >> >> >>>>> >> I wrote a little performance test suite [2] that tests
> >> >> multithreaded
> >> >> >>>>> >> service registry performance (10 threads) from single /
> >> multiple
> >> >> >>>>> >> bundles with either singleton services and Prototype Service
> >> >> Factory
> >> >> >>>>> >> services and the results are quite impressive. I'm getting
> >> >> performance
> >> >> >>>>> >> improvements compared to the current trunk from 8 times
> better
> >> >> than
> >> >> >>>>> >> the original (800%) to more than 30 times better (3000%).
> >> >> >>>>> >>
> >> >> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and
> I'm
> >> >> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
> >> >> >>>>> >>
> >> >> >>>>> >> Cheers,
> >> >> >>>>> >>
> >> >> >>>>> >> David
> >> >> >>>>> >>
> >> >> >>>>> >> [1]
> >> >> >>>>> >>
> >> >> >>>>>
> >> >>
> >>
> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> >> >> >>>>> >> [2]
> >> >> >>>>> >>
> >> >> >>>>>
> >> >>
> >>
> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
> >> >> >>>>> >>
> >> >> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <
> >> heavy@ungoverned.org>
> >> >> >>>>> wrote:
> >> >> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
> >> >> >>>>> >> >>
> >> >> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
> >> >> heavy@ungoverned.org>
> >> >> >>>>> >> wrote:
> >> >> >>>>> >> >>>
> >> >> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >> >> >>>>> >> >>>>
> >> >> >>>>> >> >>>> There's a call to interrupt() in
> >> Felix#acquireBundleLock(),
> >> >> not
> >> >> >>>>> sure
> >> >> >>>>> >> if
> >> >> >>>>> >> >>>> it
> >> >> >>>>> >> >>>> can be the culprit though.
> >> >> >>>>> >> >>>> Interrupts could also be caused by a bundle being
> shutdown
> >> >> while
> >> >> >>>>> one
> >> >> >>>>> >> of
> >> >> >>>>> >> >>>> its
> >> >> >>>>> >> >>>> thread is waiting for a service, which should is a
> valid
> >> use
> >> >> case
> >> >> >>>>> >> imho.
> >> >> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being
> >> interrupted
> >> >> >>>>> would be
> >> >> >>>>> >> >>>> good.
> >> >> >>>>> >> >>>
> >> >> >>>>> >> >>>
> >> >> >>>>> >> >>> Yes, threads can be interrupted if they are holding a
> >> bundle
> >> >> lock
> >> >> >>>>> and
> >> >> >>>>> >> the
> >> >> >>>>> >> >>> global lock holder needs the bundle lock.
> >> >> >>>>> >> >>>
> >> >> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
> >> >> here, but
> >> >> >>>>> >> didn't
> >> >> >>>>> >> >>> we
> >> >> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
> >> >> necessary? I
> >> >> >>>>> >> >>> thought
> >> >> >>>>> >> >>> we just checked for the validity of the bundle context
> >> before
> >> >> >>>>> returning
> >> >> >>>>> >> >>> or
> >> >> >>>>> >> >>> something. Perhaps we felt there was no reason to be
> >> >> interrupted in
> >> >> >>>>> >> that
> >> >> >>>>> >> >>> case. I really don't know.
> >> >> >>>>> >> >>
> >> >> >>>>> >> >> I think that the Service Registry could be rewritten to
> be
> >> >> >>>>> completely
> >> >> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
> >> >> libraries,
> >> >> >>>>> >> >
> >> >> >>>>> >> >
> >> >> >>>>> >> > Well, that just moves the sync blocks to the library, but
> >> yeah
> >> >> sure.
> >> >> >>>>> >> >
> >> >> >>>>> >> >> which I think would really be a better approach. There is
> >> too
> >> >> much
> >> >> >>>>> >> >> locking going on in the current SR implementation IMHO.
> >> >> >>>>> >> >
> >> >> >>>>> >> >
> >> >> >>>>> >> > I don't really think there is too much, but it is
> >> complicated.
> >> >> >>>>> >> > Unfortunately, it is complicated to make sure that locks
> >> aren't
> >> >> held
> >> >> >>>>> >> while
> >> >> >>>>> >> > do service lookups and this is complicated because you can
> >> run
> >> >> into
> >> >> >>>>> >> cycles,
> >> >> >>>>> >> > etc.
> >> >> >>>>> >> >
> >> >> >>>>> >> > But feel free to try to simplify it.
> >> >> >>>>> >> >
> >> >> >>>>> >> >>
> >> >> >>>>> >> >> This brings the question: can we move to Java 5 (or Java
> 6)
> >> >> for the
> >> >> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
> >> >> compatible
> >> >> >>>>> but
> >> >> >>>>> >> >> I would be surprised if there is anyone who still needs a
> >> JDK
> >> >> that
> >> >> >>>>> >> >> went end-of-life 7 years ago.
> >> >> >>>>> >> >
> >> >> >>>>> >> >
> >> >> >>>>> >> > At this point, it doesn't really matter to me.
> >> >> >>>>> >> >
> >> >> >>>>> >> > -> richard
> >> >> >>>>> >> >
> >> >> >>>>> >> >>
> >> >> >>>>> >> >> Best regards,
> >> >> >>>>> >> >>
> >> >> >>>>> >> >> David
> >> >> >>>>> >> >
> >> >> >>>>> >> >
> >> >> >>>>> >>
> >> >> >>>>>
> >> >> >>>>
> >> >> >>>>
> >> >>
> >>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
Thanks Pierre for submitting a unit test to FELIX-4866 that helped me
enormously in identifying the issue.

I have fixed the bug in my code (without degrading performance) and at
least your concurrency test, my concurrency tests and all the
framework unit tests now consistently pass. I would be very interested
in hearing whether your bigger test suite also still behaves as
expected.
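
For anyone who wants to write a similar check, the general shape of a
latch-based concurrency test can be sketched as follows. This is neither
the FELIX-4866 unit test nor the benchmark tool: MissedEventCheck,
ToyRegistry and run are hypothetical names, and the toy registry simply
notifies its listeners on every registration. The idea is that several
threads register concurrently while a listener counts a latch down, so a
missed notification shows up as a non-zero latch count after the timeout.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical test harness; not the Felix framework or the DM benchmark.
final class MissedEventCheck {

    // Toy registry that notifies every listener of each registration.
    static final class ToyRegistry {
        private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

        void addListener(Consumer<String> listener) { listeners.add(listener); }

        void register(String name) {
            for (Consumer<String> listener : listeners) {
                listener.accept(name);
            }
        }
    }

    // Registers `registrations` names from a pool of `threads` workers and
    // returns how many registrations were still unseen when the timeout
    // expired (0 means the listener saw all of them).
    static long run(int threads, int registrations, long timeoutMillis) {
        ToyRegistry registry = new ToyRegistry();
        CountDownLatch seen = new CountDownLatch(registrations);
        registry.addListener(name -> seen.countDown());

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int i = 0; i < registrations; i++) {
                String name = "component-" + i;
                pool.execute(() -> registry.register(name));
            }
            // Wait for all notifications, but give up after the timeout.
            seen.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return seen.getCount();
    }

    public static void main(String[] args) {
        long missing = run(10, 630, 60_000);
        System.out.println(missing == 0
                ? "all registrations seen"
                : "timed out, latch=" + missing);
    }
}
```

The thread count, the 630 registrations and the one-minute timeout in main
are taken from the numbers mentioned earlier in this thread; any other
values would do.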

Best regards,

David

On 14 May 2015 at 22:53, Pierre De Rop <pi...@gmail.com> wrote:
> the threadump did not help.
> I will  investigate (may be a bug somewhere in my part; if this is the
> case, I would be sorry to make all this noise).
>
> hope to let you know soon.
>
> by the way, do you know how to run the SCR integration tests with the
> framework from the trunk ? I know that there are some SCR integration tests
> that are doing some load tests, and I would be interested to know if they
> are also ok with the framework from the trunk ?
>
> cheers;
> /Pierre
>
>
> On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
> david.bosschaert@gmail.com> wrote:
>
>> Hi Pierre,
>>
>> It would indeed be useful to find out more about why your test is
>> hanging. Maybe analysing a threaddump might give some more
>> information?
>>
>> Cheers,
>>
>> David
>>
>> On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com> wrote:
>> > Thanks David; I just gave a try, and indeed the parallel test passed. I
>> > observed a gain of around 7/10%. The tool is described in [1].
>> >
>> > But I only have 4 cores on my laptop and I will make more tests in my lab
>> > at work (next week) where we have some servers having 32 or even 128
>> > processors. This will give a better idea of the gain because the more
>> > processor you have, the more synchronization is costly, so I could
>> > possibly observe a better performance gain.
>> >
>> > Now, I'm sorry but I think that there is still a problem (I don't know
>> > where): when using more threads, the parallel test does not complete and
>> > stops with a timeout message, indicating that the number of expected
>> > components are not created after a timeout delay of 1 minute.
>> >
>> > So, I just committed a modified version of the tool in the sandbox which
>> > can now take a -Dthreads option in order to configure the number of
>> > threads. With -Dthreads=4, its OK. But with -Dthreads=10, then test does
>> > not complete and ends with a timeout:
>> >
>> > $ java -Dthreads=10 -server -jar bin/felix.jar
>> >
>> > g! Starting benchmarks (each tested bundle will add/remove 630 components
>> > during bundle activation).
>> >
>> >         [Starting benchmarks with no processing done in components start
>> > methods]
>> >
>> > Benchmarking bundle:
>> > org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
>> > .................................................Could not start components
>> > timely: current start latch=2, stop latch=630
>> >
>> > My current understanding of this is that some components are still
>> > awaiting for unsatisfied service dependencies, just like if a service
>> > tracker would have missed a service registration.
>> >
>> > I ran the same test during two hours with the previous framework version,
>> > and did not observe any problems.
>> >
>> > I wonder if someone else do have another tool in order to perform another
>> > kind of load test, just to see if some problems are also observed.
>> >
>> > -> from my side, I will do the following: in the past, the benchmark
>> > tool supported not only dependencymanager, but also Felix SCR and iPojo.
>> > So, I will reintroduce Felix SCR in the benchmark and will check if I
>> > also observe the problem (with -Dthreads=10).
>> >
>> > I will let you know.
>> >
>> > cheers;
>> > /Pierre
>> >
>> > [1]
>> >
>> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
>> >
>> > On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
>> > david.bosschaert@gmail.com> wrote:
>> >
>> >> I've fixed this now in
>> >> svn.apache.org/viewvc?view=revision&revision=1679367
>> >>
>> >> Pierre, your loadtest now runs to completion - thanks for reporting
>> >> this issue! I can see that the results for the parallel tests are a
>> >> little bit different than before, but I'm not sure how to read them so
>> >> I'll leave the interpretation of that to you :)
>> >>
>> >> Cheers,
>> >>
>> >> David
>> >>
>> >> On 14 May 2015 at 14:38, David Bosschaert <da...@gmail.com>
>> >> wrote:
>> >> > I think I know what this is. I had some additional changes exactly in
>> >> > this area that I simply forgot to apply this morning. I should have it
>> >> > fixed sometime today.
>> >> >
>> >> > Cheers,
>> >> >
>> >> > David
>> >> >
>> >> > On 14 May 2015 at 14:03, David Bosschaert <david.bosschaert@gmail.com
>> >
>> >> wrote:
>> >> >> Hi Pierre,
>> >> >>
>> >> >> I'll take a look today.
>> >> >>
>> >> >> Cheers,
>> >> >>
>> >> >> David
>> >> >>
>> >> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com>
>> wrote:
>> >> >>> I just committed the benchmark tool in
>> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you
>> >> can
>> >> >>> take a look.
>> >> >>>
>> >> >>> To run the scenario:
>> >> >>>
>> >> >>> - install jdk8:
>> >> >>>
>> >> >>> [nxuser@nx0012 pderop]$ java -version
>> >> >>> java version "1.8.0_40"
>> >> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>> >> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>> >> >>>
>> >> >>> - checkout the loadtest from
>> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>> >> >>>
>> >> >>> - go the the "loadtest" directory and start the test, just like
>> this:
>> >> >>>
>> >> >>> $ java -server -jar bin/felix.jar
>> >> >>> Welcome to Apache Felix Gogo
>> >> >>>
>> >> >>> g! Starting benchmarks (each tested bundle will add/remove 630
>> >> components
>> >> >>> during bundle activation).
>> >> >>>
>> >> >>>         [Starting benchmarks with no processing done in components
>> >> start
>> >> >>> methods]
>> >> >>>
>> >> >>> Benchmarking bundle:
>> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>> >> >>> ..................................................
>> >> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 |
>> >> 319,631,722
>> >> >>> | 919,838,078]
>> >> >>>
>> >> >>> Benchmarking bundle:
>> >> >>>
>> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
>> .
>> >> >>>
>> >> >>>
>> >> >>> Here, the first
>> >> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager"
>> test
>> >> >>> (single-threaded) passes OK. But the next one hangs
>> >> >>>
>> >>
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
>> >> >>> it uses a fork join pool with size=4.
>> >> >>>
>> >> >>> and when typing "log warn", we see:
>> >> >>>
>> >> >>> "log warn"
>> >> >>>
>> >> >>> 2015.05.14 13:56:10 ERROR - Bundle:
>> >> >>>
>> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
>> -
>> >> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >> >>> java.util.ConcurrentModificationException
>> >> >>>         at
>> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >> >>>         at
>> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >> >>>         at
>> >> >>>
>> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >> >>>         at
>> >> >>>
>> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >> >>>         at
>> >> >>>
>> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >> >>>         at
>> >> >>>
>> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >> >>>         at
>> >> >>>
>> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >> >>>         at
>> >> >>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >> >>>         at
>> >> >>>
>> >>
>> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >> >>>         at
>> >> >>>
>> >> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >> >>>         at
>> >> >>>
>> >>
>> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >> >>>         at
>> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >> >>>         at
>> >> >>>
>> >>
>> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >> >>>         at
>> >> >>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >> >>>         at
>> >> >>>
>> >>
>> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >> >>>
>> >> >>>
>> >> >>> (I will investigate also in my code to check if the problem does not
>> >> come
>> >> >>> from me ?)
>> >> >>>
>> >> >>> cheers;
>> >> >>> /Pierre
>> >> >>>
>> >> >>>
>> >> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <
>> pierre.derop@gmail.com
>> >> >
>> >> >>> wrote:
>> >> >>>
>> >> >>>> Hi David,
>> >> >>>>
>> >> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
>> >> >>>> regression somewhere in the framework, but my parallel test does not
>> >> >>>> pass anymore.
>> >> >>>>
>> >> >>>> The test first starts with a single-threaded scenario, which
>> passes OK
>> >> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager),
>> >> then when
>> >> >>>> the parallel test starts
>> >> >>>>
>> >>
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>> >> >>>> it suddenly hangs, and when I type "log warn" under the gogo
>> shell, I
>> >> see
>> >> >>>> the following exception:
>> >> >>>>
>> >> >>>> (I'm using java8):
>> >> >>>>
>> >> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>> >> >>>> ____________________________
>> >> >>>> Welcome to Apache Felix Gogo
>> >> >>>>
>> >> >>>> Benchmarking bundle:
>> >> >>>>
>> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
>> .
>> >> >>>>
>> >> >>>> (here, the dependencymanager.parallel test hangs and when I type
>> "log
>> >> >>>> warn", I see this:)
>> >> >>>>
>> >> >>>> g! log warn
>> >> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
>> >> >>>>
>> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
>> -
>> >> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >> >>>> java.util.ConcurrentModificationException
>> >> >>>>         at
>> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >> >>>>         at
>> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >> >>>>         at
>> >> >>>>
>> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >> >>>>         at
>> >> >>>>
>> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >> >>>>         at
>> >> >>>>
>> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >> >>>>         at
>> >> >>>>
>> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >> >>>>         at
>> >> >>>>
>> >> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >> >>>>         at
>> >> >>>>
>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >> >>>>         at
>> >> >>>>
>> >>
>> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >> >>>>         at
>> >> >>>>
>> >> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >> >>>>         at
>> >> >>>>
>> >>
>> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >> >>>>         at
>> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >> >>>>         at
>> >> >>>>
>> >>
>> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >> >>>>         at
>> >> >>>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >> >>>>         at
>> >> >>>>
>> >>
>> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >> >>>>
>> >> >>>> (If I configure my threadpool to 1, I have no problems, but with
>> >> >>>> threadpool=4, then I have the problem)
>> >> >>>>
>> >> >>>> I will investigate, but Ideally, may be it would be helpful if you
>> >> could
>> >> >>>> also run the test by yourself; so I will commit soon something to
>> >> reproduce
>> >> >>>> the problem in my sandbox.
>> >> >>>>
>> >> >>>> cheers;
>> >> >>>> /Pierre
>> >> >>>>
>> >> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>> >> >>>> david.bosschaert@gmail.com> wrote:
>> >> >>>>
>> >> >>>>> I've committed this now in
>> >> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>> >> >>>>>
>> >> >>>>> Curious to see what others are measuring. My tests were focused on
>> >> >>>>> multiple bundles/threads obtaining the same service, as that's where I
>> >> >>>>> saw a bit of contention.
>> >> >>>>>
>> >> >>>>> Cheers,
>> >> >>>>>
>> >> >>>>> David
>> >> >>>>>
>> >> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com>
>> >> wrote:
>> >> >>>>> > Hi David,
>> >> >>>>> >
>> >> >>>>> > I'm looking forward to test your improvements using the
>> >> >>>>> dependencymanager
>> >> >>>>> > benchmark tool ([1]).
>> >> >>>>> >
>> >> >>>>> >
>> >> >>>>> > [1]
>> >> >>>>> >
>> >> >>>>>
>> >>
>> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>> >> >>>>> >
>> >> >>>>> > /Pierre
>> >> >>>>> >
>> >> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>> >> >>>>> > david.bosschaert@gmail.com> wrote:
>> >> >>>>> >
>> >> >>>>> >> I have implemented the performance improvements that I was
>> >> thinking of
>> >> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
>> >> >>>>> >>
>> >> >>>>> >> I wrote a little performance test suite [2] that tests
>> >> multithreaded
>> >> >>>>> >> service registry performance (10 threads) from single /
>> multiple
>> >> >>>>> >> bundles with either singleton services and Prototype Service
>> >> Factory
>> >> >>>>> >> services and the results are quite impressive. I'm getting
>> >> performance
>> >> >>>>> >> improvements compared to the current trunk from 8 times better
>> >> than
>> >> >>>>> >> the original (800%) to more than 30 times better (3000%).
>> >> >>>>> >>
>> >> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>> >> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
>> >> >>>>> >>
>> >> >>>>> >> Cheers,
>> >> >>>>> >>
>> >> >>>>> >> David
>> >> >>>>> >>
>> >> >>>>> >> [1]
>> >> >>>>> >>
>> >> >>>>>
>> >>
>> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> >> >>>>> >> [2]
>> >> >>>>> >>
>> >> >>>>>
>> >>
>> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>> >> >>>>> >>
>> >> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <
>> heavy@ungoverned.org>
>> >> >>>>> wrote:
>> >> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >> >>>>> >> >>
>> >> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
>> >> heavy@ungoverned.org>
>> >> >>>>> >> wrote:
>> >> >>>>> >> >>>
>> >> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >> >>>>> >> >>>>
>> >> >>>>> >> >>>> There's a call to interrupt() in
>> Felix#acquireBundleLock(),
>> >> not
>> >> >>>>> sure
>> >> >>>>> >> if
>> >> >>>>> >> >>>> it
>> >> >>>>> >> >>>> can be the culprit though.
>> >> >>>>> >> >>>> Interrupts could also be caused by a bundle being shutdown
>> >> while
>> >> >>>>> one
>> >> >>>>> >> of
>> >> >>>>> >> >>>> its
>> >> >>>>> >> >>>> thread is waiting for a service, which should is a valid
>> use
>> >> case
>> >> >>>>> >> imho.
>> >> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being
>> interrupted
>> >> >>>>> would be
>> >> >>>>> >> >>>> good.
>> >> >>>>> >> >>>
>> >> >>>>> >> >>>
>> >> >>>>> >> >>> Yes, threads can be interrupted if they are holding a
>> bundle
>> >> lock
>> >> >>>>> and
>> >> >>>>> >> the
>> >> >>>>> >> >>> global lock holder needs the bundle lock.
>> >> >>>>> >> >>>
>> >> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
>> >> here, but
>> >> >>>>> >> didn't
>> >> >>>>> >> >>> we
>> >> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
>> >> necessary? I
>> >> >>>>> >> >>> thought
>> >> >>>>> >> >>> we just checked for the validity of the bundle context
>> before
>> >> >>>>> returning
>> >> >>>>> >> >>> or
>> >> >>>>> >> >>> something. Perhaps we felt there was no reason to be
>> >> interrupted in
>> >> >>>>> >> that
>> >> >>>>> >> >>> case. I really don't know.
>> >> >>>>> >> >>
>> >> >>>>> >> >> I think that the Service Registry could be rewritten to be
>> >> >>>>> completely
>> >> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
>> >> libraries,
>> >> >>>>> >> >
>> >> >>>>> >> >
>> >> >>>>> >> > Well, that just moves the sync blocks to the library, but
>> yeah
>> >> sure.
>> >> >>>>> >> >
>> >> >>>>> >> >> which I think would really be a better approach. There is
>> too
>> >> much
>> >> >>>>> >> >> locking going on in the current SR implementation IMHO.
>> >> >>>>> >> >
>> >> >>>>> >> >
>> >> >>>>> >> > I don't really think there is too much, but it is
>> complicated.
>> >> >>>>> >> > Unfortunately, it is complicated to make sure that locks
>> aren't
>> >> held
>> >> >>>>> >> while
>> >> >>>>> >> > do service lookups and this is complicated because you can
>> run
>> >> into
>> >> >>>>> >> cycles,
>> >> >>>>> >> > etc.
>> >> >>>>> >> >
>> >> >>>>> >> > But feel free to try to simplify it.
>> >> >>>>> >> >
>> >> >>>>> >> >>
>> >> >>>>> >> >> This brings the question: can we move to Java 5 (or Java 6)
>> >> for the
>> >> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
>> >> compatible
>> >> >>>>> but
>> >> >>>>> >> >> I would be surprised if there is anyone who still needs a
>> JDK
>> >> that
>> >> >>>>> >> >> went end-of-life 7 years ago.
>> >> >>>>> >> >
>> >> >>>>> >> >
>> >> >>>>> >> > At this point, it doesn't really matter to me.
>> >> >>>>> >> >
>> >> >>>>> >> > -> richard
>> >> >>>>> >> >
>> >> >>>>> >> >>
>> >> >>>>> >> >> Best regards,
>> >> >>>>> >> >>
>> >> >>>>> >> >> David
>> >> >>>>> >> >
>> >> >>>>> >> >
>> >> >>>>> >>
>> >> >>>>>
>> >> >>>>
>> >> >>>>
>> >>
>>
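
The ConcurrentModificationException in the trace quoted above is thrown by
HashMap's fail-fast key iterator while AbstractCollection.addAll copies the
keys. What follows is a minimal sketch of that failure mode only, not of the
Felix code: the class name CmeSketch, the key names and the single-threaded
mutation are assumptions made for illustration. Changing the map between two
iterator steps stands in, deterministically, for another thread modifying it
mid-iteration. The sketch contrasts java.util.HashMap with
java.util.concurrent.ConcurrentHashMap, whose iterators are weakly consistent.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class; it is not part of Felix or of the benchmark tool.
class CmeSketch {

    /**
     * Copies the keys of the given map while the map is structurally modified
     * halfway through the iteration. Returns true if the iterator reacted with
     * a ConcurrentModificationException, false if the copy completed.
     */
    static boolean copyKeysWhileMutating(Map<String, Integer> index) {
        for (int i = 0; i < 8; i++) {
            index.put("cap" + i, i);
        }
        List<String> copy = new ArrayList<>();
        try {
            Iterator<String> it = index.keySet().iterator();
            copy.add(it.next());
            // Structural change during iteration, as a concurrent writer would cause.
            index.put("registered-late", 99);
            while (it.hasNext()) {
                copy.add(it.next());
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails fast: "
                + copyKeysWhileMutating(new HashMap<>()));
        System.out.println("ConcurrentHashMap fails fast: "
                + copyKeysWhileMutating(new ConcurrentHashMap<>()));
    }
}
```

With a plain HashMap the copy is aborted by the exception; with a
ConcurrentHashMap it completes, although the copy may or may not contain the
key added during the iteration. With real threads the HashMap failure is not
guaranteed on every run, which would be consistent with the problem showing
up only when the thread pool is larger than 1.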

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
The thread dump did not help.
I will investigate (it may be a bug somewhere on my side; if that is the
case, I am sorry for all this noise).

I hope to let you know soon.

By the way, do you know how to run the SCR integration tests with the
framework from trunk? I know that some SCR integration tests
do load testing, and I would be interested to know whether they
also pass with the framework from trunk.

cheers;
/Pierre
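
For anyone who wants to repeat the thread dump analysis suggested in the
quoted message below: a dump can be taken with "jstack <pid>", or from inside
the JVM. The following is a small sketch using the standard
java.lang.management API. The class name ThreadDumpSketch and the output
layout are made up for illustration and are unrelated to the benchmark tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; not part of Felix or of the benchmark tool.
class ThreadDumpSketch {

    /** Returns the name, state and stack of every live thread, plus a deadlock count. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: do not collect locked monitors or ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        // Deadlock detection over monitors and java.util.concurrent locks,
        // only where the JVM supports it.
        if (mx.isSynchronizerUsageSupported()) {
            long[] deadlocked = mx.findDeadlockedThreads();
            sb.append("deadlocked threads: ")
              .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Note that a hang caused by a missed event, rather than by a deadlock, usually
shows only idle or waiting threads in such a dump, so an inconclusive dump
does not rule out a problem.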


On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> Hi Pierre,
>
> It would indeed be useful to find out more about why your test is
> hanging. Maybe analysing a threaddump might give some more
> information?
>
> Cheers,
>
> David
>
> On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com> wrote:
> > Thanks David; I just gave it a try, and indeed the parallel test passed. I
> > observed a gain of around 7/10%. The tool is described in [1].
> >
> > But I only have 4 cores on my laptop, so I will run more tests in my lab
> > at work (next week), where we have some servers with 32 or even 128
> > processors. This will give a better idea of the gain: the more
> > processors you have, the more costly synchronization is, so I could
> > possibly observe a better performance gain.
> >
> > Now, I'm sorry, but I think that there is still a problem (I don't know
> > where): when using more threads, the parallel test does not complete and
> > stops with a timeout message, indicating that the expected number of
> > components was not created within the timeout delay of 1 minute.
> >
> > So, I just committed a modified version of the tool in the sandbox, which
> > can now take a -Dthreads option to configure the number of
> > threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
> > not complete and ends with a timeout:
> >
> > $ java -Dthreads=10 -server -jar bin/felix.jar
> >
> > g! Starting benchmarks (each tested bundle will add/remove 630 components
> > during bundle activation).
> >
> >         [Starting benchmarks with no processing done in components start
> > methods]
> >
> > Benchmarking bundle:
> > org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> > .................................................Could not start components
> > timely: current start latch=2, stop latch=630
> >
> > My current understanding is that some components are still waiting
> > for unsatisfied service dependencies, as if a service tracker
> > had missed a service registration.
> >
> > I ran the same test during two hours with the previous framework version,
> > and did not observe any problems.
> >
> > I wonder if someone else has another tool to perform another
> > kind of load test, just to see if problems are also observed.
> >
> > -> On my side, I will do the following: in the past, the benchmark tool
> > supported not only dependencymanager, but also Felix SCR and iPojo. So, I
> > will reintroduce Felix SCR in the benchmark and check whether I also
> > observe the problem (with -Dthreads=10).
> >
> > I will let you know.
> >
> > cheers;
> > /Pierre
> >
> > [1] http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
> >
> > On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> > david.bosschaert@gmail.com> wrote:
> >
> >> I've fixed this now in
> >> svn.apache.org/viewvc?view=revision&revision=1679367
> >>
> >> Pierre, your loadtest now runs to completion - thanks for reporting
> >> this issue! I can see that the results for the parallel tests are a
> >> little bit different than before, but I'm not sure how to read them so
> >> I'll leave the interpretation of that to you :)
> >>
> >> Cheers,
> >>
> >> David
> >>
> >> On 14 May 2015 at 14:38, David Bosschaert <da...@gmail.com>
> >> wrote:
> >> > I think I know what this is. I had some additional changes exactly in
> >> > this area that I simply forgot to apply this morning. I should have it
> >> > fixed sometime today.
> >> >
> >> > Cheers,
> >> >
> >> > David
> >> >
> >> > On 14 May 2015 at 14:03, David Bosschaert <david.bosschaert@gmail.com
> >
> >> wrote:
> >> >> Hi Pierre,
> >> >>
> >> >> I'll take a look today.
> >> >>
> >> >> Cheers,
> >> >>
> >> >> David
> >> >>
> >> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com>
> wrote:
> >> >>> I just committed the benchmark tool in
> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you
> >> can
> >> >>> take a look.
> >> >>>
> >> >>> To run the scenario:
> >> >>>
> >> >>> - install jdk8:
> >> >>>
> >> >>> [nxuser@nx0012 pderop]$ java -version
> >> >>> java version "1.8.0_40"
> >> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> >> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
> >> >>>
> >> >>> - checkout the loadtest from
> >> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
> >> >>>
> >> >>> - go to the "loadtest" directory and start the test, like this:
> >> >>>
> >> >>> $ java -server -jar bin/felix.jar
> >> >>> Welcome to Apache Felix Gogo
> >> >>>
> >> >>> g! Starting benchmarks (each tested bundle will add/remove 630
> >> components
> >> >>> during bundle activation).
> >> >>>
> >> >>>         [Starting benchmarks with no processing done in components
> >> start
> >> >>> methods]
> >> >>>
> >> >>> Benchmarking bundle:
> >> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
> >> >>> ..................................................
> >> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 |
> >> 319,631,722
> >> >>> | 919,838,078]
> >> >>>
> >> >>> Benchmarking bundle:
> >> >>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .
> >> >>>
> >> >>>
> >> >>> Here, the first
> >> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager"
> test
> >> >>> (single-threaded) passes OK. But the next one hangs
> >> >>>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
> >> >>> it uses a fork join pool with size=4.
> >> >>>
> >> >>> and when typing "log warn", we see:
> >> >>>
> >> >>> "log warn"
> >> >>>
> >> >>> 2015.05.14 13:56:10 ERROR - Bundle:
> >> >>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> -
> >> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >>> java.util.ConcurrentModificationException
> >> >>>         at
> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >>>         at
> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >>>         at
> >> >>>
> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >>>         at
> >> >>>
> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >>>         at
> >> >>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >>>         at
> >> >>>
> >>
> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >>>         at
> >> >>>
> >> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >>>         at
> >> >>>
> >>
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >>>         at
> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >>>         at
> >> >>>
> >>
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >>>         at
> >> >>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >>>         at
> >> >>>
> >>
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >>>
> >> >>>
> >> >>> (I will investigate also in my code to check if the problem does not
> >> come
> >> >>> from me ?)
> >> >>>
> >> >>> cheers;
> >> >>> /Pierre
> >> >>>
> >> >>>
> >> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <
> pierre.derop@gmail.com
> >> >
> >> >>> wrote:
> >> >>>
> >> >>>> Hi David,
> >> >>>>
> >> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
> >> >>>> regression somewhere in the framework, but my parallel test does not
> >> >>>> pass anymore.
> >> >>>>
> >> >>>> The test first starts with a single-threaded scenario, which
> passes OK
> >> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager),
> >> then when
> >> >>>> the parallel test starts
> >> >>>>
> >>
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
> >> >>>> it suddenly hangs, and when I type "log warn" under the gogo
> shell, I
> >> see
> >> >>>> the following exception:
> >> >>>>
> >> >>>> (I'm using java8):
> >> >>>>
> >> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
> >> >>>> ____________________________
> >> >>>> Welcome to Apache Felix Gogo
> >> >>>>
> >> >>>> Benchmarking bundle:
> >> >>>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .
> >> >>>>
> >> >>>> (here, the dependencymanager.parallel test hangs and when I type
> "log
> >> >>>> warn", I see this:)
> >> >>>>
> >> >>>> g! log warn
> >> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
> >> >>>>
> >> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> -
> >> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
> >> >>>> java.util.ConcurrentModificationException
> >> >>>>         at
> java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> >> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
> >> >>>>         at
> >> java.util.AbstractCollection.addAll(AbstractCollection.java:343)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
> >> >>>>         at
> >> >>>>
> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
> >> >>>>         at
> >> >>>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
> >> >>>>         at
> >> >>>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
> >> >>>>         at
> >> >>>>
> >> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
> >> >>>>         at
> >> >>>>
> >> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
> >> >>>>         at
> >> >>>>
> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
> >> >>>>         at
> >> >>>>
> >>
> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
> >> >>>>         at
> >> >>>>
> >> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
> >> >>>>         at
> >> >>>>
> >>
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
> >> >>>>         at
> >> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >> >>>>         at
> >> >>>>
> >>
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> >> >>>>         at
> >> >>>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
> >> >>>>         at
> >> >>>>
> >>
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
> >> >>>>
> >> >>>> (If I configure my threadpool to 1, I have no problems, but with
> >> >>>> threadpool=4, then I have the problem)
> >> >>>>
> >> >>>> I will investigate, but Ideally, may be it would be helpful if you
> >> could
> >> >>>> also run the test by yourself; so I will commit soon something to
> >> reproduce
> >> >>>> the problem in my sandbox.
> >> >>>>
> >> >>>> cheers;
> >> >>>> /Pierre
> >> >>>>
> >> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
> >> >>>> david.bosschaert@gmail.com> wrote:
> >> >>>>
> >> >>>>> I've committed this now in
> >> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
> >> >>>>>
> >> >>>>> Curious to see what others are measuring. My tests were focused on
> >> >>>>> multiple bundles/threads obtaining the same service, as that's where I
> >> >>>>> saw a bit of contention.
> >> >>>>>
> >> >>>>> Cheers,
> >> >>>>>
> >> >>>>> David
> >> >>>>>
> >> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com>
> >> wrote:
> >> >>>>> > Hi David,
> >> >>>>> >
> >> >>>>> > I'm looking forward to test your improvements using the
> >> >>>>> dependencymanager
> >> >>>>> > benchmark tool ([1]).
> >> >>>>> >
> >> >>>>> >
> >> >>>>> > [1]
> >> >>>>> >
> >> >>>>>
> >>
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
> >> >>>>> >
> >> >>>>> > /Pierre
> >> >>>>> >
> >> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> >> >>>>> > david.bosschaert@gmail.com> wrote:
> >> >>>>> >
> >> >>>>> >> I have implemented the performance improvements that I was
> >> thinking of
> >> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
> >> >>>>> >>
> >> >>>>> >> I wrote a little performance test suite [2] that tests
> >> multithreaded
> >> >>>>> >> service registry performance (10 threads) from single /
> multiple
> >> >>>>> >> bundles with either singleton services and Prototype Service
> >> Factory
> >> >>>>> >> services and the results are quite impressive. I'm getting
> >> performance
> >> >>>>> >> improvements compared to the current trunk from 8 times better
> >> than
> >> >>>>> >> the original (800%) to more than 30 times better (3000%).
> >> >>>>> >>
> >> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
> >> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
> >> >>>>> >>
> >> >>>>> >> Cheers,
> >> >>>>> >>
> >> >>>>> >> David
> >> >>>>> >>
> >> >>>>> >> [1]
> >> >>>>> >>
> >> >>>>>
> >>
> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> >> >>>>> >> [2]
> >> >>>>> >>
> >> >>>>>
> >>
> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
> >> >>>>> >>
> >> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <
> heavy@ungoverned.org>
> >> >>>>> wrote:
> >> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
> >> >>>>> >> >>
> >> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
> >> heavy@ungoverned.org>
> >> >>>>> >> wrote:
> >> >>>>> >> >>>
> >> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >> >>>>> >> >>>>
> >> >>>>> >> >>>> There's a call to interrupt() in
> Felix#acquireBundleLock(),
> >> not
> >> >>>>> sure
> >> >>>>> >> if
> >> >>>>> >> >>>> it
> >> >>>>> >> >>>> can be the culprit though.
> >> >>>>> >> >>>> Interrupts could also be caused by a bundle being shutdown
> >> while
> >> >>>>> one
> >> >>>>> >> of
> >> >>>>> >> >>>> its
> >> >>>>> >> >>>> thread is waiting for a service, which should is a valid
> use
> >> case
> >> >>>>> >> imho.
> >> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being
> interrupted
> >> >>>>> would be
> >> >>>>> >> >>>> good.
> >> >>>>> >> >>>
> >> >>>>> >> >>>
> >> >>>>> >> >>> Yes, threads can be interrupted if they are holding a
> bundle
> >> lock
> >> >>>>> and
> >> >>>>> >> the
> >> >>>>> >> >>> global lock holder needs the bundle lock.
> >> >>>>> >> >>>
> >> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
> >> here, but
> >> >>>>> >> didn't
> >> >>>>> >> >>> we
> >> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
> >> necessary? I
> >> >>>>> >> >>> thought
> >> >>>>> >> >>> we just checked for the validity of the bundle context
> before
> >> >>>>> returning
> >> >>>>> >> >>> or
> >> >>>>> >> >>> something. Perhaps we felt there was no reason to be
> >> interrupted in
> >> >>>>> >> that
> >> >>>>> >> >>> case. I really don't know.
> >> >>>>> >> >>
> >> >>>>> >> >> I think that the Service Registry could be rewritten to be
> >> >>>>> completely
> >> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
> >> libraries,
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >> > Well, that just moves the sync blocks to the library, but
> yeah
> >> sure.
> >> >>>>> >> >
> >> >>>>> >> >> which I think would really be a better approach. There is
> too
> >> much
> >> >>>>> >> >> locking going on in the current SR implementation IMHO.
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >> > I don't really think there is too much, but it is
> complicated.
> >> >>>>> >> > Unfortunately, it is complicated to make sure that locks
> aren't
> >> held
> >> >>>>> >> while
> >> >>>>> >> > do service lookups and this is complicated because you can
> run
> >> into
> >> >>>>> >> cycles,
> >> >>>>> >> > etc.
> >> >>>>> >> >
> >> >>>>> >> > But feel free to try to simplify it.
> >> >>>>> >> >
> >> >>>>> >> >>
> >> >>>>> >> >> This brings the question: can we move to Java 5 (or Java 6)
> >> for the
> >> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
> >> compatible
> >> >>>>> but
> >> >>>>> >> >> I would be surprised if there is anyone who still needs a
> JDK
> >> that
> >> >>>>> >> >> went end-of-life 7 years ago.
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >> > At this point, it doesn't really matter to me.
> >> >>>>> >> >
> >> >>>>> >> > -> richard
> >> >>>>> >> >
> >> >>>>> >> >>
> >> >>>>> >> >> Best regards,
> >> >>>>> >> >>
> >> >>>>> >> >> David
> >> >>>>> >> >
> >> >>>>> >> >
> >> >>>>> >>
> >> >>>>>
> >> >>>>
> >> >>>>
> >>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
The thread dump did not help.
I will investigate (there may be a bug somewhere on my side; if that is the
case, I am sorry for making all this noise).

I hope to let you know soon.

By the way, do you know how to run the SCR integration tests with the
framework from trunk? I know that some of the SCR integration tests
perform load tests, and I would be interested to know whether they
also pass with the framework from trunk.

cheers;
/Pierre


On Thu, May 14, 2015 at 10:06 PM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> Hi Pierre,
>
> It would indeed be useful to find out more about why your test is
> hanging. Maybe analysing a threaddump might give some more
> information?
>
> Cheers,
>
> David

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
Hi Pierre,

It would indeed be useful to find out more about why your test is
hanging. Maybe analysing a thread dump would give some more
information?
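
For reference, a thread dump can also be taken from inside the JVM rather
than with an external tool. The following is only a minimal sketch using the
standard java.lang.management API; the class name ThreadDumpSketch and the
output layout are made up for illustration and are not part of Felix or of
the benchmark tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Felix: prints what every live thread is doing.
final class ThreadDumpSketch {

    /** Builds a textual dump of all live threads, similar in spirit to jstack output. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // Only request lock details when this JVM can provide them.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(), mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip threads that ended while dumping
            }
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            if (info.getLockName() != null) {
                sb.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                sb.append(" owned by ").append(info.getLockOwnerName());
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("        at ").append(frame).append('\n');
            }
        }
        // Null means the JVM found no cycle of threads blocked on monitors.
        long[] deadlocked = mx.findMonitorDeadlockedThreads();
        sb.append("deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

A dump like this mainly shows blocked threads and lock cycles. If the hang
comes from an event that was never delivered, the threads may simply look
idle, so an empty result here does not rule out a framework problem.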

Cheers,

David

On 14 May 2015 at 19:54, Pierre De Rop <pi...@gmail.com> wrote:
> Thanks David; I just gave it a try, and indeed the parallel test passed. I
> observed a gain of around 7 to 10%. The tool is described in [1].
>
> But I only have 4 cores on my laptop, and I will run more tests in my lab
> at work (next week), where we have some servers with 32 or even 128
> processors. This will give a better idea of the gain, because the more
> processors you have, the more costly synchronization is, so I could possibly
> observe a better performance gain.
>
> Now, I'm sorry, but I think that there is still a problem (I don't know
> where): when using more threads, the parallel test does not complete and
> stops with a timeout message, indicating that the expected number of
> components was not created within the timeout delay of 1 minute.
>
> So, I just committed a modified version of the tool to the sandbox, which
> can now take a -Dthreads option to configure the number of
> threads. With -Dthreads=4, it's OK. But with -Dthreads=10, the test does
> not complete and ends with a timeout:
>
> $ java -Dthreads=10 -server -jar bin/felix.jar
>
> g! Starting benchmarks (each tested bundle will add/remove 630 components
> during bundle activation).
>
>         [Starting benchmarks with no processing done in components start
> methods]
>
> Benchmarking bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
> .................................................Could not start components
> timely: current start latch=2, stop latch=630
>
> My current understanding of this is that some components are still waiting
> for unsatisfied service dependencies, as if a service tracker had
> missed a service registration.
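
The symptom described above (a start latch that never reaches zero before
the timeout) can be sketched roughly as follows. This is only an
illustration under stated assumptions: LatchTimeoutSketch, its run() method
and the "missed" parameter are hypothetical names, and the real benchmark
tool may be structured quite differently. The "threads" system property
mirrors the -Dthreads option mentioned above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

// Hypothetical harness, not the actual benchmark code.
final class LatchTimeoutSketch {

    /**
     * Starts `components` tasks on a pool of `threads` workers and waits up to
     * timeoutMillis for all of them to report in. The first `missed` tasks never
     * count down, which simulates components whose dependency event was lost.
     * Returns the remaining latch count (0 means every component started).
     */
    static long run(int threads, int components, int missed, long timeoutMillis) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        CountDownLatch startLatch = new CountDownLatch(components);
        try {
            for (int i = 0; i < components; i++) {
                final boolean lost = i < missed;
                pool.execute(() -> {
                    if (!lost) {
                        startLatch.countDown();
                    }
                });
            }
            // Returns early once the count hits zero, otherwise after the timeout.
            startLatch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        } finally {
            pool.shutdownNow();
        }
        return startLatch.getCount();
    }

    public static void main(String[] args) {
        int threads = Integer.getInteger("threads", 4); // e.g. java -Dthreads=10 ...
        long remaining = run(threads, 630, 2, 1000);
        System.out.println(remaining == 0
                ? "all components started"
                : "Could not start components timely: current start latch=" + remaining);
    }
}
```

With two simulated lost events the call only returns after the full timeout
and reports a remaining count of 2, which is the same shape as the message
in the log above. It says nothing about where the events were actually lost.
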
>
> I ran the same test for two hours with the previous framework version
> and did not observe any problems.
>
> I wonder whether someone else has another tool to perform a different
> kind of load test, just to see whether problems are observed there too.
>
> -> On my side, I will do the following: in the past, the benchmark tool
> supported not only dependencymanager, but also Felix SCR and iPojo. So I
> will reintroduce Felix SCR in the benchmark and check whether I also
> observe the problem (with -Dthreads=10).
>
> I will let you know.
>
> cheers;
> /Pierre
>
> [1]
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README
>
> On Thu, May 14, 2015 at 3:41 PM, David Bosschaert <
> david.bosschaert@gmail.com> wrote:
>
>> I've fixed this now in
>> svn.apache.org/viewvc?view=revision&revision=1679367
>>
>> Pierre, your loadtest now runs to completion - thanks for reporting
>> this issue! I can see that the results for the parallel tests are a
>> little bit different than before, but I'm not sure how to read them so
>> I'll leave the interpretation of that to you :)
>>
>> Cheers,
>>
>> David
>>
>> On 14 May 2015 at 14:38, David Bosschaert <da...@gmail.com>
>> wrote:
>> > I think I know what this is. I had some additional changes exactly in
>> > this area that I simply forgot to apply this morning. I should have it
>> > fixed sometime today.
>> >
>> > Cheers,
>> >
>> > David
>> >
>> > On 14 May 2015 at 14:03, David Bosschaert <da...@gmail.com> wrote:
>> >> Hi Pierre,
>> >>
>> >> I'll take a look today.
>> >>
>> >> Cheers,
>> >>
>> >> David
>> >>
>> >> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com> wrote:
>> >>> I just committed the benchmark tool in
>> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
>> >>> take a look.
>> >>>
>> >>> To run the scenario:
>> >>>
>> >>> - install jdk8:
>> >>>
>> >>> [nxuser@nx0012 pderop]$ java -version
>> >>> java version "1.8.0_40"
>> >>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>> >>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>> >>>
>> >>> - check out the loadtest from
>> >>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>> >>>
>> >>> - go to the "loadtest" directory and start the test, like this:
>> >>>
>> >>> $ java -server -jar bin/felix.jar
>> >>> Welcome to Apache Felix Gogo
>> >>>
>> >>> g! Starting benchmarks (each tested bundle will add/remove 630 components
>> >>> during bundle activation).
>> >>>
>> >>>         [Starting benchmarks with no processing done in components start
>> >>> methods]
>> >>>
>> >>> Benchmarking bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>> >>> ..................................................
>> >>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
>> >>> | 919,838,078]
>> >>>
>> >>> Benchmarking bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >>>
>> >>>
>> >>> Here, the first
>> >>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
>> >>> (single-threaded) passes OK. But the next one
>> >>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>> >>> hangs. It uses a fork/join pool with size=4.
>> >>>
>> >>> When typing "log warn", we see:
>> >>>
>> >>> "log warn"
>> >>>
>> >>> 2015.05.14 13:56:10 ERROR - Bundle:
>> >>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >>> java.util.ConcurrentModificationException
>> >>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >>>
>> >>>
>> >>> (I will also investigate my own code to check whether the problem
>> >>> comes from me.)
>> >>>
>> >>> cheers;
>> >>> /Pierre
>> >>>
>> >>>
>> >>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pierre.derop@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Hi David,
>> >>>>
>> >>>> I don't know if it's me (a bug in my benchmark tool) or if there is a
>> >>>> regression somewhere in the framework, but my parallel test does not pass
>> >>>> anymore.
>> >>>>
>> >>>> The test first starts with a single-threaded scenario, which passes OK
>> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager). Then, when
>> >>>> the parallel test starts
>> >>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel),
>> >>>> it suddenly hangs, and when I type "log warn" in the Gogo shell, I see
>> >>>> the following exception:
>> >>>>
>> >>>> (I'm using java8):
>> >>>>
>> >>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>> >>>> ____________________________
>> >>>> Welcome to Apache Felix Gogo
>> >>>>
>> >>>> Benchmarking bundle:
>> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>> >>>>
>> >>>> (here, the dependencymanager.parallel test hangs and when I type "log
>> >>>> warn", I see this:)
>> >>>>
>> >>>> g! log warn
>> >>>> 2015.05.14 13:31:03 ERROR - Bundle:
>> >>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> >>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> >>>> java.util.ConcurrentModificationException
>> >>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>> >>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>> >>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>> >>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>> >>>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>> >>>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>> >>>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>> >>>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>> >>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>> >>>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>> >>>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>> >>>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>> >>>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>> >>>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>> >>>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>> >>>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>> >>>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>> >>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>> >>>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>> >>>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>> >>>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>> >>>>
>> >>>> (If I configure my threadpool to 1, I have no problems, but with
>> >>>> threadpool=4, then I have the problem)
>> >>>>
>> >>>> I will investigate, but Ideally, may be it would be helpful if you
>> could
>> >>>> also run the test by yourself; so I will commit soon something to
>> reproduce
>> >>>> the problem in my sandbox.
>> >>>>
>> >>>> cheers;
>> >>>> /Pierre
>> >>>>
>> >>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>> >>>> david.bosschaert@gmail.com> wrote:
>> >>>>
>> >>>>> I've committed this now in
>> >>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>> >>>>>
>> >>>>> Curious to see what others are measuring. My tests were focused on
>> >>>>> multiple bundles/threads obtaining the same service, as that's were I
>> >>>>> saw a bit of contention.
>> >>>>>
>> >>>>> Cheers,
>> >>>>>
>> >>>>> David
>> >>>>>
>> >>>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com>
>> wrote:
>> >>>>> > Hi David,
>> >>>>> >
>> >>>>> > I'm looking forward to test your improvements using the
>> >>>>> dependencymanager
>> >>>>> > benchmark tool ([1]).
>> >>>>> >
>> >>>>> >
>> >>>>> > [1]
>> >>>>> >
>> >>>>>
>> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>> >>>>> >
>> >>>>> > /Pierre
>> >>>>> >
>> >>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>> >>>>> > david.bosschaert@gmail.com> wrote:
>> >>>>> >
>> >>>>> >> I have implemented the performance improvements that I was
>> thinking of
>> >>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
>> >>>>> >>
>> >>>>> >> I wrote a little performance test suite [2] that tests
>> multithreaded
>> >>>>> >> service registry performance (10 threads) from single / multiple
>> >>>>> >> bundles with either singleton services and Prototype Service
>> Factory
>> >>>>> >> services and the results are quite impressive. I'm getting
>> performance
>> >>>>> >> improvements compared to the current trunk from 8 times better
>> than
>> >>>>> >> the original (800%) to more than 30 times better (3000%).
>> >>>>> >>
>> >>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>> >>>>> >> planning to commit it to Felix tomorrow if nobody objects.
>> >>>>> >>
>> >>>>> >> Cheers,
>> >>>>> >>
>> >>>>> >> David
>> >>>>> >>
>> >>>>> >> [1]
>> >>>>> >>
>> >>>>>
>> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> >>>>> >> [2]
>> >>>>> >>
>> >>>>>
>> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>> >>>>> >>
>> >>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org>
>> >>>>> wrote:
>> >>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >>>>> >> >>
>> >>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <
>> heavy@ungoverned.org>
>> >>>>> >> wrote:
>> >>>>> >> >>>
>> >>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >>>>> >> >>>>
>> >>>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(),
>> not
>> >>>>> sure
>> >>>>> >> if
>> >>>>> >> >>>> it
>> >>>>> >> >>>> can be the culprit though.
>> >>>>> >> >>>> Interrupts could also be caused by a bundle being shutdown
>> while
>> >>>>> one
>> >>>>> >> of
>> >>>>> >> >>>> its
>> >>>>> >> >>>> thread is waiting for a service, which should is a valid use
>> case
>> >>>>> >> imho.
>> >>>>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted
>> >>>>> would be
>> >>>>> >> >>>> good.
>> >>>>> >> >>>
>> >>>>> >> >>>
>> >>>>> >> >>> Yes, threads can be interrupted if they are holding a bundle
>> lock
>> >>>>> and
>> >>>>> >> the
>> >>>>> >> >>> global lock holder needs the bundle lock.
>> >>>>> >> >>>
>> >>>>> >> >>> I admit that I do not recall why we ignore the interrupt
>> here, but
>> >>>>> >> didn't
>> >>>>> >> >>> we
>> >>>>> >> >>> implement service lookup so that a bundle lock wasn't
>> necessary? I
>> >>>>> >> >>> thought
>> >>>>> >> >>> we just checked for the validity of the bundle context before
>> >>>>> returning
>> >>>>> >> >>> or
>> >>>>> >> >>> something. Perhaps we felt there was no reason to be
>> interrupted in
>> >>>>> >> that
>> >>>>> >> >>> case. I really don't know.
>> >>>>> >> >>
>> >>>>> >> >> I think that the Service Registry could be rewritten to be
>> >>>>> completely
>> >>>>> >> >> free of synchronized blocks using the Java 5 concurrency
>> libraries,
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > Well, that just moves the sync blocks to the library, but yeah
>> sure.
>> >>>>> >> >
>> >>>>> >> >> which I think would really be a better approach. There is too
>> much
>> >>>>> >> >> locking going on in the current SR implementation IMHO.
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > I don't really think there is too much, but it is complicated.
>> >>>>> >> > Unfortunately, it is complicated to make sure that locks aren't
>> held
>> >>>>> >> while
>> >>>>> >> > do service lookups and this is complicated because you can run
>> into
>> >>>>> >> cycles,
>> >>>>> >> > etc.
>> >>>>> >> >
>> >>>>> >> > But feel free to try to simplify it.
>> >>>>> >> >
>> >>>>> >> >>
>> >>>>> >> >> This brings the question: can we move to Java 5 (or Java 6)
>> for the
>> >>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4
>> compatible
>> >>>>> but
>> >>>>> >> >> I would be surprised if there is anyone who still needs a JDK
>> that
>> >>>>> >> >> went end-of-life 7 years ago.
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >> > At this point, it doesn't really matter to me.
>> >>>>> >> >
>> >>>>> >> > -> richard
>> >>>>> >> >
>> >>>>> >> >>
>> >>>>> >> >> Best regards,
>> >>>>> >> >>
>> >>>>> >> >> David
>> >>>>> >> >
>> >>>>> >> >
>> >>>>> >>
>> >>>>>
>> >>>>
>> >>>>
>>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
Thanks David; I just gave it a try, and indeed the parallel test passed. I
observed a gain of around 7 to 10%. The tool is described in [1].

But I only have 4 cores on my laptop, so I will run more tests in my lab
at work (next week), where we have some servers with 32 or even 128
processors. This will give a better idea of the gain: the more processors
you have, the more costly synchronization becomes, so I may observe a
larger performance gain there.

Now, I'm sorry, but I think that there is still a problem (I don't know
where): when using more threads, the parallel test does not complete and
stops with a timeout message, indicating that the expected number of
components was not created within the timeout of 1 minute.

So, I just committed a modified version of the tool in the sandbox, which
can now take a -Dthreads option to configure the number of threads. With
-Dthreads=4, it's OK. But with -Dthreads=10, the test does not complete
and ends with a timeout:

$ java -Dthreads=10 -server -jar bin/felix.jar

g! Starting benchmarks (each tested bundle will add/remove 630 components
during bundle activation).

        [Starting benchmarks with no processing done in components start
methods]

Benchmarking bundle:
org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel
.................................................Could not start components
timely: current start latch=2, stop latch=630
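
As a rough illustration of the -Dthreads option (the benchmark's real
option handling is not shown in this thread, so the class and method names
below are hypothetical), a system property can be turned into a pool size
along these lines:

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical sketch, not the benchmark's actual code.
final class ThreadsOption {
    // Reads -Dthreads=N; falls back to defaultThreads when the property
    // is missing or is not a valid integer, and never returns less than 1.
    static int configuredThreads(int defaultThreads) {
        return Math.max(1, Integer.getInteger("threads", defaultThreads));
    }

    public static void main(String[] args) {
        int threads = configuredThreads(4);
        ForkJoinPool pool = new ForkJoinPool(threads);
        System.out.println("parallelism=" + pool.getParallelism());
        pool.shutdown();
    }
}
```

Integer.getInteger(name, default) looks the value up in the JVM system
properties, so "java -Dthreads=10 ..." is enough to change the pool size
without touching the code.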

My current understanding of this is that some components are still waiting
for unsatisfied service dependencies, as if a service tracker had missed a
service registration.
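
To make the "start latch=2" message concrete, here is a minimal sketch of
a latch-with-timeout check. It assumes the benchmark counts component
starts with something like a CountDownLatch; that is only a guess from the
message text, and all names below are hypothetical:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the benchmark's actual code.
final class StartLatchSketch {
    // Expects `expected` components but only `started` of them ever start.
    // Returns the latch count observed once the timed wait is over
    // (0 means every expected component started in time).
    static long awaitStart(int expected, int started, long timeoutMillis)
            throws InterruptedException {
        CountDownLatch startLatch = new CountDownLatch(expected);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < started; i++) {
                pool.execute(startLatch::countDown); // one "component start"
            }
            // await() returns false if the count is still above zero
            // when the timeout expires.
            if (!startLatch.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                System.out.println("Could not start components in time: "
                        + "start latch=" + startLatch.getCount());
            }
            return startLatch.getCount();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        awaitStart(630, 630, 5000); // all components start
        awaitStart(630, 628, 200);  // two starts never happen
    }
}
```

If two components never receive the service they depend on, nothing counts
the latch down for them, so the wait can only end through the timeout.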

I ran the same test for two hours with the previous framework version,
and did not observe any problems.

I wonder if someone else has another tool for running a different kind of
load test, just to see whether similar problems show up there.

-> On my side, I will do the following: in the past, the benchmark tool
supported not only dependencymanager, but also Felix SCR and iPojo. So I
will reintroduce Felix SCR in the benchmark and check whether I also
observe the problem there (with -Dthreads=10).

I will let you know.

cheers;
/Pierre

[1]
http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/README


Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
I've fixed this now in svn.apache.org/viewvc?view=revision&revision=1679367

Pierre, your loadtest now runs to completion - thanks for reporting
this issue! I can see that the results for the parallel tests are a
little bit different than before, but I'm not sure how to read them so
I'll leave the interpretation of that to you :)

Cheers,

David
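
For readers following the trace quoted below: a
ConcurrentModificationException thrown from HashMap$KeyIterator.next is
the generic fail-fast reaction of a HashMap-backed collection that is
modified while it is being iterated. The sketch below reproduces that
behaviour deterministically in a single thread. It is not the Felix code,
the names are made up, and ConcurrentHashMap is shown only as one common
alternative; this thread does not say what revision 1679367 changed.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the Felix framework's code.
final class FailFastSketch {
    // Copies the keys of `index` while adding an entry mid-iteration.
    // Returns true if the iterator noticed the change and threw.
    static boolean copyKeysWhileModifying(Map<String, Integer> index) {
        List<String> copy = new ArrayList<>();
        try {
            for (String key : index.keySet()) {
                copy.add(key);
                // Structural change during iteration, standing in for a
                // second thread registering a service.
                index.put("added-during-iteration", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Fills a map with three entries so the iteration has work to do.
    static Map<String, Integer> seed(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    public static void main(String[] args) {
        System.out.println("HashMap threw: "
                + copyKeysWhileModifying(seed(new HashMap<>())));
        System.out.println("ConcurrentHashMap threw: "
                + copyKeysWhileModifying(seed(new ConcurrentHashMap<>())));
    }
}
```

HashMap iterators are fail-fast, while ConcurrentHashMap iterators are
weakly consistent and never throw this exception, which is why the problem
only appears once several threads touch the same unsynchronized map.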

On 14 May 2015 at 14:38, David Bosschaert <da...@gmail.com> wrote:
> I think I know what this is. I had some additional changes exactly in
> this area that I simply forgot to apply this morning. I should have it
> fixed sometime today.
>
> Cheers,
>
> David
>
> On 14 May 2015 at 14:03, David Bosschaert <da...@gmail.com> wrote:
>> Hi Pierre,
>>
>> I'll take a look today.
>>
>> Cheers,
>>
>> David
>>
>> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com> wrote:
>>> I just committed the benchmark tool in
>>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
>>> take a look.
>>>
>>> To run the scenario:
>>>
>>> - install jdk8:
>>>
>>> [nxuser@nx0012 pderop]$ java -version
>>> java version "1.8.0_40"
>>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>>>
>>> - checkout the loadtest from
>>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>>>
>>> - go the the "loadtest" directory and start the test, just like this:
>>>
>>> $ java -server -jar bin/felix.jar
>>> Welcome to Apache Felix Gogo
>>>
>>> g! Starting benchmarks (each tested bundle will add/remove 630 components
>>> during bundle activation).
>>>
>>>         [Starting benchmarks with no processing done in components start
>>> methods]
>>>
>>> Benchmarking bundle:
>>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>>> ..................................................
>>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
>>> | 919,838,078]
>>>
>>> Benchmarking bundle:
>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>>>
>>>
>>> Here, the first
>>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
>>> (single-threaded) passes OK. But the next one hangs
>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
>>> it uses a fork join pool with size=4.
>>>
>>> and when typing "log warn", we see:
>>>
>>> "log warn"
>>>
>>> 2015.05.14 13:56:10 ERROR - Bundle:
>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>>> java.util.ConcurrentModificationException
>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>>>         at
>>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>>>         at
>>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>>>         at
>>> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>>>         at
>>> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>>>         at
>>> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>>>         at
>>> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>>>         at
>>> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>>>         at
>>> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>>>         at
>>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>>>         at
>>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>>>         at
>>> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>>>         at
>>> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>>>         at
>>> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>>>         at
>>> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>>>         at
>>> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>>>         at
>>> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>>>         at
>>> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>>>         at
>>> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>>>         at
>>> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>>>         at
>>> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>>>         at
>>> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>>>         at
>>> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>>>         at
>>> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>>>
>>>
>>> (I will investigate also in my code to check if the problem does not come
>>> from me ?)
>>>
>>> cheers;
>>> /Pierre
>>>
>>>
>>> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pi...@gmail.com>
>>> wrote:
>>>
>>>> Hi David,
>>>>
>>>> I don't know if it's me (a bug in my benchmark tool) or if if there is a
>>>> regression somewhere in the framework, by my parallel test does not pass
>>>> anymore.
>>>>
>>>> The test first starts with a single-threaded scenario, which passes OK
>>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
>>>> the parallel test starts
>>>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>>>> it suddenly hangs, and when I type "log warn" under the gogo shell, I see
>>>> the following exception:
>>>>
>>>> (I'm using java8):
>>>>
>>>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>>>> ____________________________
>>>> Welcome to Apache Felix Gogo
>>>>
>>>> Benchmarking bundle:
>>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>>>>
>>>> (here, the dependencymanager.parallel test hangs and when I type "log
>>>> warn", I see this:)
>>>>
>>>> g! log warn
>>>> 2015.05.14 13:31:03 ERROR - Bundle:
>>>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>>>> [ForkJoinPool-1-worker-3] Error processing tasks -
>>>> java.util.ConcurrentModificationException
>>>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>>>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>>>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>>>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>>>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>>>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>>>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>>>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>>>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>>>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>>>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>>>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>>>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>>>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>>>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>>>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>>>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>>>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>>>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>>>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>>>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>>>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>>>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>>>>
>>>> (If I configure my threadpool to 1, I have no problems, but with
>>>> threadpool=4, then I have the problem)
>>>>
>>>> I will investigate, but ideally it would be helpful if you could also run
>>>> the test yourself, so I will soon commit something to my sandbox that
>>>> reproduces the problem.
>>>>
>>>> cheers;
>>>> /Pierre
>>>>
>>>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>>>> david.bosschaert@gmail.com> wrote:
>>>>
>>>>> I've committed this now in
>>>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>>>>>
>>>>> Curious to see what others are measuring. My tests were focused on
>>>>> multiple bundles/threads obtaining the same service, as that's where I
>>>>> saw a bit of contention.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> David
>>>>>
>>>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com> wrote:
>>>>> > Hi David,
>>>>> >
>>>>> > I'm looking forward to testing your improvements using the dependencymanager
>>>>> > benchmark tool ([1]).
>>>>> >
>>>>> > [1] http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>>>>> >
>>>>> > /Pierre
>>>>> >
>>>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>>>>> > david.bosschaert@gmail.com> wrote:
>>>>> >
>>>>> >> I have implemented the performance improvements that I was thinking of
>>>>> >> using Java 5 concurrency tools, they can be viewed at [1].
>>>>> >>
>>>>> >> I wrote a little performance test suite [2] that tests multithreaded
>>>>> >> service registry performance (10 threads) from single / multiple
>>>>> >> bundles with either singleton services or Prototype Service Factory
>>>>> >> services and the results are quite impressive. I'm getting performance
>>>>> >> improvements compared to the current trunk from 8 times better than
>>>>> >> the original (800%) to more than 30 times better (3000%).
>>>>> >>
>>>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>>>>> >> planning to commit it to Felix tomorrow if nobody objects.
>>>>> >>
>>>>> >> Cheers,
>>>>> >>
>>>>> >> David
>>>>> >>
>>>>> >> [1] https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>>>>> >> [2] https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>>>>> >>
>>>>> >> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org> wrote:
>>>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>>>>> >> >>
>>>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org> wrote:
>>>>> >> >>>
>>>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>>>>> >> >>>>
>>>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure if it
>>>>> >> >>>> can be the culprit though.
>>>>> >> >>>> Interrupts could also be caused by a bundle being shut down while one of its
>>>>> >> >>>> threads is waiting for a service, which should be a valid use case imho.
>>>>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
>>>>> >> >>>> good.
>>>>> >> >>>
>>>>> >> >>>
>>>>> >> >>> Yes, threads can be interrupted if they are holding a bundle lock and the
>>>>> >> >>> global lock holder needs the bundle lock.
>>>>> >> >>>
>>>>> >> >>> I admit that I do not recall why we ignore the interrupt here, but didn't we
>>>>> >> >>> implement service lookup so that a bundle lock wasn't necessary? I thought
>>>>> >> >>> we just checked for the validity of the bundle context before returning or
>>>>> >> >>> something. Perhaps we felt there was no reason to be interrupted in that
>>>>> >> >>> case. I really don't know.
>>>>> >> >>
>>>>> >> >> I think that the Service Registry could be rewritten to be completely
>>>>> >> >> free of synchronized blocks using the Java 5 concurrency libraries,
>>>>> >> >
>>>>> >> >
>>>>> >> > Well, that just moves the sync blocks to the library, but yeah sure.
>>>>> >> >
>>>>> >> >> which I think would really be a better approach. There is too much
>>>>> >> >> locking going on in the current SR implementation IMHO.
>>>>> >> >
>>>>> >> >
>>>>> >> > I don't really think there is too much, but it is complicated.
>>>>> >> > Unfortunately, it is complicated to make sure that locks aren't held while
>>>>> >> > doing service lookups, and this is complicated because you can run into cycles,
>>>>> >> > etc.
>>>>> >> >
>>>>> >> > But feel free to try to simplify it.
>>>>> >> >
>>>>> >> >>
>>>>> >> >> This brings up the question: can we move to Java 5 (or Java 6) for the
>>>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
>>>>> >> >> I would be surprised if there is anyone who still needs a JDK that
>>>>> >> >> went end-of-life 7 years ago.
>>>>> >> >
>>>>> >> >
>>>>> >> > At this point, it doesn't really matter to me.
>>>>> >> >
>>>>> >> > -> richard
>>>>> >> >
>>>>> >> >>
>>>>> >> >> Best regards,
>>>>> >> >>
>>>>> >> >> David
>>>>> >> >
>>>>> >> >
>>>>> >>
>>>>>
>>>>
>>>>
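
The idea quoted above, a registry without synchronized blocks built on the
Java 5 concurrency classes, can be sketched as follows. This is a minimal
illustration under stated assumptions, not the Felix ServiceRegistry and not
the committed change: TinyRegistry, its methods, and the use of plain
class-name strings as keys are all hypothetical. It relies only on
ConcurrentHashMap.putIfAbsent and CopyOnWriteArrayList, both available since
Java 5, and uses 10 threads only to echo the thread count mentioned for the
performance test.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical illustration only; not the Felix ServiceRegistry. */
final class TinyRegistry {
    // Class name -> services registered under that name.
    private final ConcurrentMap<String, List<Object>> byClass =
            new ConcurrentHashMap<String, List<Object>>();

    void register(String className, Object service) {
        List<Object> services = byClass.get(className);
        if (services == null) {
            List<Object> fresh = new CopyOnWriteArrayList<Object>();
            // putIfAbsent returns the list another thread installed first, if any.
            List<Object> raced = byClass.putIfAbsent(className, fresh);
            services = (raced != null) ? raced : fresh;
        }
        services.add(service);
    }

    boolean unregister(String className, Object service) {
        List<Object> services = byClass.get(className);
        return services != null && services.remove(service);
    }

    /** Read-only view; iterating it works on a snapshot and never blocks. */
    List<Object> lookup(String className) {
        List<Object> services = byClass.get(className);
        if (services == null) {
            return Collections.<Object>emptyList();
        }
        return Collections.unmodifiableList(services);
    }

    /** Registers perThread services from each thread and returns the total found. */
    static int registerFromThreads(int threads, final int perThread) {
        final TinyRegistry registry = new TinyRegistry();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        registry.register("example.Service", new Object());
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return registry.lookup("example.Service").size();
    }

    public static void main(String[] args) {
        System.out.println(registerFromThreads(10, 100));
    }
}
```

As Richard's quoted reply points out, this moves the locking into the library
rather than removing it: CopyOnWriteArrayList still serializes writers
internally. What the sketch gains is that lookups never wait on a lock.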

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
I think I know what this is. I had some additional changes exactly in
this area that I simply forgot to apply this morning. I should have it
fixed sometime today.

Cheers,

David
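
For context on the trace quoted below: a ConcurrentModificationException
raised from HashMap$KeyIterator.next() inside AbstractCollection.addAll() is
what a plain java.util.HashMap does when its key set is iterated while the map
is structurally modified. The check is best effort, which would be consistent
with the report that a pool of one thread passes and a pool of four fails. The
sketch below is an illustration only, not the CapabilitySet code and not the
eventual fix: CmeSketch and its method are hypothetical names, and the
modification is made from the same thread so that the outcome is
deterministic.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration; not taken from the Felix code base. */
final class CmeSketch {

    /**
     * Copies the map's keys while the map gains an entry mid-scan.
     * Returns true if the scan failed with ConcurrentModificationException.
     */
    static boolean scanFailsWhenModified(Map<String, Integer> index) {
        index.put("a", 1);
        index.put("b", 2);
        index.put("c", 3);
        List<String> copy = new ArrayList<String>();
        try {
            for (String key : index.keySet()) {
                copy.add(key);
                // Stands in for another thread registering a service during the scan.
                index.put("late", 0);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap fails: "
                + scanFailsWhenModified(new HashMap<String, Integer>()));
        System.out.println("ConcurrentHashMap fails: "
                + scanFailsWhenModified(new ConcurrentHashMap<String, Integer>()));
    }
}
```

ConcurrentHashMap iterators are weakly consistent, so a scan may or may not
see an entry added while it runs. Whether that is acceptable for a service
lookup is a design decision this sketch does not settle.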

On 14 May 2015 at 14:03, David Bosschaert <da...@gmail.com> wrote:
> Hi Pierre,
>
> I'll take a look today.
>
> Cheers,
>
> David
>
> On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com> wrote:
>> I just committed the benchmark tool in
>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
>> take a look.
>>
>> To run the scenario:
>>
>> - install jdk8:
>>
>> [nxuser@nx0012 pderop]$ java -version
>> java version "1.8.0_40"
>> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
>> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>>
>> - checkout the loadtest from
>> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>>
>> - go to the "loadtest" directory and start the test, like this:
>>
>> $ java -server -jar bin/felix.jar
>> Welcome to Apache Felix Gogo
>>
>> g! Starting benchmarks (each tested bundle will add/remove 630 components
>> during bundle activation).
>>
>>         [Starting benchmarks with no processing done in components start
>> methods]
>>
>> Benchmarking bundle:
>> org.apache.felix.dependencymanager.benchmark.dependencymanager
>> ..................................................
>> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
>> | 919,838,078]
>>
>> Benchmarking bundle:
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>>
>>
>> Here, the first
>> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
>> (single-threaded) passes OK. But the next one hangs
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
>> It uses a fork-join pool with size=4.
>>
>> and when typing "log warn", we see:
>>
>> "log warn"
>>
>> 2015.05.14 13:56:10 ERROR - Bundle:
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> java.util.ConcurrentModificationException
>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>>
>>
>> (I will also investigate my own code to check whether the problem comes
>> from me.)
>>
>> cheers;
>> /Pierre

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
Hi Pierre,

I'll take a look today.

Cheers,

David

On 14 May 2015 at 14:00, Pierre De Rop <pi...@gmail.com> wrote:
> I just committed the benchmark tool in
> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
> take a look.
>
> To run the scenario:
>
> - install jdk8:
>
> [nxuser@nx0012 pderop]$ java -version
> java version "1.8.0_40"
> Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
> Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
>
> - checkout the loadtest from
> http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/
>
> - go to the "loadtest" directory and start the test, like this:
>
> $ java -server -jar bin/felix.jar
> Welcome to Apache Felix Gogo
>
> g! Starting benchmarks (each tested bundle will add/remove 630 components
> during bundle activation).
>
>         [Starting benchmarks with no processing done in components start
> methods]
>
> Benchmarking bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager
> ..................................................
> -> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
> | 919,838,078]
>
> Benchmarking bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>
>
> Here, the first
> "org.apache.felix.dependencymanager.benchmark.dependencymanager" test
> (single-threaded) passes OK. But the next one hangs
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel).
> It uses a fork-join pool with size=4.
>
> and when typing "log warn", we see:
>
> "log warn"
>
> 2015.05.14 13:56:10 ERROR - Bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> [ForkJoinPool-1-worker-3] Error processing tasks -
> java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>
>
> (I will also investigate my own code to check whether the problem comes
> from me.)
>
> cheers;
> /Pierre
>
>
> On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pi...@gmail.com>
> wrote:
>
>> Hi David,
>>
>> I don't know if it's me (a bug in my benchmark tool) or if there is a
>> regression somewhere in the framework, but my parallel test does not pass
>> anymore.
>>
>> The test first starts with a single-threaded scenario, which passes OK
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
>> the parallel test starts
>> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
>> it suddenly hangs, and when I type "log warn" under the gogo shell, I see
>> the following exception:
>>
>> (I'm using java8):
>>
>> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
>> ____________________________
>> Welcome to Apache Felix Gogo
>>
>> Benchmarking bundle:
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>>
>> (here, the dependencymanager.parallel test hangs and when I type "log
>> warn", I see this:)
>>
>> g! log warn
>> 2015.05.14 13:31:03 ERROR - Bundle:
>> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
>> [ForkJoinPool-1-worker-3] Error processing tasks -
>> java.util.ConcurrentModificationException
>>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>>         at org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>>         at org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>>         at org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>>         at org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>>         at org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>>         at org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>>         at org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>>         at org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>>         at org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>>         at org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>>         at org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>>         at org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>>         at org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>>         at org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>>         at org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>>         at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>>         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>>         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>>         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>>
>> (If I configure my threadpool to 1, I have no problems, but with
>> threadpool=4, then I have the problem)
>>
>> I will investigate, but ideally it would be helpful if you could also run
>> the test yourself, so I will soon commit something to my sandbox that
>> reproduces the problem.
>>
>> cheers;
>> /Pierre
>>
>> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
>> david.bosschaert@gmail.com> wrote:
>>
>>> I've committed this now in
>>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>>>
>>> Curious to see what others are measuring. My tests were focused on
>>> multiple bundles/threads obtaining the same service, as that's where I
>>> saw a bit of contention.
>>>
>>> Cheers,
>>>
>>> David
>>>
>>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com> wrote:
>>> > Hi David,
>>> >
>>> > I'm looking forward to testing your improvements using the dependencymanager
>>> > benchmark tool ([1]).
>>> >
>>> > [1] http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>>> >
>>> > /Pierre
>>> >
>>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>>> > david.bosschaert@gmail.com> wrote:
>>> >
>>> >> I have implemented the performance improvements that I was thinking of
>>> >> using Java 5 concurrency tools, they can be viewed at [1].
>>> >>
>>> >> I wrote a little performance test suite [2] that tests multithreaded
>>> >> service registry performance (10 threads) from single / multiple
>>> >> bundles with either singleton services or Prototype Service Factory
>>> >> services and the results are quite impressive. I'm getting performance
>>> >> improvements compared to the current trunk from 8 times better than
>>> >> the original (800%) to more than 30 times better (3000%).
>>> >>
>>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>>> >> planning to commit it to Felix tomorrow if nobody objects.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> David
>>> >>
>>> >> [1] https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>>> >> [2] https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>>> >>
>>> >> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org> wrote:
>>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>>> >> >>
>>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org> wrote:
>>> >> >>>
>>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>>> >> >>>>
>>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure if it
>>> >> >>>> can be the culprit though.
>>> >> >>>> Interrupts could also be caused by a bundle being shut down while one of its
>>> >> >>>> threads is waiting for a service, which should be a valid use case imho.
>>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
>>> >> >>>> good.
>>> >> >>>
>>> >> >>>
>>> >> >>> Yes, threads can be interrupted if they are holding a bundle lock and the
>>> >> >>> global lock holder needs the bundle lock.
>>> >> >>>
>>> >> >>> I admit that I do not recall why we ignore the interrupt here, but didn't we
>>> >> >>> implement service lookup so that a bundle lock wasn't necessary? I thought
>>> >> >>> we just checked for the validity of the bundle context before returning or
>>> >> >>> something. Perhaps we felt there was no reason to be interrupted in that
>>> >> >>> case. I really don't know.
>>> >> >>
>>> >> >> I think that the Service Registry could be rewritten to be completely
>>> >> >> free of synchronized blocks using the Java 5 concurrency libraries,
>>> >> >
>>> >> >
>>> >> > Well, that just moves the sync blocks to the library, but yeah sure.
>>> >> >
>>> >> >> which I think would really be a better approach. There is too much
>>> >> >> locking going on in the current SR implementation IMHO.
>>> >> >
>>> >> >
>>> >> > I don't really think there is too much, but it is complicated.
>>> >> > Unfortunately, it is complicated to make sure that locks aren't held while
>>> >> > doing service lookups, and this is complicated because you can run into cycles,
>>> >> > etc.
>>> >> >
>>> >> > But feel free to try to simplify it.
>>> >> >
>>> >> >>
>>> >> >> This brings up the question: can we move to Java 5 (or Java 6) for the
>>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
>>> >> >> I would be surprised if there is anyone who still needs a JDK that
>>> >> >> went end-of-life 7 years ago.
>>> >> >
>>> >> >
>>> >> > At this point, it doesn't really matter to me.
>>> >> >
>>> >> > -> richard
>>> >> >
>>> >> >>
>>> >> >> Best regards,
>>> >> >>
>>> >> >> David
>>> >> >
>>> >> >
>>> >>
>>>
>>
>>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
I just committed the benchmark tool in
http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/, if you can
take a look.

To run the scenario:

- install jdk8:

[nxuser@nx0012 pderop]$ java -version
java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)

- check out the loadtest from
http://svn.apache.org/viewvc/felix/sandbox/pderop/loadtest/

- go to the "loadtest" directory and start the test, just like this:

$ java -server -jar bin/felix.jar
Welcome to Apache Felix Gogo

g! Starting benchmarks (each tested bundle will add/remove 630 components
during bundle activation).

        [Starting benchmarks with no processing done in components start
methods]

Benchmarking bundle:
org.apache.felix.dependencymanager.benchmark.dependencymanager
..................................................
-> results in nanos: [139,129,744 | 143,957,687 | 152,157,581 | 319,631,722
| 919,838,078]

Benchmarking bundle:
org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .


Here, the first test
(org.apache.felix.dependencymanager.benchmark.dependencymanager,
single-threaded) passes OK. But the next one
(org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
hangs. It uses a fork-join pool with size=4.

When typing "log warn", we see:

"log warn"

2015.05.14 13:56:10 ERROR - Bundle:
org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
[ForkJoinPool-1-worker-3] Error processing tasks -
java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
        at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
        at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
        at
org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
        at
org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
        at
org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
        at
org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
        at
org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
        at
org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
        at
org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
        at
org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
        at
org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
        at
org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
        at
org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
        at
org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
        at
org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
        at
org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
        at
org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
        at
org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
        at
org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
        at
org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
        at
org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
        at
java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
        at
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)


(I will also investigate my own code to check whether the problem comes
from me.)
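
For reference, the exception in the trace above is raised by the fail-fast
iterator of java.util.HashMap, which throws as soon as it notices that the
map was structurally modified after the iterator was created. The sketch
below reproduces that behaviour deterministically in a single thread, with
the extra add() call standing in for a second thread, and contrasts it with
the weakly consistent iterator of ConcurrentHashMap.newKeySet() (Java 8).
The class and method names (FailFastDemo, throwsOnModification) are made up
for illustration and are not Felix code, and whether CapabilitySet is really
modified concurrently here is only an assumption drawn from the trace.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not part of Felix.
class FailFastDemo {

    // Returns true if iterating 'set' while adding to it throws
    // ConcurrentModificationException.
    static boolean throwsOnModification(Set<String> set) {
        set.add("a");
        set.add("b");
        set.add("c");
        try {
            Iterator<String> it = set.iterator();
            it.next();
            // Structural modification after the iterator was created;
            // in the benchmark this would come from another worker thread.
            set.add("d");
            // A fail-fast iterator checks its modification count here.
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // HashSet is backed by HashMap, so it uses HashMap$KeyIterator.
        System.out.println(throwsOnModification(new HashSet<String>()));
        // The concurrent key set tolerates modification during iteration.
        System.out.println(
            throwsOnModification(ConcurrentHashMap.<String>newKeySet()));
    }
}
```

With real threads the failure is timing dependent, which would fit a test
that only breaks once more than one worker is active.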

cheers;
/Pierre


On Thu, May 14, 2015 at 1:47 PM, Pierre De Rop <pi...@gmail.com>
wrote:

> Hi David,
>
> I don't know if it's me (a bug in my benchmark tool) or if there is a
> regression somewhere in the framework, but my parallel test does not pass
> anymore.
>
> The test first starts with a single-threaded scenario, which passes OK
> (org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
> the parallel test starts
> (org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
> it suddenly hangs, and when I type "log warn" under the gogo shell, I see
> the following exception:
>
> (I'm using java8):
>
> $ java -server -Xmx4g -Xms4g -jar bin/felix.jar
> ____________________________
> Welcome to Apache Felix Gogo
>
> Benchmarking bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .
>
> (here, the dependencymanager.parallel test hangs and when I type "log
> warn", I see this:)
>
> g! log warn
> 2015.05.14 13:31:03 ERROR - Bundle:
> org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
> [ForkJoinPool-1-worker-3] Error processing tasks -
> java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>         at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
>         at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
>         at
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
>         at
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
>         at
> org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
>         at
> org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
>         at
> org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
>         at
> org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
>         at
> org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
>         at
> org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
>         at
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
>         at
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
>         at
> org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
>         at
> org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
>         at
> org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
>         at
> org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
>         at
> org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
>         at
> org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
>         at
> org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
>         at
> org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
>         at
> org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
>         at
> java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
>         at
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
>         at
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
>         at
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
>
> (If I set my thread pool size to 1, I have no problems; with a size of 4,
> the problem appears.)
>
> I will investigate, but ideally it would be helpful if you could also run
> the test yourself, so I will soon commit something to my sandbox to
> reproduce the problem.
>
> cheers;
> /Pierre
>
> On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
> david.bosschaert@gmail.com> wrote:
>
>> I've committed this now in
>> http://svn.apache.org/viewvc?view=revision&revision=1679327
>>
>> Curious to see what others are measuring. My tests were focused on
>> multiple bundles/threads obtaining the same service, as that's where I
>> saw a bit of contention.
>>
>> Cheers,
>>
>> David
>>
>> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com> wrote:
>> > Hi David,
>> >
>> > I'm looking forward to testing your improvements using the
>> dependencymanager
>> > benchmark tool ([1]).
>> >
>> >
>> > [1]
>> >
>> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>> >
>> > /Pierre
>> >
>> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
>> > david.bosschaert@gmail.com> wrote:
>> >
>> >> I have implemented the performance improvements that I was thinking of
>> >> using Java 5 concurrency tools, they can be viewed at [1].
>> >>
>> >> I wrote a little performance test suite [2] that tests multithreaded
>> >> service registry performance (10 threads) from single / multiple
>> >> bundles with either singleton services and Prototype Service Factory
>> >> services and the results are quite impressive. I'm getting performance
>> >> improvements compared to the current trunk from 8 times better than
>> >> the original (800%) to more than 30 times better (3000%).
>> >>
>> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
>> >> planning to commit it to Felix tomorrow if nobody objects.
>> >>
>> >> Cheers,
>> >>
>> >> David
>> >>
>> >> [1]
>> >>
>> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> >> [2]
>> >>
>> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>> >>
>> >> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org>
>> wrote:
>> >> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >> >>
>> >> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org>
>> >> wrote:
>> >> >>>
>> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >> >>>>
>> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not
>> sure
>> >> if
>> >> >>>> it
>> >> >>>> can be the culprit though.
>> >> >>>> Interrupts could also be caused by a bundle being shutdown while
>> one
>> >> of
>> >> >>>> its
>> >> >>>> thread is waiting for a service, which should is a valid use case
>> >> imho.
>> >> >>>> Anyway, I think sanely reacting to a thread being interrupted
>> would be
>> >> >>>> good.
>> >> >>>
>> >> >>>
>> >> >>> Yes, threads can be interrupted if they are holding a bundle lock
>> and
>> >> the
>> >> >>> global lock holder needs the bundle lock.
>> >> >>>
>> >> >>> I admit that I do not recall why we ignore the interrupt here, but
>> >> didn't
>> >> >>> we
>> >> >>> implement service lookup so that a bundle lock wasn't necessary? I
>> >> >>> thought
>> >> >>> we just checked for the validity of the bundle context before
>> returning
>> >> >>> or
>> >> >>> something. Perhaps we felt there was no reason to be interrupted in
>> >> that
>> >> >>> case. I really don't know.
>> >> >>
>> >> >> I think that the Service Registry could be rewritten to be
>> completely
>> >> >> free of synchronized blocks using the Java 5 concurrency libraries,
>> >> >
>> >> >
>> >> > Well, that just moves the sync blocks to the library, but yeah sure.
>> >> >
>> >> >> which I think would really be a better approach. There is too much
>> >> >> locking going on in the current SR implementation IMHO.
>> >> >
>> >> >
>> >> > I don't really think there is too much, but it is complicated.
>> >> > Unfortunately, it is complicated to make sure that locks aren't held
>> >> while
>> >> > do service lookups and this is complicated because you can run into
>> >> cycles,
>> >> > etc.
>> >> >
>> >> > But feel free to try to simplify it.
>> >> >
>> >> >>
>> >> >> This brings the question: can we move to Java 5 (or Java 6) for the
>> >> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible
>> but
>> >> >> I would be surprised if there is anyone who still needs a JDK that
>> >> >> went end-of-life 7 years ago.
>> >> >
>> >> >
>> >> > At this point, it doesn't really matter to me.
>> >> >
>> >> > -> richard
>> >> >
>> >> >>
>> >> >> Best regards,
>> >> >>
>> >> >> David
>> >> >
>> >> >
>> >>
>>
>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
Hi David,

I don't know if it's me (a bug in my benchmark tool) or if there is a
regression somewhere in the framework, but my parallel test does not pass
anymore.

The test first starts with a single-threaded scenario, which passes OK
(org.apache.felix.dependencymanager.benchmark.dependencymanager), then when
the parallel test starts
(org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel)
it suddenly hangs, and when I type "log warn" under the gogo shell, I see
the following exception:

(I'm using java8):

$ java -server -Xmx4g -Xms4g -jar bin/felix.jar
____________________________
Welcome to Apache Felix Gogo

Benchmarking bundle:
org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel .

(here, the dependencymanager.parallel test hangs and when I type "log
warn", I see this:)

g! log warn
2015.05.14 13:31:03 ERROR - Bundle:
org.apache.felix.dependencymanager.benchmark.dependencymanager.parallel -
[ForkJoinPool-1-worker-3] Error processing tasks -
java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
        at java.util.HashMap$KeyIterator.next(HashMap.java:1453)
        at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
        at
org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:245)
        at
org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:212)
        at
org.apache.felix.framework.capabilityset.CapabilitySet.match(CapabilitySet.java:189)
        at
org.apache.felix.framework.ServiceRegistry.getServiceReferences(ServiceRegistry.java:269)
        at
org.apache.felix.framework.Felix.getServiceReferences(Felix.java:3577)
        at
org.apache.felix.framework.Felix.getAllowedServiceReferences(Felix.java:3655)
        at
org.apache.felix.framework.BundleContextImpl.getServiceReferences(BundleContextImpl.java:434)
        at
org.apache.felix.dm.tracker.ServiceTracker.getInitialReferences(ServiceTracker.java:422)
        at
org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:375)
        at
org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:319)
        at
org.apache.felix.dm.tracker.ServiceTracker.open(ServiceTracker.java:295)
        at
org.apache.felix.dm.impl.ServiceDependencyImpl.start(ServiceDependencyImpl.java:226)
        at
org.apache.felix.dm.impl.ComponentImpl.startDependencies(ComponentImpl.java:657)
        at
org.apache.felix.dm.impl.ComponentImpl.performTransition(ComponentImpl.java:535)
        at
org.apache.felix.dm.impl.ComponentImpl.handleChange(ComponentImpl.java:492)
        at
org.apache.felix.dm.impl.ComponentImpl.access$5(ComponentImpl.java:482)
        at
org.apache.felix.dm.impl.ComponentImpl$3.run(ComponentImpl.java:227)
        at
org.apache.felix.dm.impl.DispatchExecutor.runTask(DispatchExecutor.java:182)
        at
org.apache.felix.dm.impl.DispatchExecutor.run(DispatchExecutor.java:165)
        at
java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1402)
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
        at
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
        at
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1689)
        at
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)

(If I set my thread pool size to 1, I have no problems; with a size of 4,
the problem appears.)
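
A failure that disappears with one worker and shows up with four is what one
would expect from unsynchronized access to a shared collection, because with
a single worker reads and writes never overlap. One generic way to make such
access safe is sketched below: a plain HashSet guarded by a
ReentrantReadWriteLock, with readers copying the set under the read lock. All
names here (GuardedIndex, add, snapshot, stress) are hypothetical, and this
is not a description of how CapabilitySet or ServiceRegistry is implemented.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical example class, not part of Felix.
class GuardedIndex {
    private final Set<String> entries = new HashSet<String>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive lock.
    void add(String entry) {
        lock.writeLock().lock();
        try {
            entries.add(entry);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers copy under the shared lock, so the copy constructor never
    // iterates while a writer is modifying the set.
    Set<String> snapshot() {
        lock.readLock().lock();
        try {
            return new HashSet<String>(entries);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Runs 'workers' tasks on a fork-join pool of the same size. Returns the
    // final number of entries; throws if a worker failed or timed out.
    static int stress(int workers, int opsPerWorker) {
        GuardedIndex index = new GuardedIndex();
        AtomicInteger failures = new AtomicInteger();
        ForkJoinPool pool = new ForkJoinPool(workers);
        for (int w = 0; w < workers; w++) {
            final int id = w;
            pool.execute(() -> {
                try {
                    for (int i = 0; i < opsPerWorker; i++) {
                        index.add(id + ":" + i);
                        index.snapshot();
                    }
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (failures.get() > 0) {
            throw new IllegalStateException(failures.get() + " workers failed");
        }
        return index.snapshot().size();
    }

    public static void main(String[] args) {
        System.out.println(stress(4, 500));
    }
}
```

Copying under the read lock trades some allocation for never exposing the
live set to callers; a concurrent collection would be another option.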

I will investigate, but ideally it would be helpful if you could also run
the test yourself, so I will soon commit something to my sandbox to
reproduce the problem.

cheers;
/Pierre

On Thu, May 14, 2015 at 11:11 AM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> I've committed this now in
> http://svn.apache.org/viewvc?view=revision&revision=1679327
>
> Curious to see what others are measuring. My tests were focused on
> multiple bundles/threads obtaining the same service, as that's where I
> saw a bit of contention.
>
> Cheers,
>
> David
>
> On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com> wrote:
> > Hi David,
> >
> > I'm looking forward to testing your improvements using the dependencymanager
> > benchmark tool ([1]).
> >
> >
> > [1]
> >
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
> >
> > /Pierre
> >
> > On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> > david.bosschaert@gmail.com> wrote:
> >
> >> I have implemented the performance improvements that I was thinking of
> >> using Java 5 concurrency tools, they can be viewed at [1].
> >>
> >> I wrote a little performance test suite [2] that tests multithreaded
> >> service registry performance (10 threads) from single / multiple
> >> bundles with either singleton services and Prototype Service Factory
> >> services and the results are quite impressive. I'm getting performance
> >> improvements compared to the current trunk from 8 times better than
> >> the original (800%) to more than 30 times better (3000%).
> >>
> >> Carsten has already reviewed the code (thanks Carsten!) and I'm
> >> planning to commit it to Felix tomorrow if nobody objects.
> >>
> >> Cheers,
> >>
> >> David
> >>
> >> [1]
> >>
> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> >> [2]
> >>
> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
> >>
> >> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org>
> wrote:
> >> > On 3/23/15 10:17 , David Bosschaert wrote:
> >> >>
> >> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org>
> >> wrote:
> >> >>>
> >> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >> >>>>
> >> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not
> sure
> >> if
> >> >>>> it
> >> >>>> can be the culprit though.
> >> >>>> Interrupts could also be caused by a bundle being shutdown while
> one
> >> of
> >> >>>> its
> >> >>>> thread is waiting for a service, which should is a valid use case
> >> imho.
> >> >>>> Anyway, I think sanely reacting to a thread being interrupted
> would be
> >> >>>> good.
> >> >>>
> >> >>>
> >> >>> Yes, threads can be interrupted if they are holding a bundle lock
> and
> >> the
> >> >>> global lock holder needs the bundle lock.
> >> >>>
> >> >>> I admit that I do not recall why we ignore the interrupt here, but
> >> didn't
> >> >>> we
> >> >>> implement service lookup so that a bundle lock wasn't necessary? I
> >> >>> thought
> >> >>> we just checked for the validity of the bundle context before
> returning
> >> >>> or
> >> >>> something. Perhaps we felt there was no reason to be interrupted in
> >> that
> >> >>> case. I really don't know.
> >> >>
> >> >> I think that the Service Registry could be rewritten to be completely
> >> >> free of synchronized blocks using the Java 5 concurrency libraries,
> >> >
> >> >
> >> > Well, that just moves the sync blocks to the library, but yeah sure.
> >> >
> >> >> which I think would really be a better approach. There is too much
> >> >> locking going on in the current SR implementation IMHO.
> >> >
> >> >
> >> > I don't really think there is too much, but it is complicated.
> >> > Unfortunately, it is complicated to make sure that locks aren't held
> >> while
> >> > do service lookups and this is complicated because you can run into
> >> cycles,
> >> > etc.
> >> >
> >> > But feel free to try to simplify it.
> >> >
> >> >>
> >> >> This brings the question: can we move to Java 5 (or Java 6) for the
> >> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible
> but
> >> >> I would be surprised if there is anyone who still needs a JDK that
> >> >> went end-of-life 7 years ago.
> >> >
> >> >
> >> > At this point, it doesn't really matter to me.
> >> >
> >> > -> richard
> >> >
> >> >>
> >> >> Best regards,
> >> >>
> >> >> David
> >> >
> >> >
> >>
>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by David Bosschaert <da...@gmail.com>.
I've committed this now in
http://svn.apache.org/viewvc?view=revision&revision=1679327

Curious to see what others are measuring. My tests were focused on
multiple bundles/threads obtaining the same service, as that's where I
saw a bit of contention.
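
To illustrate the kind of measurement meant here, the following is a minimal
sketch of 10 threads looking up the same entry, once through a lookup that
synchronizes on a shared HashMap and once through a ConcurrentHashMap. It is
not the srperf suite and does not use the Felix ServiceRegistry; the names
LookupContention, Registry and countHits are invented for the example, and
without warm-up or a harness such as JMH the timings it prints are only a
rough indication.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class, not part of Felix or the srperf suite.
class LookupContention {

    // Stand-in for "obtain a service by name".
    interface Registry {
        Object get(String name);
    }

    // Starts 'threads' threads that each perform 'lookupsPerThread' lookups
    // of the same key and returns the total number of non-null results.
    static long countHits(Registry registry, int threads, int lookupsPerThread) {
        AtomicLong hits = new AtomicLong();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads at the same time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                long local = 0;
                for (int n = 0; n < lookupsPerThread; n++) {
                    if (registry.get("svc") != null) {
                        local++;
                    }
                }
                hits.addAndGet(local);
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.get();
    }

    public static void main(String[] args) {
        Map<String, Object> plain = new HashMap<String, Object>();
        plain.put("svc", new Object());
        // Every lookup takes the same monitor, so threads queue up.
        Registry locked = name -> {
            synchronized (plain) {
                return plain.get(name);
            }
        };
        // Reads on ConcurrentHashMap generally do not block.
        Map<String, Object> concurrent =
            new ConcurrentHashMap<String, Object>(plain);
        Registry lockFree = concurrent::get;

        for (Registry r : new Registry[] { locked, lockFree }) {
            long t0 = System.nanoTime();
            long hits = countHits(r, 10, 1_000_000);
            long millis = (System.nanoTime() - t0) / 1_000_000;
            System.out.printf("%d hits in %d ms%n", hits, millis);
        }
    }
}
```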

Cheers,

David

On 13 May 2015 at 15:10, Pierre De Rop <pi...@gmail.com> wrote:
> Hi David,
>
> I'm looking forward to testing your improvements using the dependencymanager
> benchmark tool ([1]).
>
>
> [1]
> http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/
>
> /Pierre
>
> On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
> david.bosschaert@gmail.com> wrote:
>
>> I have implemented the performance improvements that I was thinking of
>> using Java 5 concurrency tools, they can be viewed at [1].
>>
>> I wrote a little performance test suite [2] that tests multithreaded
>> service registry performance (10 threads) from single / multiple
>> bundles with either singleton services and Prototype Service Factory
>> services and the results are quite impressive. I'm getting performance
>> improvements compared to the current trunk from 8 times better than
>> the original (800%) to more than 30 times better (3000%).
>>
>> Carsten has already reviewed the code (thanks Carsten!) and I'm
>> planning to commit it to Felix tomorrow if nobody objects.
>>
>> Cheers,
>>
>> David
>>
>> [1]
>> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
>> [2]
>> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>>
>> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org> wrote:
>> > On 3/23/15 10:17 , David Bosschaert wrote:
>> >>
>> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org>
>> wrote:
>> >>>
>> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
>> >>>>
>> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure
>> if
>> >>>> it
>> >>>> can be the culprit though.
>> >>>> Interrupts could also be caused by a bundle being shutdown while one
>> of
>> >>>> its
>> >>>> thread is waiting for a service, which should is a valid use case
>> imho.
>> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
>> >>>> good.
>> >>>
>> >>>
>> >>> Yes, threads can be interrupted if they are holding a bundle lock and
>> the
>> >>> global lock holder needs the bundle lock.
>> >>>
>> >>> I admit that I do not recall why we ignore the interrupt here, but
>> didn't
>> >>> we
>> >>> implement service lookup so that a bundle lock wasn't necessary? I
>> >>> thought
>> >>> we just checked for the validity of the bundle context before returning
>> >>> or
>> >>> something. Perhaps we felt there was no reason to be interrupted in
>> that
>> >>> case. I really don't know.
>> >>
>> >> I think that the Service Registry could be rewritten to be completely
>> >> free of synchronized blocks using the Java 5 concurrency libraries,
>> >
>> >
>> > Well, that just moves the sync blocks to the library, but yeah sure.
>> >
>> >> which I think would really be a better approach. There is too much
>> >> locking going on in the current SR implementation IMHO.
>> >
>> >
>> > I don't really think there is too much, but it is complicated.
>> > Unfortunately, it is complicated to make sure that locks aren't held
>> while
>> > do service lookups and this is complicated because you can run into
>> cycles,
>> > etc.
>> >
>> > But feel free to try to simplify it.
>> >
>> >>
>> >> This brings the question: can we move to Java 5 (or Java 6) for the
>> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
>> >> I would be surprised if there is anyone who still needs a JDK that
>> >> went end-of-life 7 years ago.
>> >
>> >
>> > At this point, it doesn't really matter to me.
>> >
>> > -> richard
>> >
>> >>
>> >> Best regards,
>> >>
>> >> David
>> >
>> >
>>

Re: Service Registry refactor using Java-5 concurrency libraries (Was Re: [Framework] ServiceRegistry.getService() endless loop with lock?)

Posted by Pierre De Rop <pi...@gmail.com>.
Hi David,

I'm looking forward to testing your improvements using the dependencymanager
benchmark tool ([1]).


[1]
http://svn.apache.org/viewvc/felix/trunk/dependencymanager/org.apache.felix.dependencymanager.benchmark/

/Pierre

On Wed, May 13, 2015 at 3:02 PM, David Bosschaert <
david.bosschaert@gmail.com> wrote:

> I have implemented the performance improvements that I was thinking of
> using Java 5 concurrency tools, they can be viewed at [1].
>
> I wrote a little performance test suite [2] that tests multithreaded
> service registry performance (10 threads) from single / multiple
> bundles with either singleton services and Prototype Service Factory
> services and the results are quite impressive. I'm getting performance
> improvements compared to the current trunk from 8 times better than
> the original (800%) to more than 30 times better (3000%).
>
> Carsten has already reviewed the code (thanks Carsten!) and I'm
> planning to commit it to Felix tomorrow if nobody objects.
>
> Cheers,
>
> David
>
> [1]
> https://github.com/bosschaert/felix/commit/e6a1b06c6e66d9c98e6d81b91ef7003c8e725450
> [2]
> https://github.com/bosschaert/coderthoughts/tree/master/service-registry-perftest/srperf
>
> On 23 March 2015 at 15:39, Richard S. Hall <he...@ungoverned.org> wrote:
> > On 3/23/15 10:17 , David Bosschaert wrote:
> >>
> >> On 23 March 2015 at 13:39, Richard S. Hall <he...@ungoverned.org>
> wrote:
> >>>
> >>> On 3/23/15 03:55 , Guillaume Nodet wrote:
> >>>>
> >>>> There's a call to interrupt() in Felix#acquireBundleLock(), not sure
> if
> >>>> it
> >>>> can be the culprit though.
> >>>> Interrupts could also be caused by a bundle being shutdown while one
> of
> >>>> its
> >>>> thread is waiting for a service, which should is a valid use case
> imho.
> >>>> Anyway, I think sanely reacting to a thread being interrupted would be
> >>>> good.
> >>>
> >>>
> >>> Yes, threads can be interrupted if they are holding a bundle lock and
> the
> >>> global lock holder needs the bundle lock.
> >>>
> >>> I admit that I do not recall why we ignore the interrupt here, but
> didn't
> >>> we
> >>> implement service lookup so that a bundle lock wasn't necessary? I
> >>> thought
> >>> we just checked for the validity of the bundle context before returning
> >>> or
> >>> something. Perhaps we felt there was no reason to be interrupted in
> that
> >>> case. I really don't know.
> >>
> >> I think that the Service Registry could be rewritten to be completely
> >> free of synchronized blocks using the Java 5 concurrency libraries,
> >
> >
> > Well, that just moves the sync blocks to the library, but yeah sure.
> >
> >> which I think would really be a better approach. There is too much
> >> locking going on in the current SR implementation IMHO.
> >
> >
> > I don't really think there is too much, but it is complicated.
> > Unfortunately, it is complicated to make sure that locks aren't held
> while
> > do service lookups and this is complicated because you can run into
> cycles,
> > etc.
> >
> > But feel free to try to simplify it.
> >
> >>
> >> This brings the question: can we move to Java 5 (or Java 6) for the
> >> Framework codebase? AFAIK we're currently still JDK 1.4 compatible but
> >> I would be surprised if there is anyone who still needs a JDK that
> >> went end-of-life 7 years ago.
> >
> >
> > At this point, it doesn't really matter to me.
> >
> > -> richard
> >
> >>
> >> Best regards,
> >>
> >> David
> >
> >
>