You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@felix.apache.org by Daniel McGreal <da...@redbite.com> on 2013/12/29 16:57:55 UTC

Re: Felix and JamVM deadlock in ClassLoader

Hi again,
Shifting other project concerns to the top of the pile has meant that I have managed delay fixing this problem, which I’m beginning to understand.

There are two 
            synchronized (m_classLocks)
blocks in BundleWiringImpl.java at BundleClassLoader.loadClass(String,Boolean) and at findClass(String).

The synchronised block in loadClass calls findLoadedClass(name) which, in JamVM, is itself a synchronised method and blocks thread “A” inside, at the first line. There isn’t any further synchronicity inside this call.
Thread “B" blocks before entering the synchronised block in findClass.
There don’t appear to be any further threads involved, unless they’re hiding.

Can anyone suggest why thread “A” is unable to execute the synchronized findLoadedClass? Equinox and Knopflerfish do not suffer from this problem and do not seem to call findLoadedClass directly (though they do call java.lang.ClassLoader.findClass, which in JamVM calls findLoadedClass first).

I can guarantee not reaching deadlocking conditions if I make loadClass and findClass synchronised. Are there any potentially harmful side effects from doing this?

Many thanks for any help that can be offered me,
Dan.

On 26 Nov 2013, at 01:40, Daniel McGreal <da...@redbite.com> wrote:

> I removed all the cyclics that my tool was able to detect. All of which were Felix and KF jar importing packages that they themselves exported. 
> 
> Unfortunately, this hasn't had an effect on the blocking. So it's either not a cyclic dependency issue, or my tool is inadequate.
> 
> Specifically I removed "self-imports" from:
> Gogo runtime & command,
> Felix's SCR,
> Knopflerfish's prefs-all, log-api,
> 
> Many thanks for any further help you can provide,
> Dan.
> 
> On 26 Nov 2013, at 00:28, Daniel McGreal wrote:
> 
>> Hi Karl et al,
>> I've written a script which detects cyclic dependencies from the deployed jars manifests.
>> I'm finding rather a lot in Felix and Knopflerfish jars. For example, seemingly the Gogo bundles all seem to like to import packages that they provide - is this likely to be a problem, or is the fact that they're from the same bundle get recognised in Felix?
>> Best, Dan.
>> 
>> On 25 Nov 2013, at 22:01, Karl Pauls wrote:
>> 
>>> Are you sure you don't have any cyclic dependencies between your bundles
>>> anymore (maybe you could send me your manifests too)?
>>> 
>>> regards,
>>> 
>>> Karl
>>> 
>>> 
>>> On Mon, Nov 25, 2013 at 10:10 PM, Daniel McGreal <da...@redbite.com> wrote:
>>> 
>>>> Hi Felix people,
>>>> 
>>>> I've come back to revisit this issue, where threads are deadblocking
>>>> during classloading when running under JamVM and Felix 4.3.1. Having now
>>>> completed other deployments, it's now critical for me to get this working
>>>> ASAP, therefore any help you can offer is tremendously gratefully received.
>>>> 
>>>> I've removed a cyclic dependency, as Karl suggested, which I believe
>>>> reduced the occurrence rate. I've also started to test without
>>>> Retrotranslation, which hasn't had an effect.
>>>> 
>>>> A common element to the deadlock is SCR:
>>>> 
>>>> "FelixStartLevel" (daemon) 0x1aaf9a0 priority: 5 tid: 0x7f69ce3a8700 id: 6
>>>> state: BLOCKED (5)
>>>> at java.lang.ClassLoader.findLoadedClass(ClassLoader.java:580)
>>>> at
>>>> org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1948)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:293)
>>>> at
>>>> org.apache.felix.framework.BundleWiringImpl.getClassByDelegation(BundleWiringImpl.java:1374)
>>>> at
>>>> org.apache.felix.framework.BundleWiringImpl.searchImports(BundleWiringImpl.java:1553)
>>>> at
>>>> org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1484)
>>>> at
>>>> org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:75)
>>>> at
>>>> org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1955)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:293)
>>>> at java.lang.VMClass.getDeclaredMethods(Native method)
>>>> at java.lang.Class.getDeclaredMethods(Class.java:536)
>>>> at java.lang.Class.getDeclaredMethod(Class.java:506)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BaseMethod.getMethod(BaseMethod.java:328)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BindMethod.getMethod(BindMethod.java:38)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BindMethod.getServiceReferenceMethod(BindMethod.java:350)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BindMethod.doFindMethod(BindMethod.java:99)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BaseMethod.findMethod(BaseMethod.java:185)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BaseMethod.access$400(BaseMethod.java:37)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BaseMethod$NotResolved.resolve(BaseMethod.java:555)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BaseMethod$NotResolved.methodExists(BaseMethod.java:578)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BaseMethod.methodExists(BaseMethod.java:510)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BindMethod.methodExists(BindMethod.java:38)
>>>> at
>>>> org.apache.felix.scr.impl.helper.BindMethod.getServiceObject(BindMethod.java:560)
>>>> at
>>>> org.apache.felix.scr.impl.manager.DependencyManager.prebind(DependencyManager.java:1091)
>>>> at
>>>> org.apache.felix.scr.impl.manager.AbstractComponentManager.collectDependencies(AbstractComponentManager.java:768)
>>>> at
>>>> org.apache.felix.scr.impl.manager.AbstractComponentManager$Unsatisfied.activate(AbstractComponentManager.java:1492)
>>>> at
>>>> org.apache.felix.scr.impl.manager.AbstractComponentManager.activateInternal(AbstractComponentManager.java:550)
>>>> at
>>>> org.apache.felix.scr.impl.manager.AbstractComponentManager.enable(AbstractComponentManager.java:261)
>>>> at
>>>> org.apache.felix.scr.impl.config.ImmediateComponentHolder.enableComponents(ImmediateComponentHolder.java:328)
>>>> at
>>>> org.apache.felix.scr.impl.BundleComponentActivator.initialize(BundleComponentActivator.java:158)
>>>> at
>>>> org.apache.felix.scr.impl.BundleComponentActivator.<init>(BundleComponentActivator.java:113)
>>>> at org.apache.felix.scr.impl.Activator.loadComponents(Activator.java:261)
>>>> at
>>>> org.apache.felix.scr.impl.Activator.loadAllComponents(Activator.java:199)
>>>> at org.apache.felix.scr.impl.Activator.start(Activator.java:108)
>>>> at
>>>> org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:645)
>>>> at org.apache.felix.framework.Felix.activateBundle(Felix.java:2146)
>>>> at org.apache.felix.framework.Felix.startBundle(Felix.java:2064)
>>>> at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1291)
>>>> at
>>>> org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)
>>>> at java.lang.Thread.run(Thread.java:745)
>>>> 
>>>> After taking this trace, I changed SCR to start before the bundles it
>>>> should load, the deadlock still happens with a similar stack but now from
>>>> org.apache.felix.scr.impl.Activator.bundleChanged. In fact, having tried to
>>>> avoid the issue by using start levels and loading bundles in a natural
>>>> order (previously everything was getting started via auto-deploy), the
>>>> deadlock occurs every time (whereas it was down to more like 1 in 20 with
>>>> auto-deploy).
>>>> 
>>>> A couple of examples of the other blocked threads are attached.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> In every case, FelixStartLevel blocks at ClassLoader.java:580 (via SCR
>>>> and BundleWiringImpl.java:1948 loadClass) and BundleWiringImpl.java:2038
>>>> findClass (though that line seems inconspicuous - if (!hooks.isEmpty()),
>>>> I don't readily see how a block could happen there - hooks is a
>>>> TreeMap?). I added some debug output to BundleWiringImpl.java findClass and
>>>> loadClass methods (no debugger with JamVM):
>>>> 
>>>> 
>>>>      @Override
>>>>      protected Class loadClass(final String name, final booleanresolve)
>>>>          throws ClassNotFoundException
>>>>      {
>>>>      System.out.println(getBundle().getSymbolicName() +" loadClass "+
>>>> name +", "+ resolve);
>>>> 
>>>> ------
>>>> 
>>>> 
>>>>      @Override
>>>>      protected Class findClass(final String name) throwsClassNotFoundException
>>>>      {
>>>>      System.out.println(getBundle().getSymbolicName() +" findClass "+
>>>> name);
>>>> 
>>>> 
>>>> 
>>>> The output I get, leads me to believe that the deadlock is not particular
>>>> to a bundle or class. For example (the last findClass and loadClass
>>>> highlighted):
>>>> 
>>>> ...
>>>> org.apache.felix.gogo.command *findClass* org.osgi.service.log.LogEntry
>>>> obs.logging.amqp loadClass
>>>> com.redbite.rededge.obs.interfaces.deployment.DeviceRegistry, false
>>>> obs.interfaces.deployment loadClass
>>>> com.redbite.rededge.obs.interfaces.deployment.DeviceRegistry, false
>>>> obs.logging.amqp loadClass com.rabbitmq.client.Channel, false
>>>> com.redbite.osgi.com.rabbitmq.amqp-client *loadClass*com.rabbitmq.client.Channel, false
>>>> 
>>>> 
>>>> Do you think I am interpreting this correctly? Now though, as well as
>>>> blocking on [what used to be] line 2038 (if (!hooks.isEmpty())), it
>>>> actually now  blocks on the line I've added, System.out.println(getBundle().getSymbolicName()
>>>> +" findClass "+ name);
>>>> 
>>>> Any ideas on what I can do to pursue this further, gratefully received.
>>>> Best, Dan.
>>>> 
>>>> On 12 Nov 2013, at 11:18, Karl Pauls wrote:
>>>> 
>>>> On the retrotranslator bundles, they are made available by specifying a
>>>> 
>>>> dynamic-import which is added to the manifest during retrotranslation (so
>>>> 
>>>> that our code/libraries can run on).
>>>> 
>>>> 
>>>> 
>>>> I would try to change that and either have that bundles each contain their
>>>> own copy or to not have them mention it at all in the manifest and just put
>>>> them on the class path of the framework and boot delegate.
>>>> 
>>>> 
>>>> Unfortunately, running unretrotranslated does not help.
>>>> 
>>>> 
>>>> 
>>>> Hm, in that case something else must be going on - however, in your stack
>>>> trace it clearly all happens inside the retrotranslate part so you might
>>>> have to give me a stack trace of the unretrotranslated one as well.
>>>> 
>>>> regards,
>>>> 
>>>> Karl
>>>> 
>>>> p.s.: feel free to also share your manifests if that is possible
>>>> 
>>>> 
>>>> Thanks a lot for your help,
>>>> 
>>>> Dan.
>>>> 
>>>> 
>>>> On 12 Nov 2013, at 09:51, Karl Pauls wrote:
>>>> 
>>>> 
>>>> Well, my guess would be that you are running into some form of a circular
>>>> 
>>>> class load issue. The point is that the class loader in java is
>>>> 
>>>> synchronized which my lead to deadlocks if you have a cycle in your class
>>>> 
>>>> loaders. This is a long standing issue with frameworks like OSGi which
>>>> 
>>>> allow for cycles in the dependencies.
>>>> 
>>>> 
>>>> I don't know enough about your app to say for sure nor to help you with
>>>> 
>>>> this but look at the dependencies of your bundles and see whether they
>>>> 
>>>> create a cycle (i.e., Bundle A depends via some import on Bundle B which
>>>> 
>>>> depends via some import on Bundle C which depends via some import on
>>>> 
>>>> Bundle
>>>> 
>>>> A or something similar). In your case, it might be even more complicated
>>>> 
>>>> as
>>>> 
>>>> the retrotranslator seems to do some class loading magic by itself which
>>>> 
>>>> might be responsible for the issue - how do you make the
>>>> 
>>>> net.sf.retrotranslator.*
>>>> 
>>>> packages available to the bundles?
>>>> 
>>>> 
>>>> regards,
>>>> 
>>>> 
>>>> Karl
>>>> 
>>>> 
>>>> 
>>>> On Fri, Nov 8, 2013 at 10:44 PM, Daniel McGreal <da...@redbite.com>
>>>> 
>>>> wrote:
>>>> 
>>>> 
>>>> Hello Felix developer community,
>>>> 
>>>> 
>>>> I am having an interesting (and critical) problem running my application
>>>> 
>>>> with Felix in an embedded environment where an alternate JVM, JamVM is
>>>> 
>>>> necessary.
>>>> 
>>>> 
>>>> The deadlock is non-deterministic with the deadlocked stack traces
>>>> 
>>>> blocking at odd places (though they might not seem odd to people used to
>>>> 
>>>> messing with class loading - but a block on new XXX() to me is quite the
>>>> 
>>>> novelty). An example of the relevant parts of the thread stack is in
>>>> 
>>>> FelixJamVMDeadlockStackTrace.txt, attached. It shows a lock between:
>>>> 
>>>> 
>>>> 
>>>> -
>>>> 
>>>> - java.lang.ClassLoader.findLoadedClass(ClassLoader.java:580) looks
>>>> 
>>>> like:
>>>> 
>>>> 
>>>> protected final synchronized Class<?> findLoadedClass(String name)
>>>> 
>>>> {
>>>> 
>>>>   checkInitialized();
>>>> 
>>>>   return VMClassLoader.findLoadedClass(this, name);
>>>> 
>>>> }
>>>> 
>>>> 
>>>> where line 580 is checkInitialized() which throws an exception if
>>>> 
>>>> isInitialized is false.
>>>> 
>>>> 
>>>> -
>>>> 
>>>> 
>>>> net.sf.retrotranslator.runtime.asm.ClassReader.readConst(ClassReader.java:1591)
>>>> 
>>>> looks like:
>>>> 
>>>> 
>>>>    public Object readConst(final int item, final char[] buf) {
>>>> 
>>>>       int index = items[item];
>>>> 
>>>>       switch (b[index - 1]) {
>>>> 
>>>>            case ClassWriter.INT:
>>>> 
>>>>               return new Integer(readInt(index));
>>>> 
>>>>           case ClassWriter.FLOAT:
>>>> 
>>>>                return new
>>>> 
>>>> Float(Float.intBitsToFloat(readInt(index)));
>>>> 
>>>>            case ClassWriter.LONG:
>>>> 
>>>>               return new Long(readLong(index));
>>>> 
>>>>           case ClassWriter.DOUBLE:
>>>> 
>>>>                return
>>>> 
>>>> newDouble(Double.longBitsToDouble(readLong(index)));
>>>> 
>>>>            case ClassWriter.CLASS:
>>>> 
>>>>               String s = readUTF8(index, buf);
>>>> 
>>>>               return Type.getType(s.charAt(0) == '[' ? s : "L" + s +
>>>> 
>>>> ";");
>>>> 
>>>>           // case ClassWriter.STR:
>>>> 
>>>>           default:
>>>> 
>>>>                return readUTF8(index, buf);
>>>> 
>>>>       }
>>>> 
>>>>    }
>>>> 
>>>> where line 1591 is return new Integer(readInt(index));
>>>> 
>>>> 
>>>> -
>>>> 
>>>> -
>>>> 
>>>> -
>>>> 
>>>> 
>>>> org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.findClass(BundleWiringImpl.java:2038)
>>>> 
>>>> is a big method, where line 2038 looks inconspicuously like:
>>>> 
>>>> 
>>>> if (!hooks.isEmpty())
>>>> 
>>>> 
>>>> 
>>>> Gogo and Retrotranslator are red herrings, without them in the
>>>> 
>>>> application
>>>> 
>>>> the deadlock just happens somewhere else.
>>>> 
>>>> 
>>>> I don't even vaguely understand how, but neither Eclipse nor
>>>> 
>>>> Knopflerfish
>>>> 
>>>> suffer from this problem. I would, however, much rather use Felix.
>>>> 
>>>> 
>>>> Please let me know if there's any further information I can provide.
>>>> 
>>>> 
>>>> Best, Dan.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> 
>>>> Karl Pauls
>>>> 
>>>> karlpauls@gmail.com
>>>> 
>>>> http://twitter.com/karlpauls
>>>> 
>>>> http://www.linkedin.com/in/karlpauls
>>>> 
>>>> https://profiles.google.com/karlpauls
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Karl Pauls
>>>> karlpauls@gmail.com
>>>> http://twitter.com/karlpauls
>>>> http://www.linkedin.com/in/karlpauls
>>>> https://profiles.google.com/karlpauls
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> -- 
>>> Karl Pauls
>>> karlpauls@gmail.com
>>> http://twitter.com/karlpauls
>>> http://www.linkedin.com/in/karlpauls
>>> https://profiles.google.com/karlpauls
>> 
>