You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@karaf.apache.org by "Jean-Baptiste Onofré (Jira)" <ji...@apache.org> on 2020/01/13 17:39:00 UTC

[jira] [Resolved] (KARAF-2256) Deadlock when refreshing bundles

     [ https://issues.apache.org/jira/browse/KARAF-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Baptiste Onofré resolved KARAF-2256.
-----------------------------------------
    Resolution: Cannot Reproduce

It should not be an issue in 4.2.x anymore.

> Deadlock when refreshing bundles
> --------------------------------
>
>                 Key: KARAF-2256
>                 URL: https://issues.apache.org/jira/browse/KARAF-2256
>             Project: Karaf
>          Issue Type: Bug
>    Affects Versions: 2.3.1
>         Environment: 64-bit Linux Oracle JDK 1.7.0_17
>            Reporter: Amichai Rothman
>            Assignee: Achim Nierbeck
>            Priority: Major
>
> When attempting to install the DOSGi feature (by running "features:chooseurl cxf-dosgi 1.4.0" and "features:install cxf-dosgi-discovery-distributed"), the installation hangs along with some of the bundles which can no longer be started, stopped, checked for imports, etc. - the Karaf server must be killed and restarted to resume. This is likely not related to this specific feature, and can happen with other refreshed bundles and installed features as well.
> At a glance it seems like this is caused by the "OPS4J Pax Web - Runtime (1.1.12)" bundle being stuck in the stopping state due to a deadlock caused by its Activator:
> It receives a removedService notification from a service tracker, which is handled in a separate thread using a custom executor and eventually tries to resolve some bundle and ends up waiting for acquireGlobalLock indefinitely.
> This is because at the same time, Felix calls refreshPackages which attempts to stop the bundle (while holding the lock), whose activator puts a cleanup task in its custom executor and then attempts to shut down the executor. This never happens, because the previous executor task initiated from removeService is waiting for the lock, hence the deadlock.
> I'm not entirely sure which of the projects has the underlying bug in it - probably pax web, possibly Felix if the OSGi specs allow for the behavior that hangs it, but in any case Karaf is using these versions and exhibiting the deadlock, so at the very least should upgrade to fixed versions of these libraries, or patch them.
> If anyone who knows these systems better thinks it should be reported in one of the upstream projects, point me in the right direction and I'll be happy to do it.
> Here is the thread dump, the top two threads show the deadlock, and the other two are bundles which are stuck as well due to waiting for the same lock (I think).
> "FelixFrameworkWiring" daemon prio=10 tid=0x00007f390002e000 nid=0x35d1 in Object.wait() [0x00007f3948dd3000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000f1b8ea50> (a java.lang.Object)
>         at java.lang.Object.wait(Object.java:503)
>         at org.ops4j.pax.web.service.internal.Executor.shutdown(Executor.java:91)
>         - locked <0x00000000f1b8ea50> (a java.lang.Object)
>         at org.ops4j.pax.web.service.internal.Activator.stop(Activator.java:140)
>         at org.apache.felix.framework.util.SecureAction.stopActivator(SecureAction.java:667)
>         at org.apache.felix.framework.Felix.stopBundle(Felix.java:2361)
>         at org.apache.felix.framework.Felix$RefreshHelper.stop(Felix.java:4629)
>         at org.apache.felix.framework.Felix.refreshPackages(Felix.java:3951)
>         at org.apache.felix.framework.FrameworkWiringImpl.run(FrameworkWiringImpl.java:172)
>         at java.lang.Thread.run(Thread.java:722)
> "Pax Web Runtime worker" daemon prio=10 tid=0x00007f3904263000 nid=0x370a in Object.wait() [0x00007f390dfa8000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.felix.framework.Felix.acquireGlobalLock(Felix.java:4944)
>         - locked <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.StatefulResolver.resolve(StatefulResolver.java:219)
>         at org.apache.felix.framework.BundleWiringImpl.searchDynamicImports(BundleWiringImpl.java:1539)
>         at org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1439)
>         at org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:72)
>         at org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:1843)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>         at org.apache.felix.framework.BundleWiringImpl.getClassByDelegation(BundleWiringImpl.java:1317)
>         at org.apache.felix.framework.ServiceRegistrationImpl$ServiceReferenceImpl.isAssignableTo(ServiceRegistrationImpl.java:521)
>         at org.apache.felix.framework.util.Util.isServiceAssignable(Util.java:280)
>         at org.apache.felix.framework.util.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:916)
>         at org.apache.felix.framework.util.EventDispatcher.fireEventImmediately(EventDispatcher.java:793)
>         at org.apache.felix.framework.util.EventDispatcher.fireServiceEvent(EventDispatcher.java:543)
>         at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4260)
>         at org.apache.felix.framework.Felix.access$000(Felix.java:74)
>         at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:390)
>         at org.apache.felix.framework.ServiceRegistry.unregisterService(ServiceRegistry.java:148)
>         at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:127)
>         at org.ops4j.pax.web.service.internal.Activator.updateController(Activator.java:231)
>         at org.ops4j.pax.web.service.internal.Activator$DynamicsServiceTrackerCustomizer$2.run(Activator.java:387)
>         at org.ops4j.pax.web.service.internal.Executor$Future.run(Executor.java:45)
>         at org.ops4j.pax.web.service.internal.Executor$Worker.run(Executor.java:122)
> "fileinstall-/opt/apache-karaf-2.3.1/deploy" daemon prio=10 tid=0x00007f3904018800 nid=0x35a8 in Object.wait() [0x00007f394aba8000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4871)
>         - locked <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.Felix.startBundle(Felix.java:1744)
>         at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:944)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundle(DirectoryWatcher.java:1247)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.startBundles(DirectoryWatcher.java:1219)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.startAllBundles(DirectoryWatcher.java:1208)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.process(DirectoryWatcher.java:503)
>         at org.apache.felix.fileinstall.internal.DirectoryWatcher.run(DirectoryWatcher.java:291)
> "NioProcessor-2" prio=10 tid=0x00007f3914014000 nid=0x35fd in Object.wait() [0x00007f394a064000]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at java.lang.Object.wait(Object.java:503)
>         at org.apache.felix.framework.Felix.acquireBundleLock(Felix.java:4871)
>         - locked <0x00000000e990e018> (a [Ljava.lang.Object;)
>         at org.apache.felix.framework.Felix.startBundle(Felix.java:1744)
>         at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:944)
>         at org.apache.felix.framework.BundleImpl.start(BundleImpl.java:931)
>         at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeatures(FeaturesServiceImpl.java:479)
>         at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeature(FeaturesServiceImpl.java:396)
>         at org.apache.karaf.features.internal.FeaturesServiceImpl.installFeature(FeaturesServiceImpl.java:392)
>         at org.apache.karaf.features.command.InstallFeatureCommand.doExecute(InstallFeatureCommand.java:62)
>         at org.apache.karaf.features.command.FeaturesCommandSupport.doExecute(FeaturesCommandSupport.java:41)
>         at org.apache.karaf.shell.console.OsgiCommandSupport.execute(OsgiCommandSupport.java:38)
>         at org.apache.felix.gogo.commands.basic.AbstractCommand.execute(AbstractCommand.java:35)
>         at org.apache.felix.gogo.runtime.CommandProxy.execute(CommandProxy.java:78)
>         at org.apache.felix.gogo.runtime.Closure.executeCmd(Closure.java:474)
>         at org.apache.felix.gogo.runtime.Closure.executeStatement(Closure.java:400)
>         at org.apache.felix.gogo.runtime.Pipe.run(Pipe.java:108)
>         at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:183)
>         at org.apache.felix.gogo.runtime.Closure.execute(Closure.java:120)
>         at org.apache.felix.gogo.runtime.CommandSessionImpl.execute(CommandSessionImpl.java:89)
>         at org.apache.karaf.shell.ssh.ShellCommandFactory$ShellCommand$1.run(ShellCommandFactory.java:109)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.karaf.shell.ssh.ShellCommandFactory$ShellCommand.start(ShellCommandFactory.java:107)
>         at org.apache.sshd.server.channel.ChannelSession.handleExec(ChannelSession.java:388)
>         at org.apache.sshd.server.channel.ChannelSession.handleRequest(ChannelSession.java:235)
>         at org.apache.sshd.server.channel.ChannelSession.handleRequest(ChannelSession.java:195)
>         at org.apache.sshd.common.session.AbstractSession.channelRequest(AbstractSession.java:1057)
>         at org.apache.sshd.server.session.ServerSession.running(ServerSession.java:229)
>         at org.apache.sshd.server.session.ServerSession.handleMessage(ServerSession.java:205)
>         at org.apache.sshd.common.session.AbstractSession.decode(AbstractSession.java:566)
>         at org.apache.sshd.common.session.AbstractSession.messageReceived(AbstractSession.java:236)
>         - locked <0x00000000efd56b00> (a java.lang.Object)
>         at org.apache.sshd.common.AbstractSessionIoHandler.messageReceived(AbstractSessionIoHandler.java:58)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:690)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
>         at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:109)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
>         at org.apache.mina.core.filterchain.DefaultIoFilterChain.fireMessageReceived(DefaultIoFilterChain.java:410)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.read(AbstractPollingIoProcessor.java:710)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:664)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.process(AbstractPollingIoProcessor.java:653)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.access$600(AbstractPollingIoProcessor.java:67)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1124)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)