You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@camel.apache.org by "Claus Ibsen (Jira)" <ji...@apache.org> on 2022/01/23 08:46:00 UTC

[jira] [Resolved] (CAMEL-17536) ServicePool.doStop hangs during shutdown

     [ https://issues.apache.org/jira/browse/CAMEL-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Claus Ibsen resolved CAMEL-17536.
---------------------------------
      Assignee: Claus Ibsen
    Resolution: Fixed

> ServicePool.doStop hangs during shutdown
> ----------------------------------------
>
>                 Key: CAMEL-17536
>                 URL: https://issues.apache.org/jira/browse/CAMEL-17536
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 3.14.0
>            Reporter: Krzysztof Jamróz
>            Assignee: Claus Ibsen
>            Priority: Major
>             Fix For: 3.14.2, 3.15.0
>
>         Attachments: linkedhashmap corrupted.png
>
>
> Camel sometimes hangs (100% usage of 1 CPU) during shutdown. This occurs intermittently during integration testing. We use {{recipientList}} to perform some dynamic routing and many messages can be processed in parallel threads by {{recipientList}} and there are a few of them in different routes (this will be important in a moment).
> Stack trace of stuck thread:
> {noformat}
> "SpringApplicationShutdownHook" #18 prio=5 os_prio=0 cpu=1784989.37ms elapsed=1790.21s tid=0x00007f7a13520000 nid=0xa80 runnable  [0x00007f79a3af5000]
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.camel.support.cache.ServicePool.stop(ServicePool.java:193)
>     at org.apache.camel.support.cache.ServicePool$$Lambda$2274/0x000000084108fc40.accept(Unknown Source)
>     at java.util.LinkedHashMap$LinkedValues.forEach(java.base@11.0.13/LinkedHashMap.java:608)
>     at org.apache.camel.support.cache.ServicePool.doStop(ServicePool.java:181)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     - locked <0x00000000e3ece780> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:162)
>     at org.apache.camel.support.cache.DefaultProducerCache.doStop(DefaultProducerCache.java:399)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     - locked <0x00000000e3ece740> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:162)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:147)
>     at org.apache.camel.processor.RecipientList.doStop(RecipientList.java:222)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     - locked <0x00000000e3ecdb00> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:162)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:165)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:147)
>     at org.apache.camel.processor.Pipeline.doStop(Pipeline.java:226)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     - locked <0x00000000e3ecda48> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:162)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:147)
>     at org.apache.camel.impl.engine.DefaultChannel.doStop(DefaultChannel.java:134)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     - locked <0x00000000e3eccde0> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:162)
>     at org.apache.camel.support.service.ServiceHelper.stopAndShutdownServices(ServiceHelper.java:257)
>     at org.apache.camel.support.service.ServiceHelper.stopAndShutdownServices(ServiceHelper.java:215)
>     at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler.doShutdown(RedeliveryErrorHandler.java:1665)
>     at org.apache.camel.support.ChildServiceSupport.shutdown(ChildServiceSupport.java:113)
>     - locked <0x00000000e3ecc1e8> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopAndShutdownService(ServiceHelper.java:233)
>     at org.apache.camel.impl.engine.RouteService.stopChildServices(RouteService.java:409)
>     at org.apache.camel.impl.engine.RouteService.doStop(RouteService.java:260)
>     at org.apache.camel.support.ChildServiceSupport.stop(ChildServiceSupport.java:86)
>     - locked <0x00000000e40873a0> (a java.lang.Object)
>     at org.apache.camel.support.service.ServiceHelper.stopService(ServiceHelper.java:162)
>     at org.apache.camel.support.service.ServiceHelper.stopAndShutdownService(ServiceHelper.java:227)
>     at org.apache.camel.impl.engine.AbstractCamelContext.shutdownServices(AbstractCamelContext.java:3559)
>     at org.apache.camel.impl.engine.AbstractCamelContext.shutdownServices(AbstractCamelContext.java:3584)
>     at org.apache.camel.impl.engine.AbstractCamelContext.doStop(AbstractCamelContext.java:3366)
>     at org.apache.camel.spring.boot.SpringBootCamelContext.doStop(SpringBootCamelContext.java:61)
>     - locked <0x00000000e221e658> (a org.apache.camel.spring.boot.SpringBootCamelContext)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     - locked <0x00000000e221e8a8> (a java.lang.Object)
>     at org.apache.camel.impl.engine.AbstractCamelContext.stop(AbstractCamelContext.java:2640)
>     at org.apache.camel.spring.SpringCamelContext.stop(SpringCamelContext.java:128)
>     at org.springframework.context.support.DefaultLifecycleProcessor.doStop(DefaultLifecycleProcessor.java:247)
>     at org.springframework.context.support.DefaultLifecycleProcessor.access$300(DefaultLifecycleProcessor.java:54)
>     at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.stop(DefaultLifecycleProcessor.java:373)
>     at org.springframework.context.support.DefaultLifecycleProcessor.stopBeans(DefaultLifecycleProcessor.java:206)
>     at org.springframework.context.support.DefaultLifecycleProcessor.onClose(DefaultLifecycleProcessor.java:129)
>     at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1067)
>     at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:1021)
>     - locked <0x00000000e11e3868> (a java.lang.Object)
>     at org.springframework.boot.SpringApplicationShutdownHook.closeAndWait(SpringApplicationShutdownHook.java:137)
>     at org.springframework.boot.SpringApplicationShutdownHook$$Lambda$2265/0x0000000841088440.accept(Unknown Source)
>     at java.lang.Iterable.forEach(java.base@11.0.13/Iterable.java:75)
>     at org.springframework.boot.SpringApplicationShutdownHook.run(SpringApplicationShutdownHook.java:106)
>     at java.lang.Thread.run(java.base@11.0.13/Thread.java:829){noformat}
> Deeper inspection of heap dump reveals that linked list in {{ServicePool.cache}} is corrupted and has a loop, which causes {{foreach}} to loop infinitely (see attached screenshot from heapdump analysis).
> Access to {{ServicePool.cache}} is not synchronized even though {{LinkedHashMap}} is not thread-safe. There is at least one code path that can add to the cache it concurrently (and indeed in this seems to happen in our tests):
>  # org.apache.camel.processor.MulticastProcessor.process(Exchange, AsyncCallback)
>  # org.apache.camel.processor.RecipientListProcessor.createProcessorExchangePairs(Exchange)
>  # org.apache.camel.processor.RecipientListProcessor.doCreateProcessorExchangePairs(Exchange, Object, List<ProcessorExchangePair>, int)
>  # org.apache.camel.support.cache.DefaultProducerCache.acquireProducer(Endpoint)
>  # org.apache.camel.support.cache.ServicePool.acquire(Endpoint)
>  # org.apache.camel.support.cache.ServicePool.cache
> There are also a few unsynchronized invocations of {{DefaultProducerCache.acquireProducer}} from other processors.
> I am not sure if other usages of {{cache}} need to be synchronized, maybe Camel lifecycle provides enough serialization.
>  
> The simpliest solution is to wrap {{cache}} in {{Collections.synchronizedMap}} but this might have negative impact on performance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)