Posted to issues@camel.apache.org by "Krzysztof Jamróz (Jira)" <ji...@apache.org> on 2022/07/11 11:33:00 UTC

[jira] [Updated] (CAMEL-17544) ServicePool.doStop still hangs during shutdown

     [ https://issues.apache.org/jira/browse/CAMEL-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krzysztof Jamróz updated CAMEL-17544:
-------------------------------------
          Component/s: came-core
    Affects Version/s: 3.18.0

I just tested, and the same problem still occurs in 3.18.0. As I wrote before, {{SimpleLRUCache}} is not thread-safe but is sometimes used from multiple threads. This can be a source of hard-to-reproduce errors.

I think this should either be fixed (by using a thread-safe LRU cache) or documented as an (IMO risky) optimization. In the latter case, the documentation should state what to avoid in the default configuration, e.g. that you should not use a dynamic recipient list from multiple threads unless {{caffeine-lrucache}} is enabled (and some other cases).
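
To illustrate the "thread-safe LRU cache" option, here is a minimal sketch of a bounded LRU map in which every single operation is guarded by one lock. It is not Camel's {{SimpleLRUCache}} and not a proposed patch; the class name {{SynchronizedLruCache}} and the {{create}} method are hypothetical, and the sketch only assumes that the cache is built on an access-ordered {{LinkedHashMap}}.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Camel.
final class SynchronizedLruCache {

    private SynchronizedLruCache() {
    }

    /** Creates a bounded LRU map whose individual operations share one lock. */
    static <K, V> Map<K, V> create(final int maxEntries) {
        // accessOrder = true: get() moves the entry to the tail, so reads mutate the map.
        LinkedHashMap<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                return size() > maxEntries;
            }
        };
        // The wrapper serializes get/put/remove on a single mutex.
        // Iterating over the map still requires synchronizing on the returned map.
        return Collections.synchronizedMap(lru);
    }
}
```

A single lock trades throughput for safety, which is presumably why an unsynchronized variant was attractive as an optimization. Callers that iterate (for example to copy the values during shutdown) would still have to hold the lock on the returned map while doing so.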

> ServicePool.doStop still hangs during shutdown
> ----------------------------------------------
>
>                 Key: CAMEL-17544
>                 URL: https://issues.apache.org/jira/browse/CAMEL-17544
>             Project: Camel
>          Issue Type: Bug
>          Components: came-core
>    Affects Versions: 3.14.0, 3.18.0
>            Reporter: Krzysztof Jamróz
>            Priority: Major
>         Attachments: ServicePoolShutdownTest.java
>
>
> {{ServicePool.doStop}} still hangs during shutdown with the optimized fix of CAMEL-17536. The {{LinkedHashMap}} in the cache is corrupted not only because of a race condition between {{acquire}} and {{doStop}} but also between concurrent invocations of {{acquire}} during parallel routing of exchanges (see {{repeatTestRecipientList}} in the attachment).
>  
> The root cause is that the {{LinkedHashMap}} used by {{SimpleLRUCache}} is *not* thread-safe, yet access to it is not synchronized. Even reads may modify it because it has an LRU policy. And access to {{LinkedHashMap}}/{{SimpleLRUCache}} is possible during routing (so concurrently), not only during start/stop.
>  
> Uses of {{SimpleLRUCache}} in other places in Camel may also exhibit race conditions. It is hard to demonstrate them reliably, but I have another example ({{repeatEndpointsTest}}). This one usually crashes with an {{OutOfMemoryError}} when converting the corrupted (looping) {{SimpleLRUCache}} in {{EndpointRegistry}} to an array, but I also got other exceptions.
>  
> {noformat}
> java.lang.OutOfMemoryError: Java heap space
>     at java.base/java.util.Arrays.copyOf(Arrays.java:3480)
>     at java.base/java.util.AbstractCollection.finishToArray(AbstractCollection.java:227)
>     at java.base/java.util.AbstractCollection.toArray(AbstractCollection.java:148)
>     at java.base/java.util.ArrayList.<init>(ArrayList.java:181)
>     at org.apache.camel.impl.engine.AbstractCamelContext.shutdownServices(AbstractCamelContext.java:3582)
>     at org.apache.camel.impl.engine.AbstractCamelContext.shutdownServices(AbstractCamelContext.java:3576)
>     at org.apache.camel.impl.engine.AbstractCamelContext.doStop(AbstractCamelContext.java:3414)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     at org.apache.camel.impl.engine.AbstractCamelContext.stop(AbstractCamelContext.java:2658)
>     at org.apache.camel.component.http.ServicePoolShutdownTest.testEndpoints(ServicePoolShutdownTest.java:129)
>     at org.apache.camel.component.http.ServicePoolShutdownTest.repeatEndpointsTest(ServicePoolShutdownTest.java:135){noformat}
>  
> Caffeine, which was the default cache implementation until CAMEL-16093, is thread-safe. Enabling caffeine-lrucache might be a workaround.
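
The claim in the quoted description that even reads modify the map can be checked with plain JDK classes. The sketch below is single-threaded and deterministic: it only shows that {{get}} on an access-ordered {{LinkedHashMap}} relinks entries, which the {{LinkedHashMap}} Javadoc describes as a structural modification. It does not reproduce the race itself. The class name {{AccessOrderDemo}} is hypothetical, and the sketch assumes the cache uses the access-order constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; uses only java.util, no Camel types.
final class AccessOrderDemo {

    private AccessOrderDemo() {
    }

    /** Returns the key order before and after a single get() call. */
    static List<List<String>> keyOrderAroundRead() {
        // Third constructor argument true = access order (LRU-style ordering).
        Map<String, String> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");

        List<String> before = new ArrayList<>(map.keySet());
        map.get("a"); // a pure read, yet it moves "a" to the end of the internal linked list
        List<String> after = new ArrayList<>(map.keySet());

        return Arrays.asList(before, after);
    }

    public static void main(String[] args) {
        System.out.println(keyOrderAroundRead()); // prints [[a, b, c], [b, c, a]]
    }
}
```

Because every {{get}} rewrites the links of the internal doubly linked list, two unsynchronized readers can interleave those pointer updates. That is consistent with the looping list and the resulting {{OutOfMemoryError}} in the stack trace above.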



--
This message was sent by Atlassian Jira
(v8.20.10#820010)