Posted to users@camel.apache.org by Gerda Ilger <ge...@ecosio.com> on 2022/09/01 16:21:13 UTC

Deadlock when concurrently calling addRoutes/removeRoute

Dear Camel Community,

We ran into a deadlock when concurrently adding and removing routes.

We're dynamically adding and removing routes whose (ftp) consumers
sometimes take a long time to start up, and observed the following:

* (Default/Abstract)CamelContext.addRoutes ->
DefaultCamelContext.addRouteDefinitions locks DefaultModel first, and
is then blocked trying to lock DefaultCamelContext
* (Default/Abstract)CamelContext.removeRoute locks DefaultCamelContext
first, and is then blocked trying to lock DefaultModel

---

Found one Java-level deadlock:
=============================

"pool-2-thread-3":
waiting to lock monitor 0x00007fd21404a130 (object 0x000000068c002a20,
a org.apache.camel.impl.DefaultCamelContext),
which is held by "pool-2-thread-2"

"pool-2-thread-2":
waiting to lock monitor 0x00007fd22000b0c0 (object 0x000000068c800170,
a org.apache.camel.impl.DefaultModel),
which is held by "pool-2-thread-3"

---
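
The lock-order inversion described above can be reproduced outside Camel.
This is only a minimal sketch: the two plain objects are hypothetical
stand-ins for the DefaultModel and DefaultCamelContext monitors, and the
latch just forces the unlucky interleaving. It uses the JDK's
ThreadMXBean.findMonitorDeadlockedThreads() to confirm the cycle, which is
the same kind of information the thread dump reports.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderInversionDemo {

    /** Forces two threads to lock in opposite orders; returns the number of deadlocked threads. */
    static int provokeDeadlock() throws InterruptedException {
        // Hypothetical stand-ins for the two monitors named in the thread dump.
        final Object model = new Object();
        final Object context = new Object();
        final CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // "addRoutes"-like path: model first, then context.
        Thread adder = new Thread(() -> lockInOrder(model, context, bothHoldFirst), "adder");
        // "removeRoute"-like path: context first, then model.
        Thread remover = new Thread(() -> lockInOrder(context, model, bothHoldFirst), "remover");

        // Daemon threads so the JVM can still exit although they never finish.
        adder.setDaemon(true);
        remover.setDaemon(true);
        adder.start();
        remover.start();

        // Poll for up to ~5 seconds until the JVM reports the monitor cycle.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                // Wait until the other thread holds its first lock as well.
                bothHoldFirst.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Not reached: the other thread already holds this monitor.
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + provokeDeadlock());
    }
}
```

If both threads always took the two monitors in the same order, the cycle
could not form; that is the idea behind the fix discussed in this thread.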


We are on Camel 3.7, but I was able to reproduce it with
3.19.0-SNAPSHOT in a JUnit test.

I haven't found any tickets or mailing list discussions about this behaviour.
Is this a known issue?


Thanks,
Gerda

Re: Deadlock when concurrently calling addRoutes/removeRoute

Posted by Claus Ibsen <cl...@gmail.com>.
Hi Gerda

Thanks for sharing your solution.

I guess users like you, who dynamically add and remove routes more
frequently, could be affected by this deadlock when adding and removing
routes concurrently.

Off the top of my head, one thing that could lead to a problem is if your
routes use the same route ids, so that you are removing "foo" and adding
"foo" at the same time.
However, that would need a deeper investigation to really be sure.

I think your solution is something we should consider for merging to the
core project. You are welcome to create a JIRA and send a PR with the code
changes.
Thank you.



-- 
Claus Ibsen
-----------------
http://davsclaus.com @davsclaus
Camel in Action 2: https://www.manning.com/ibsen2

Re: Deadlock when concurrently calling addRoutes/removeRoute

Posted by Gerda Ilger <ge...@ecosio.com>.
Hi Claus,

Yes, we're now doing our own locking - essentially, our code just takes
the locks Camel will need later on, in a consistent order.
It feels wrong, though: CamelContext is already putting a lot of
effort into protecting addRoutes/removeRoute-calls.
Streamlining the order in which locks are obtained would fix this very
deadlock - and would probably be beneficial in other places too?

I played around with it a bit: it felt less invasive to lock the model
for the "removeRoute" call than locking CamelContext for the whole
"addRoutes"-call (which is what we're doing now in our code, works as
well), but I know I don't have the big picture.
https://github.com/apache/camel/compare/main...gerdailger:camel:deadlock
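
A caller-side workaround of the kind described above (holding the context
lock around the whole add call) might look roughly like the sketch below.
FakeContext is a hypothetical stand-in, not Camel code: it only mimics a
context whose add path locks model-then-context and whose remove path locks
context-then-model. Which monitors Camel really takes, and in which order,
may differ between versions, so treat this as an illustration of the lock
ordering idea only.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConsistentLockOrderDemo {

    /** Hypothetical stand-in: add and remove take the two monitors in opposite orders. */
    static class FakeContext {
        private final Object model = new Object();
        private final Set<String> routes = ConcurrentHashMap.newKeySet();

        void addRoute(String id) {
            synchronized (model) {         // model first ...
                synchronized (this) {      // ... then context
                    routes.add(id);
                }
            }
        }

        boolean removeRoute(String id) {
            synchronized (this) {          // context first ...
                synchronized (model) {     // ... then model
                    return routes.remove(id);
                }
            }
        }
    }

    /**
     * Caller-side workaround: take the context monitor before calling add, so
     * both paths acquire context before model. Java monitors are reentrant,
     * so the inner synchronized (this) in addRoute does not block.
     */
    static void safeAdd(FakeContext context, String id) {
        synchronized (context) {
            context.addRoute(id);
        }
    }

    /** Runs adds and removes on two threads; returns true if both finish in time. */
    static boolean runConcurrently(int iterations) throws InterruptedException {
        FakeContext context = new FakeContext();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> {
            for (int i = 0; i < iterations; i++) {
                safeAdd(context, "route-" + i);
            }
        });
        pool.submit(() -> {
            for (int i = 0; i < iterations; i++) {
                context.removeRoute("route-" + i);
            }
        });
        pool.shutdown();
        return pool.awaitTermination(30, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("finished without deadlock: " + runConcurrently(10_000));
    }
}
```

The alternative mentioned above, taking the model lock inside the remove
path, applies the same principle from the other side: either way, both
operations end up acquiring the two monitors in one agreed order.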

Either way, let me quickly thank you all for Camel at this point. It
has been working very well for us for years.

Kind regards
Gerda


Re: Deadlock when concurrently calling addRoutes/removeRoute

Posted by Claus Ibsen <cl...@gmail.com>.
Hi

Do your own locking if you make concurrent model changes to Camel.


