Posted to users@servicemix.apache.org by Ryan Moquin <fr...@gmail.com> on 2010/04/30 00:39:36 UTC

Question about message exchanges sent to a clustered service that had crashed

I've run into a situation several times where an asynchronous message
exchange hangs when it is sent to a service running on a clustered
ServiceMix instance that has unexpectedly lost its network connection or
crashed.  When a ServiceMix instance shuts down normally, it broadcasts
service deregistrations to all other clustered ServiceMix instances.  If a
clustered ServiceMix instance crashes or loses connectivity to the rest of
the cluster, the other instances have no way of knowing that they can no
longer route messages to its services.  If a message is routed to a service
on that unreachable ServiceMix instance, the message exchange seems to hang
in limbo.  The service that sent it is never notified that the message
exchange can't be delivered.  You could of course, after a certain period of
time, assume that the message exchange isn't going to be returned, but I
don't think that will free up the thread being consumed by the waiting
message exchange.
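
As a rough illustration of the "assume it isn't coming back after a certain
period" workaround described above, here is a minimal plain-Java sketch.  It
does not use any ServiceMix or JBI API: PendingExchangeTracker, track,
completed and expire are hypothetical names, and the exchange is represented
only by its id.  It shows how a sender could be notified of a lost exchange
from a shared timer instead of blocking one thread per exchange; whether
resources held inside the container are released is outside its scope.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper (not a ServiceMix class): expires exchanges that never get a reply. */
public class PendingExchangeTracker {
    // One shared timer thread serves all pending exchanges.
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    // Exchange id -> callback to run if the exchange times out.
    private final Map<String, Consumer<String>> pending = new ConcurrentHashMap<>();

    /** Call right after sending an exchange asynchronously. */
    public void track(String exchangeId, long timeoutMs, Consumer<String> onTimeout) {
        pending.put(exchangeId, onTimeout);
        timer.schedule(() -> expire(exchangeId), timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Call when the reply (or DONE/ERROR status) arrives; returns false if already expired. */
    public boolean completed(String exchangeId) {
        return pending.remove(exchangeId) != null;
    }

    /** Fires the timeout callback at most once; a no-op if the exchange already completed. */
    public void expire(String exchangeId) {
        Consumer<String> callback = pending.remove(exchangeId);
        if (callback != null) {
            callback.accept(exchangeId);
        }
    }

    public int pendingCount() {
        return pending.size();
    }

    /** Stops the timer thread so the JVM can exit. */
    public void shutdown() {
        timer.shutdownNow();
    }
}
```

A sender would call track(id, timeout, callback) after each asynchronous
send and completed(id) when the answer arrives.  The timer task for a
completed exchange still runs at its deadline but finds nothing to expire.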

ServiceMix doesn't appear to have a default way of detecting that a
clustered service is no longer routable, so that a message exchange doesn't
end up hanging in limbo.  Or do MessageExchanges have timeouts that are
simply set to a very long duration by default?

Any help on how to handle this situation properly would be appreciated.  It
would be nice if there were a way for ServiceMix instances to detect when an
ActiveMQ NetworkConnector has lost its connection to another ServiceMix
instance and then deregister that instance's services so that messages can't
be routed to them anymore.
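
To make the requested behaviour concrete, the following is a minimal
plain-Java sketch of a registry that drops every service belonging to a peer
once that peer is reported lost.  It is an assumption-laden illustration
only: RemoteServiceRegistry, register, onPeerLost and isRoutable are
hypothetical names, and how the "peer lost" event would be obtained from
ActiveMQ or ServiceMix is not shown.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry (not a ServiceMix class) of services known per remote node. */
public class RemoteServiceRegistry {
    // Node id -> names of services registered by that node.
    private final Map<String, Set<String>> servicesByNode = new ConcurrentHashMap<>();

    /** Records a service registration broadcast by a remote node. */
    public void register(String nodeId, String serviceName) {
        servicesByNode
            .computeIfAbsent(nodeId, k -> ConcurrentHashMap.<String>newKeySet())
            .add(serviceName);
    }

    /** Drops all services of a node whose connection was lost; returns what was removed. */
    public Set<String> onPeerLost(String nodeId) {
        Set<String> removed = servicesByNode.remove(nodeId);
        return removed == null ? Collections.<String>emptySet() : removed;
    }

    /** True if at least one still-connected node offers the service. */
    public boolean isRoutable(String serviceName) {
        for (Set<String> services : servicesByNode.values()) {
            if (services.contains(serviceName)) {
                return true;
            }
        }
        return false;
    }
}
```

With something like this in place, a router could check isRoutable before
sending and fail the exchange immediately instead of leaving it pending.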

Ryan