Posted to users@activemq.apache.org by rasmusback <ra...@gmail.com> on 2011/03/22 15:31:19 UTC

Clients can get stuck in a reconnect loop with master-slave brokers

Hi,

I'm using the shared file system master/slave setup with two brokers on
separate servers. My clients are configured to use the failover transport
with a URL like this:
failover://(tcp://broker1:61616,tcp://broker2:61616)?randomize=false. I've
noticed that the order of the brokers in the failover URL seems to be
significant. If I start broker2 before broker1, so that broker2 becomes the
master and broker1 the slave, clients will get stuck in a reconnect loop
where they keep trying to connect to broker1.

Attached is a JUnit test case which exhibits the same behavior as my setup.
If the startup order of the brokers is different from their order in the
failover URL, the test will time out. When the order is the same, the test
will pass.

The slave broker opens a socket, so a TCP connection to it is possible even
though the broker functionality isn't enabled. This might be what is
confusing the failover transport.
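
The socket half of that suspicion can be checked without ActiveMQ. The
sketch below uses only the JDK; the class and method names are made up for
illustration, and it says nothing about how the broker or the failover
transport is actually implemented. It only shows that a TCP connect to a
port that is bound and listening succeeds even when the application never
calls accept(), because the operating system completes the handshake by
itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of ActiveMQ.
class IdleListenerDemo {

    /** Returns true if a client can connect to a listener that never calls accept(). */
    static boolean connectsToIdleListener() {
        // Port 0 asks the OS for any free port; the listener is bound but otherwise idle.
        try (ServerSocket listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            InetSocketAddress target =
                    new InetSocketAddress(listener.getInetAddress(), listener.getLocalPort());
            client.connect(target, 2000); // 2 second connect timeout
            return client.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected to idle listener: " + connectsToIdleListener());
    }
}
```

So a client that only looks at whether the TCP connect succeeded cannot tell
an idle listener from a working one.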

I'm not quite sure if my broker configuration is incorrect or if this is a
bug (or feature) in a master slave setup, so any help is much appreciated.
I'm using ActiveMQ 5.4.2 and spring-jms 2.5.5.

   Rasmus

http://activemq.2283324.n4.nabble.com/file/n3396540/FailoverTest.java
FailoverTest.java 

--
View this message in context: http://activemq.2283324.n4.nabble.com/Clients-can-get-stuck-in-a-reconnect-loop-with-master-slave-brokers-tp3396540p3396540.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Clients can get stuck in a reconnect loop with master-slave brokers

Posted by Gary Tully <ga...@gmail.com>.
I've added some sample code to
https://cwiki.apache.org/confluence/display/ACTIVEMQ/How+do+I+embed+a+Broker+inside+a+Connection

On 29 March 2011 15:08, rasmusback <ra...@gmail.com> wrote:
> Hi,
>
> Thanks, that does indeed fix the problem. I misunderstood your first
> reply, since I assumed that
>    BrokerService.addConnector(TransportConnector connector);
> would work similarly to
>    BrokerService.addConnector(String bindAddress);
>
> From the source code I can see that addConnector(String bindAddress)
> immediately opens a socket, while addConnector(TransportConnector
> connector) won't until BrokerService.start() is called (and then only
> if it's the master). Maybe this info could be added to the embedded
> broker page and/or the javadocs?
>
> Thanks for your help,
>    Rasmus



-- 
http://blog.garytully.com
http://fusesource.com

Re: Clients can get stuck in a reconnect loop with master-slave brokers

Posted by rasmusback <ra...@gmail.com>.
Hi,

Thanks, that does indeed fix the problem. I misunderstood your first
reply, since I assumed that
    BrokerService.addConnector(TransportConnector connector);
would work similarly to
    BrokerService.addConnector(String bindAddress);

From the source code I can see that addConnector(String bindAddress)
immediately opens a socket, while addConnector(TransportConnector
connector) won't until BrokerService.start() is called (and then only
if it's the master). Maybe this info could be added to the embedded
broker page and/or the javadocs?

Thanks for your help,
    Rasmus

On Mon, Mar 28, 2011 at 2:38 PM, Gary Tully [via ActiveMQ]
<ml...@n4.nabble.com> wrote:
> using:
>                     BrokerService broker = new BrokerService();
>                     TransportConnector connector = new TransportConnector();
>                     connector.setUri(new URI("tcp://localhost:" + port));
>                     broker.addConnector(connector);
>
> and your test works as expected for me. The slave broker blocks on the
> store lock acquisition and does not listen on its port until it gets
> the store lock.

Re: Clients can get stuck in a reconnect loop with master-slave brokers

Posted by Gary Tully <ga...@gmail.com>.
With the connector added like this:

    BrokerService broker = new BrokerService();
    TransportConnector connector = new TransportConnector();
    connector.setUri(new URI("tcp://localhost:" + port));
    broker.addConnector(connector);

your test works as expected for me. The slave broker blocks on the
store lock acquisition and does not listen on its port until it gets
the store lock.
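
As a rough illustration of that lock-then-listen ordering, here is a
JDK-only sketch. It is not ActiveMQ's implementation: the class
LockThenListen, its methods and the temp lock file are all made up for this
example, and it makes a single non-blocking lock attempt instead of
blocking. The point is only that the listen socket is created after the
lock is held, so an instance that has not become master exposes no port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for a broker: it may listen only once it owns the lock file.
class LockThenListen implements AutoCloseable {
    private final FileChannel channel;
    private FileLock lock;          // non-null only while this instance is "master"
    private ServerSocket listener;  // bound only after the lock is held

    LockThenListen(Path lockFile) {
        try {
            channel = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One non-blocking attempt to take the lock; the port is opened only on success. */
    boolean tryBecomeMaster() {
        try {
            if (lock == null) {
                try {
                    lock = channel.tryLock();
                } catch (OverlappingFileLockException heldInThisJvm) {
                    // Another instance in the same JVM owns the lock; lock stays null.
                }
            }
            if (lock == null) {
                return false; // still "slave": no socket is opened
            }
            if (listener == null) {
                listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isListening() {
        return listener != null && !listener.isClosed();
    }

    @Override
    public void close() {
        try {
            if (listener != null) {
                listener.close();
            }
            if (lock != null && channel.isOpen()) {
                lock.release();
            }
            lock = null;
            channel.close(); // safe to call more than once
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs a two-instance scenario; true if the port followed the lock as expected. */
    static boolean demo() {
        try {
            Path lockFile = Files.createTempFile("store", ".lock");
            try (LockThenListen first = new LockThenListen(lockFile);
                 LockThenListen second = new LockThenListen(lockFile)) {
                boolean firstIsMaster = first.tryBecomeMaster();
                boolean secondStaysClosed = !second.tryBecomeMaster() && !second.isListening();
                first.close(); // the master goes away and releases the lock
                boolean secondTakesOver = second.tryBecomeMaster() && second.isListening();
                return firstIsMaster && secondStaysClosed && secondTakesOver;
            } finally {
                Files.deleteIfExists(lockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling LockThenListen.demo() walks through the case from this thread: the
first instance takes the lock and listens, the second stays closed, and
the second opens its port only after the first has released the lock.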

On 23 March 2011 08:45, rasmusback <ra...@gmail.com> wrote:
> Hi Gary,
>
> On Tue, Mar 22, 2011 at 5:16 PM, Gary Tully [via ActiveMQ]
> <ml...@n4.nabble.com> wrote:
>> The test needs some mods to ensure that the slave broker listen port
>> is only started when the broker becomes the master.
>>
>> In code, the addition of the transportConnector needs to be:
>>
>>         // lazy create transport connector on start completion
>>         TransportConnector connector = new TransportConnector();
>>         connector.setUri(new URI("tcp://localhost:" + listenPort));
>>         broker.addConnector(connector);
>
> Ok, so I need to add a check to the broker initialization before
> adding the connector. There's a waitUntilStarted method in
> BrokerService, is this the one to use? A quick test with
>
> broker.start();
> broker.waitUntilStarted();
> broker.addConnector("tcp://localhost:"+port);
>
> did not make the test pass. I'm looking at the embedded broker FAQ
> page and the BrokerService API documentation, but they don't provide
> much help for this scenario.
>
>> In cases where failover needs to abort you can configure the
>> maxReconnectAttempts to be > 0 and it will fail with an exception
>> after X attempts.
>
> Yep, I tried to keep the number of options to a minimum in the test
> case. In the case I'm describing there is a broker up and running; the
> failover transport just doesn't get around to connecting to it.

Re: Clients can get stuck in a reconnect loop with master-slave brokers

Posted by rasmusback <ra...@gmail.com>.
Hi Gary,

On Tue, Mar 22, 2011 at 5:16 PM, Gary Tully [via ActiveMQ]
<ml...@n4.nabble.com> wrote:
> The test needs some mods to ensure that the slave broker listen port
> is only started when the broker becomes the master.
>
> In code, the addition of the transportConnector needs to be:
>
>         // lazy create transport connector on start completion
>         TransportConnector connector = new TransportConnector();
>         connector.setUri(new URI("tcp://localhost:" + listenPort));
>         broker.addConnector(connector);

Ok, so I need to add a check to the broker initialization before
adding the connector. There's a waitUntilStarted method in
BrokerService, is this the one to use? A quick test with

broker.start();
broker.waitUntilStarted();
broker.addConnector("tcp://localhost:"+port);

did not make the test pass. I'm looking at the embedded broker FAQ
page and the BrokerService API documentation, but they don't provide
much help for this scenario.

> In cases where failover needs to abort you can configure the
> maxReconnectAttempts to be > 0 and it will fail with an exception
> after X attempts.

Yep, I tried to keep the number of options to a minimum in the test
case. In the case I'm describing there is a broker up and running; the
failover transport just doesn't get around to connecting to it.


Re: Clients can get stuck in a reconnect loop with master-slave brokers

Posted by Gary Tully <ga...@gmail.com>.
The test needs some mods to ensure that the slave broker listen port
is only started when the broker becomes the master.

In code, the addition of the transportConnector needs to be:

        // lazy create transport connector on start completion
        TransportConnector connector = new TransportConnector();
        connector.setUri(new URI("tcp://localhost:" + listenPort));
        broker.addConnector(connector);

In cases where failover needs to abort you can configure the
maxReconnectAttempts to be > 0 and it will fail with an exception
after X attempts.
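
For the client side, a small sketch of putting that option on the failover
URL. The option names randomize and maxReconnectAttempts are the ones
mentioned in this thread; the FailoverUrl helper class and the value 5 are
only assumptions for the example. The helper does nothing but string
assembly; the result is what the client would be given as its broker URL.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of ActiveMQ: it only assembles the URL string.
class FailoverUrl {

    /** Builds a failover URL from the broker URIs, kept in the order given. */
    static String build(List<String> brokerUris, boolean randomize, int maxReconnectAttempts) {
        return "failover://(" + String.join(",", brokerUris) + ")"
                + "?randomize=" + randomize
                + "&maxReconnectAttempts=" + maxReconnectAttempts;
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("tcp://broker1:61616", "tcp://broker2:61616");
        // 5 is an arbitrary example value for the attempt limit.
        System.out.println(build(brokers, false, 5));
    }
}
```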
