Posted to users@activemq.apache.org by RickSnz <ri...@altran.es> on 2011/09/20 14:25:11 UTC

Failover reconnection questions

Hello,

I've just started using ActiveMQ, so please excuse me if this has
already been answered.

I'm using failover with two brokers, host1 and host2, and this connection
URI:

failover:(tcp://host1:61617,tcp://host2:61617)

So the consumer randomly connects to one of the brokers. When I stop that
broker, the consumer reconnects to the other one. However, if I then
stop the second one, the program stops instead of trying to reconnect. It
logs the same exception as when it successfully reconnects and starts to
reconnect, but then the program stops. Here are the traces after it connected
to host2, reconnected to host1, and I stopped host1:

14:19:27.433 [ActiveMQ Transport: tcp://host1/ip1:61617] DEBUG
o.a.a.transport.tcp.TcpTransport - Stopping transport tcp://host1/ip1:61617
14:19:27.433 [ActiveMQ Transport: tcp://host1/ip1:61617] WARN 
o.a.a.t.failover.FailoverTransport - Transport (host1/ip1:61617) failed to
tcp://host1:61617 , attempting to automatically reconnect due to:
java.io.EOFException
14:19:27.448 [ActiveMQ Transport: tcp://host1/ip1:61617] DEBUG
o.a.a.t.failover.FailoverTransport - Transport failed with the following
exception:
java.io.EOFException: null
	at java.io.DataInputStream.readInt(DataInputStream.java:375) ~[na:1.6.0_23]
	at
org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
~[activemq-core-5.5.0.jar:5.5.0]
	at
org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
~[activemq-core-5.5.0.jar:5.5.0]
	at
org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
~[activemq-core-5.5.0.jar:5.5.0]
	at
org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
~[activemq-core-5.5.0.jar:5.5.0]
	at java.lang.Thread.run(Thread.java:662) [na:1.6.0_23]
14:19:27.448 [ActiveMQ Transport: tcp://host1/ip1:61617] DEBUG
o.apache.activemq.ActiveMQConnection - transport interrupted, dispatchers: 1
14:19:27.448 [ActiveMQ Transport: tcp://host1/ip1:61617] DEBUG
o.apache.activemq.ActiveMQConnection - notified failover transport
(unconnected) of pending interruption processing for:
ID:28APO0174-2998-1316521115351-0:1
14:19:27.448 [ActiveMQConnection[ID:28APO0174-2998-1316521115351-0:1]
Scheduler] DEBUG o.a.activemq.ActiveMQMessageConsumer -
ID:28APO0174-2998-1316521115351-0:1:1:1 clearing dispatched list (0) on
transport interrupt
14:19:27.448 [ActiveMQConnection[ID:28APO0174-2998-1316521115351-0:1]
Scheduler] DEBUG o.apache.activemq.ActiveMQConnection -
transportInterruptionProcessingComplete for:
ID:28APO0174-2998-1316521115351-0:1
14:19:27.448 [ActiveMQConnection[ID:28APO0174-2998-1316521115351-0:1]
Scheduler] DEBUG o.apache.activemq.ActiveMQConnection - notified failover
transport (unconnected) of interruption completion for:
ID:28APO0174-2998-1316521115351-0:1
14:19:27.448 [ActiveMQ Task-3] DEBUG o.a.a.t.failover.FailoverTransport -
urlList connectionList:[tcp://host2:61617, tcp://host1:61617], from:
[tcp://host1:61617, tcp://host2:61617]
14:19:27.448 [ActiveMQ Task-3] DEBUG o.a.a.t.failover.FailoverTransport -
Attempting connect to: tcp://host2:61617

Also, if I restart the first broker before stopping the second one, the
same thing happens: the program crashes.

I'd like to know whether this is the expected behaviour and, if so, whether
something can be done to make it work as I want.



--
View this message in context: http://activemq.2283324.n4.nabble.com/Failover-reconnection-questions-tp3826592p3826592.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.

Re: Failover reconnection questions

Posted by RickSnz <ri...@altran.es>.
Thanks for your help. Unfortunately, TRACE level doesn't show anything
relevant. However, I've been trying different things, and it seems this
problem happens only after the first 10 seconds of the consumer's activity.
I mean, if I run through the whole process I described really fast (which I
never did before my first post), it works correctly, but once 10 seconds have
passed (since the start of the program, not since the last connection), it
stops trying to reconnect and all activity stops.

By the way, if I stop the first host after those 10 seconds, even the first
reconnection fails.

I've tried different values for both ReconnectDelay options, and the same
thing happens.

Any ideas?


Re: Failover reconnection questions

Posted by RickSnz <ri...@altran.es>.
Any ideas on this problem? What I wrote about the 10 seconds made me look for
configuration parameters that default to 10000 ms, and I found
maxInactivityDurationInitalDelay. I've tried changing it in the consumer
URI:


failover://(tcp://host1:61616?wireFormat.maxInactivityDurationInitalDelay=99999,tcp://host2:61616?wireFormat.maxInactivityDurationInitalDelay=99999)

But it seems it must be changed on the broker too, because I get this output
(only the lines related to this problem):

2011-09-22 11:38:06,855 DEBUG o.a.a.t.WireFormatNegotiator
[WireFormatNegotiator.java:82] Sending: WireFormatInfo { version=7,
properties={CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false,
MaxInactivityDurationInitalDelay=99999, TcpNoDelayEnabled=true,
MaxInactivityDuration=30000, TightEncodingEnabled=true,
StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
2011-09-22 11:38:06,917 DEBUG o.a.a.t.InactivityMonitor
[InactivityMonitor.java:331] Using min of local: WireFormatInfo { version=7,
properties={CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false,
MaxInactivityDurationInitalDelay=99999, TcpNoDelayEnabled=true,
MaxInactivityDuration=30000, TightEncodingEnabled=true,
StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} and remote: WireFormatInfo
{ version=7, properties={CacheSize=1024, CacheEnabled=true,
SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000,
TcpNoDelayEnabled=true, MaxInactivityDuration=30000,
TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
2011-09-22 11:38:06,917 DEBUG o.a.a.t.WireFormatNegotiator
[WireFormatNegotiator.java:118] Received WireFormat: WireFormatInfo {
version=7, properties={CacheSize=1024, CacheEnabled=true,
SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000,
TcpNoDelayEnabled=true, MaxInactivityDuration=30000,
TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
2011-09-22 11:38:06,917 DEBUG o.a.a.t.WireFormatNegotiator
[WireFormatNegotiator.java:125] tcp://host1/10.95.177.237:61617 before
negotiation: OpenWireFormat{version=7, cacheEnabled=false,
stackTraceEnabled=false, tightEncodingEnabled=false,
sizePrefixDisabled=false}
2011-09-22 11:38:06,917 DEBUG o.a.a.t.WireFormatNegotiator
[WireFormatNegotiator.java:140] tcp://host1/10.95.177.237:61617 after
negotiation: OpenWireFormat{version=7, cacheEnabled=true,
stackTraceEnabled=true, tightEncodingEnabled=true, sizePrefixDisabled=false}
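
The "Using min of local ... and remote ..." line above suggests that each inactivity setting ends up as the smaller of the client's and the broker's value. A minimal sketch of that rule, using the numbers from this log, is below. The class and method names are hypothetical, and the assumption that the rule applies to MaxInactivityDurationInitalDelay as well as MaxInactivityDuration is inferred from the log text, not confirmed in this thread.

```java
public class InactivityNegotiationSketch {

    /**
     * Hypothetical helper: returns the smaller of the two advertised values,
     * as the "Using min of local ... and remote ..." log line suggests.
     */
    public static long effective(long localMillis, long remoteMillis) {
        return Math.min(localMillis, remoteMillis);
    }

    public static void main(String[] args) {
        // Values taken from the WireFormatInfo dumps in the log above.
        long clientInitialDelay = 99999L; // set on the consumer URI
        long brokerInitialDelay = 10000L; // value advertised by the broker

        long clientMaxInactivity = 30000L;
        long brokerMaxInactivity = 30000L;

        System.out.println("effective initial delay (ms): "
                + effective(clientInitialDelay, brokerInitialDelay));
        System.out.println("effective max inactivity (ms): "
                + effective(clientMaxInactivity, brokerMaxInactivity));
    }
}
```

If that reading is right, raising the value only on the consumer URI has no effect while the broker still advertises 10000. The same wireFormat option would probably have to be added to the URI of the broker's transport connector as well.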

I would appreciate any help with this; I'm completely stuck if I can't get
failover to work properly. Even help with changing this parameter would be
useful, so I can see whether it's actually my problem.


Re: Failover reconnection questions

Posted by Gary Tully <ga...@gmail.com>.
You will get more information with TRACE level logging for FailoverTransport.

There are a bunch of retry config options, but the default is to
continue to retry forever.
If you want to see what it does/can do, the source is at:
http://svn.apache.org/viewvc/activemq/trunk/activemq-core/src/main/java/org/apache/activemq/transport/failover/FailoverTransport.java?view=markup
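
As a rough illustration of the retry options mentioned above, the sketch below assembles a failover URI with a few of them set explicitly. The class and helper names are hypothetical and the values are arbitrary examples. initialReconnectDelay, maxReconnectDelay and useExponentialBackOff are failover transport options; maxReconnectAttempts is deliberately left at its default, since the meaning of its sentinel values has differed between ActiveMQ versions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class FailoverUriSketch {

    /** Hypothetical helper: joins broker URIs and failover options into one URI string. */
    public static String buildFailoverUri(List<String> brokers, Map<String, String> options) {
        StringJoiner hosts = new StringJoiner(",", "failover:(", ")");
        for (String broker : brokers) {
            hosts.add(broker);
        }
        if (options.isEmpty()) {
            return hosts.toString();
        }
        // Options on the failover transport itself go after the closing parenthesis.
        StringJoiner query = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> option : options.entrySet()) {
            query.add(option.getKey() + "=" + option.getValue());
        }
        return hosts.toString() + query.toString();
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("tcp://host1:61617", "tcp://host2:61617");

        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("initialReconnectDelay", "100");  // ms before the first retry (example value)
        options.put("maxReconnectDelay", "5000");     // upper bound between retries (example value)
        options.put("useExponentialBackOff", "true"); // grow the delay between attempts

        System.out.println(buildFailoverUri(brokers, options));
    }
}
```

The resulting string would be passed wherever the plain failover:(...) URI is used today, for example as the broker URL of the connection factory.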



-- 
http://fusesource.com
http://blog.garytully.com