Posted to users@activemq.apache.org by smantri <sh...@infor.com> on 2019/04/23 20:39:40 UTC

Failover Transport hangs forever on connection.start() in Master/Slave

Hi, 

I have a Master/Slave broker URL for the failover transport as follows: 
(ssl://brokerurl1:61616,ssl://brokerurl2:61616)?timeout=5000&startupMaxReconnectAttempts=5&maxReconnectAttempts=5 

The client picks one broker URL at random. If the URL it picks is the
slave, the call hangs forever at connection.start(). The failover transport
options mentioned above are not taken into account in this case.
I don't want to set the randomize option to false. 

Can anyone help me understand how I can make it respond instead of hanging
forever? Does anyone know why the transport options (timeout, reconnect
attempts) are not being considered? 
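
For reference, here is a minimal sketch of how a failover URI carrying these
options might be assembled in Java. The "failover:" prefix is the failover
transport's URI scheme. The class and method names (FailoverUriSketch,
failoverUri) are hypothetical, and the broker host names are the placeholders
from the URL above, not real hosts.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FailoverUriSketch {

    // Hypothetical helper: joins broker URIs and transport options into
    // the form failover:(uri1,uri2)?key1=value1&key2=value2
    static String failoverUri(List<String> brokers, Map<String, String> options) {
        String brokerList = String.join(",", brokers);
        String query = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return "failover:(" + brokerList + ")" + (query.isEmpty() ? "" : "?" + query);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("timeout", "5000");
        options.put("startupMaxReconnectAttempts", "5");
        options.put("maxReconnectAttempts", "5");

        // Placeholder host names taken from the post.
        String uri = failoverUri(
                List.of("ssl://brokerurl1:61616", "ssl://brokerurl2:61616"),
                options);
        System.out.println(uri);
    }
}
```

The resulting string would typically be passed as the broker URL when
creating the connection factory.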

Thanks!




--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: Failover Transport hangs forever on connection.start() in Master/Slave

Posted by smantri <sh...@infor.com>.
Hi,

The following is the debug log for a successful connection to ActiveMQ:

#########################################
2019-04-30 20:51:52 DEBUG TaskRunnerFactory:91 - Initialized
TaskRunnerFactory[ActiveMQ Task] using ExecutorService:
java.util.concurrent.ThreadPoolExecutor@6504e3b2[Running, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 0]
2019-04-30 20:51:52 DEBUG FailoverTransport:753 - Reconnect was triggered
but transport is not started yet. Wait for start to connect the transport.
2019-04-30 20:51:52 DEBUG FailoverTransport:330 - Started unconnected
2019-04-30 20:51:52 DEBUG FailoverTransport:746 - Waking up reconnect task
2019-04-30 20:51:52 TRACE TaskRunnerFactory:174 - Created thread[ActiveMQ
Task-1]: Thread[ActiveMQ Task-1,5,main]
2019-04-30 20:51:52 TRACE PooledTaskRunner:128 - Running task iteration 0 -
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
2019-04-30 20:51:52 TRACE FailoverTransport:589 - Waiting for transport to
reconnect..: ConnectionInfo {commandId = 1, responseRequired = true,
connectionId = ID:*****-23291-1556637712547-1:1, clientId =
ID:******-23291-1556637712547-0:1, clientIp = null, userName = ****,
password = *****, brokerPath = null, brokerMasterConnector = false,
manageable = true, clientMaster = true, faultTolerant = true,
failoverReconnect = false}
2019-04-30 20:51:52 DEBUG FailoverTransport:786 - urlList
connectionList:[ssl://master:61616, ssl://slave:61616], from:
[ssl://master:61616, ssl://slave:61616]
2019-04-30 20:51:54 DEBUG FailoverTransport:990 - Attempting  0th  connect
to: ssl://master:61616
2019-04-30 20:51:54 DEBUG WireFormatNegotiator:82 - Sending: WireFormatInfo
{ version=9, properties={TcpNoDelayEnabled=true, SizePrefixDisabled=false,
CacheSize=1024, StackTraceEnabled=true, CacheEnabled=true,
TightEncodingEnabled=true, MaxFrameSize=9223372036854775807,
MaxInactivityDuration=30000, MaxInactivityDurationInitalDelay=10000},
magic=[A,c,t,i,v,e,M,Q]}
2019-04-30 20:51:54 TRACE TcpTransport:192 - TCP consumer thread for
ssl://master/10.39.162.116:61616 starting
2019-04-30 20:51:55 DEBUG FailoverTransport:1000 - Connection established
2019-04-30 20:51:55 DEBUG InactivityMonitor:92 - Using min of local:
WireFormatInfo { version=9, properties={TcpNoDelayEnabled=true,
SizePrefixDisabled=false, CacheSize=1024, StackTraceEnabled=true,
CacheEnabled=true, TightEncodingEnabled=true,
MaxFrameSize=9223372036854775807, MaxInactivityDuration=30000,
MaxInactivityDurationInitalDelay=10000}, magic=[A,c,t,i,v,e,M,Q]} and
remote: WireFormatInfo { version=12, properties={TcpNoDelayEnabled=true,
SizePrefixDisabled=false, CacheSize=1024, ProviderName=ActiveMQ,
StackTraceEnabled=true, PlatformDetails=Java, CacheEnabled=true,
TightEncodingEnabled=true, MaxFrameSize=9223372036854775807,
MaxInactivityDuration=30000, MaxInactivityDurationInitalDelay=10000,
ProviderVersion=5.15.3}, magic=[A,c,t,i,v,e,M,Q]}
2019-04-30 20:51:55 INFO  FailoverTransport:1030 - Successfully connected to
ssl://master:61616
2019-04-30 20:51:55 DEBUG WireFormatNegotiator:118 - Received WireFormat:
WireFormatInfo { version=12, properties={TcpNoDelayEnabled=true,
SizePrefixDisabled=false, CacheSize=1024, ProviderName=ActiveMQ,
StackTraceEnabled=true, PlatformDetails=Java, CacheEnabled=true,
TightEncodingEnabled=true, MaxFrameSize=9223372036854775807,
MaxInactivityDuration=30000, MaxInactivityDurationInitalDelay=10000,
ProviderVersion=5.15.3}, magic=[A,c,t,i,v,e,M,Q]}
2019-04-30 20:51:55 DEBUG WireFormatNegotiator:125 -
ssl://master/10.39.162.116:61616 before negotiation:
OpenWireFormat{version=9, cacheEnabled=false, stackTraceEnabled=false,
tightEncodingEnabled=false, sizePrefixDisabled=false,
maxFrameSize=9223372036854775807}
2019-04-30 20:51:55 TRACE TaskRunnerFactory:174 - Created thread[ActiveMQ
Task-2]: Thread[ActiveMQ Task-2,5,main]
2019-04-30 20:51:55 DEBUG WireFormatNegotiator:140 -
ssl://master/10.39.162.116:61616 after negotiation:
OpenWireFormat{version=9, cacheEnabled=true, stackTraceEnabled=true,
tightEncodingEnabled=true, sizePrefixDisabled=false,
maxFrameSize=9223372036854775807}
2019-04-30 20:51:55 TRACE PooledTaskRunner:49 - Run task done:
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
2019-04-30 20:51:55 TRACE PooledTaskRunner:128 - Running task iteration 0 -
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
2019-04-30 20:51:55 TRACE PooledTaskRunner:49 - Run task done:
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
Connection established......
2019-04-30 20:51:56 DEBUG FailoverTransport:746 - Waking up reconnect task
2019-04-30 20:51:56 TRACE PooledTaskRunner:128 - Running task iteration 0 -
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
2019-04-30 20:51:56 TRACE PooledTaskRunner:49 - Run task done:
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
2019-04-30 20:51:56 TRACE PooledTaskRunner:128 - Running task iteration 0 -
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
2019-04-30 20:51:56 TRACE PooledTaskRunner:49 - Run task done:
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0
Created producer.....
2019-04-30 20:51:56 TRACE ActiveMQSession:1779 -
ID:****-23291-1556637712547-1:1:1 sending message: ActiveMQTextMessage
{commandId = 0, responseRequired = false, messageId =
ID:****-23291-1556637712547-1:1:1:1:1, originalDestination = null,
originalTransactionId = null, producerId =
ID:****-23291-1556637712547-1:1:1:1, destination =
topic://VirtualTopic.ifsgroups, transactionId = null, expiration = 0,
timestamp = 1556637716299, arrival = 0, brokerInTime = 0, brokerOutTime = 0,
correlationId = null, replyTo = null, persistent = true, type = null,
priority = 4, groupID = null, groupSequence = 0, targetConsumerId = null,
compressed = false, userID = null, content = null, marshalledProperties =
null, dataStructure = null, redeliveryCounter = 0, size = 0, properties =
null, readOnlyProperties = true, readOnlyBody = true, droppable = false,
text = Message from Producer IFS:0:Tue Apr 30 20:51:56 IST 2019}
Message Sent Successfully......
#########################################

The following is the debug log for an unsuccessful connection attempt to ActiveMQ:
#############################################
2019-04-30 20:54:00 DEBUG TaskRunnerFactory:91 - Initialized
TaskRunnerFactory[ActiveMQ Task] using ExecutorService:
java.util.concurrent.ThreadPoolExecutor@6504e3b2[Running, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 0]
2019-04-30 20:54:00 DEBUG FailoverTransport:753 - Reconnect was triggered
but transport is not started yet. Wait for start to connect the transport.
2019-04-30 20:54:00 DEBUG FailoverTransport:330 - Started unconnected
2019-04-30 20:54:00 DEBUG FailoverTransport:746 - Waking up reconnect task
2019-04-30 20:54:00 TRACE TaskRunnerFactory:174 - Created thread[ActiveMQ
Task-1]: Thread[ActiveMQ Task-1,5,main]
2019-04-30 20:54:00 TRACE PooledTaskRunner:128 - Running task iteration 0 -
org.apache.activemq.transport.failover.FailoverTransport$2@13f707e0                            
2019-04-30 20:54:00 DEBUG FailoverTransport:786 - urlList
connectionList:[ssl://slave:61616, ssl://master:61616], from:
[ssl://master:61616, ssl://slave:61616]                                                                                                          
2019-04-30 20:54:02 DEBUG FailoverTransport:990 - Attempting  0th  connect
to: ssl://slave:61616
2019-04-30 20:54:02 DEBUG WireFormatNegotiator:82 - Sending: WireFormatInfo
{ version=9, properties={TcpNoDelayEnabled=true, SizePrefixDisabled=false,
CacheSize=1024, StackTraceEnabled=true, CacheEnabled=true,
TightEncodingEnabled=true, MaxFrameSize=9223372036854775807,
MaxInactivityDuration=30000, MaxInactivityDurationInitalDelay=10000},
magic=[A,c,t,i,v,e,M,Q]}
2019-04-30 20:54:02 TRACE TcpTransport:192 - TCP consumer thread for
ssl://slave/10.39.163.87:61616 starting
#############################################

I found the reason the transport options are not being considered: because
the connection attempt has not actually failed, the reconnect attempts do
not come into play. 

Could you please help with a workaround for the above scenario?
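
One possible client-side workaround, sketched here under assumptions rather
than taken from the thread, is to run the blocking call on a separate thread
and bound the wait with Future.get. The class and method names
(StartWithTimeoutSketch, callWithTimeout) are hypothetical, and the sleeping
task in main only stands in for a hanging connection.start(). This does not
fix the underlying hang: a thread blocked on a monitor cannot be interrupted,
so the worker is made a daemon thread so that it cannot keep the JVM alive.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StartWithTimeoutSketch {

    // Hypothetical helper: runs the task on a daemon worker thread and
    // throws TimeoutException if it has not finished within timeoutMillis.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "connection-start");
            t.setDaemon(true); // a stuck worker must not block JVM exit
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // best effort; may not unblock the worker
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            // Stand-in for a call that hangs, such as connection.start().
            callWithTimeout(() -> { Thread.sleep(60_000); return null; }, 500);
        } catch (TimeoutException e) {
            System.out.println("start did not complete in time");
        }
    }
}
```

With a real JMS connection the task would be something like
() -> { connection.start(); return null; }, and the caller would decide
whether to retry with a fresh connection after a timeout.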

Thanks!




Re: Failover Transport hangs forever on connection.start() in Master/Slave

Posted by Arthur Naseef <ar...@amlinv.com>.
Can you upload the full stack dump from the client application when this
happens as a GIST on github.com and send the link?  It's OK to filter out
any proprietary bits as long as all of the threads that have ActiveMQ in
the call history are listed.

BTW, the "oneWay" method sends a message over the transport - one way - and
is not involved in reconnect logic itself.  So blocking on that
reconnectMutex is normal given that somehow the reconnect logic is failing
to connect.  BTW with these settings,
startupMaxReconnectAttempts=5&maxReconnectAttempts=5, the transport will
stop attempting to reconnect - is it possible the application has failed
that many times and therefore just given up on attempting to reconnect?

Art


On Thu, Apr 25, 2019 at 7:28 AM smantri <sh...@infor.com> wrote:

> I am using ActiveMQ version 5.15.8. I see the issue when the client first
> tries to connect to the ActiveMQ broker: if the randomly picked broker URL
> happens to be the slave, the call is stuck forever in the FailoverTransport
> class at the following point:
>
> ##############################
>  @Override
>     public void oneway(Object o) throws IOException {
>
>         Command command = (Command) o;
>         Exception error = null;
>         try {
>
>             synchronized (reconnectMutex) {         <<<<<< blocked here >>>>>
>
> ##############################
>
> The call stays stuck at this point until I bring down one of the ActiveMQ
> instances (either master or slave).
>
>
>
> mikmela wrote
> > You haven't mentioned a version of your activemq, there were some issues
> > with
> > that in older versions...
> > We're on 5.6.0 and above - no issues...
> > See
> > http://activemq.apache.org/failover-transport-reference
> > <http://activemq.apache.org/failover-transport-reference>

Re: Failover Transport hangs forever on connection.start() in Master/Slave

Posted by smantri <sh...@infor.com>.
I am using ActiveMQ version 5.15.8. I see the issue when the client first
tries to connect to the ActiveMQ broker: if the randomly picked broker URL
happens to be the slave, the call is stuck forever in the FailoverTransport
class at the following point:

##############################
 @Override
    public void oneway(Object o) throws IOException {

        Command command = (Command) o;
        Exception error = null;
        try {

            synchronized (reconnectMutex) {         <<<<<< blocked here >>>>>

##############################

The call stays stuck at this point until I bring down one of the ActiveMQ
instances (either master or slave). 



mikmela wrote
> You haven't mentioned a version of your activemq, there were some issues
> with
> that in older versions...
> We're on 5.6.0 and above - no issues...  
> See
> http://activemq.apache.org/failover-transport-reference
> <http://activemq.apache.org/failover-transport-reference>

Re: Failover Transport hangs forever on connection.start() in Master/Slave

Posted by mikmela <mi...@yahoo.com>.
You haven't mentioned which version of ActiveMQ you are using; there were
some issues with this in older versions.
We're on 5.6.0 and above, with no issues.
See
http://activemq.apache.org/failover-transport-reference


