You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@activemq.apache.org by khandelwalanuj <an...@gmail.com> on 2017/09/18 14:25:55 UTC

"allowLinkStealing" as a client configuration

Hi,

I am using ActiveMQv5.14. 

We observed issues with durable java subscribers using TCP connector that
lot of time after client/broker restart clients were not able to connect
back to the broker and see below exception *"connected from null"*

Lot of time after client restart, we get below exception in logs : 
[2017-09-16 14:15:59.540-0400] [ActiveMQ NIO Worker 3]
[org.apache.activemq.broker.TransportConnection] [WARN]
[TransportConnection.java:org.apache.activemq.broker.TransportConnection:processAddConnection()::858]
[Failed to add Connection
id=ID:app1a.prod.c.ia55.net-46199-1505585017494-3:2,
clientId=integratedapp1a.prod.c.ia55.net_conn_listener due to
javax.jms.InvalidClientIDException: Broker: mqbrokerbaly - Client:
integratedapp1a.prod.c.ia55.net_conn_listener already *connected from null]
*

When we check in Jconsole, we don;t find any clients on the topic with the
same client id. 

*To avoid this I wanted to use "allowLinkStealing" configuration in
ActiveMQ, I found it working when I specify this in broker transport TCP
configuration. But we don't want to enable it for all, Is there a way we can
specify this from the client side instead of add in the broker connector
configuration ? 
*


Thanks,
Anuj




--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by bansalp <bp...@gmail.com>.
I also faced this issue. I am using ActiveMQv5.14. 

Steps to reproduce:

 * Connect a durable consumer with some client-id say CLIENT1
 * Start another durable consumer with same client-id (CLIENT1)
 * Stop the first consumer.

We expected that second consumer should get connected automatically but that
is not the case. 

Second consumer get "Client: CLIENT1 already connected from null."

We are observing this issue when using
"org.apache.activemq.pool.PooledConnectionFactory" but not when using
"org.apache.activemq.ActiveMQConnectionFactory".

Can we check this?



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by Tim Bain <tb...@alumni.duke.edu>.
Hmmm, interesting. That means one of two things:

1. Following the broker restart, multiple client processes/threads are
trying to connect using the same client ID. For a given client ID, the
first process/thread to attempt the connection will succeed, and the later
ones will fail. I would expect that enabling link stealing would result in
either a stalemate where all threads keep detecting that the link was
stolen from them and trying to steal it back, or a situation where the last
process/thread steals the link and keeps it. I expect the former, but have
never used link stealing so I'm not sure if that's actually how it works.
2. Following a broker restart, only one process/thread is using a given
client ID, but before it even connects the broker believes that a client is
already connected for that client ID and rejects all connections for that
client ID, which would be a bug. Enabling link stealing would result in
normal operation.

Based on what you've previously observed, can you tell which of the two
situations is actually occurring? I'd be looking for something like "I
examined all the connections in the MBeans tab of JConsole when this
problem was occurring, and I determined that none of them matched the
client ID that was getting rejected" to show conclusively that it was
situation #2 rather than #1, rather than something like "I'm sure I don't
have two process/threads for the same client ID."

If it's #2, is your organization open to attaching a debugger to the
production broker the next time this occurs, to allow you to step through
the broker initialization code to help us figure out what's going on?

Tim

On Sep 28, 2017 5:14 AM, "khandelwalanuj" <an...@gmail.com>
wrote:

> > Does restarting the broker without enabling link stealing also let you
> work
> > around the problem?
> Unfortunately No.  Restarting broker also doesn't solve the problem. Only
> when you restart it with link stealing resolves the issue.
>
> Thanks,
> Anuj
>
>
>
> --
> Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-
> f2341805.html
>

Re: "allowLinkStealing" as a client configuration

Posted by khandelwalanuj <an...@gmail.com>.
> Does restarting the broker without enabling link stealing also let you work 
> around the problem?
Unfortunately No.  Restarting broker also doesn't solve the problem. Only
when you restart it with link stealing resolves the issue.

Thanks,
Anuj



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by Tim Bain <tb...@alumni.duke.edu>.
Does restarting the broker without enabling link stealing also let you work
around the problem? Based on what you've described as working vs. not, I
would expect that to be a viable workaround. And if it's not, that might
give more information about what's actually going on.

Tim

On Sep 27, 2017 11:05 PM, "khandelwalanuj" <an...@gmail.com>
wrote:

> Hi,
>
> This problem is non-reproducible. I have tried many times with different
> configuration but not able to reproduce it. Even our clients using ActiveMQ
> sometimes face this suddenly in prod environment where dev/qa works fine.
>
> The worst part about this issues is to fix it. This does not get resolved
> even by restarting the client application. Only way to fix it is:
>  1. either ask clients to change the client id (in which they loose all the
> messages sent in between)
>  2. Or restart ActiveMQ broker by enabling "allowLinkStealing"
> configuration.
>
> We have not only faced this problem now but also earlier when we were using
> ActiveMQv5.10 couple of years back.
>
>
> On the client side we use pooledCOnnectionFactory with maxConnection=1 as
> suggested by Tim in
> (http://activemq.2283324.n4.nabble.com/javax-jms-
> InvalidClientIDException-for-durable-subscription-on-
> broker-restart-td4684381.html).
>
> Thanks,
> Anuj
>
>
>
> --
> Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-
> f2341805.html
>

Re: "allowLinkStealing" as a client configuration

Posted by khandelwalanuj <an...@gmail.com>.
Hi,

This problem is non-reproducible. I have tried many times with different
configuration but not able to reproduce it. Even our clients using ActiveMQ
sometimes face this suddenly in prod environment where dev/qa works fine.

The worst part about this issues is to fix it. This does not get resolved
even by restarting the client application. Only way to fix it is:
 1. either ask clients to change the client id (in which they loose all the
messages sent in between)
 2. Or restart ActiveMQ broker by enabling "allowLinkStealing"
configuration.

We have not only faced this problem now but also earlier when we were using
ActiveMQv5.10 couple of years back.


On the client side we use pooledCOnnectionFactory with maxConnection=1 as
suggested by Tim in
(http://activemq.2283324.n4.nabble.com/javax-jms-InvalidClientIDException-for-durable-subscription-on-broker-restart-td4684381.html). 

Thanks,
Anuj



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by Tim Bain <tb...@alumni.duke.edu>.
I finally got time on a computer (not my phone) to look at this.

First, the reason that you're not seeing a stack trace is our fault, not
the result of the JVM optimizing away the stack trace. If you look at line
858 of
https://github.com/apache/activemq/blob/activemq-5.14.x/activemq-broker/src/main/java/org/apache/activemq/broker/TransportConnection.java,
you'll see that we're including the exception object inline in the message
(which just calls toString() without providing stack trace info) rather
than passing it as a second argument after the message.

Thankfully there's only one place in the code where that error message
occurs, which is
activemq-broker/src/main/java/org/apache/activemq/broker/region/RegionBroker.java
<https://github.com/apache/activemq/blob/master/activemq-broker/src/main/java/org/apache/activemq/broker/region/RegionBroker.java>
line 255. Unfortunately, the null value at that line is the result of
evaluating oldContext.getConnection().getRemoteAddress()), and it's not
immediately obvious why that address either was never set to a non-null
value or was later overwritten with a null value. It's possible that this
null value will turn out to be relevant to your actual question, but for
now let's treat it as unrelated and see if we can solve your primary
question.

Are you able to reproduce this problem in a non-operational environment
where you can attach a debugger? I think the next step here will be to use
a debugger to examine the logic for disconnecting the old connection and
then re-connecting the new one, so if you don't yet have a way to do that,
that would be your next step. Either that, or you could write a unit test
that reproduces the problem against an embedded broker, and then someone
other than you could step through that test with a debugger.

One other question: does the re-connection work if you wait longer than
5-10 seconds? If you wait a minute, or five, does the re-connection
succeed? I'm not suggesting that's the desired behavior, I just want to
better characterize whether this is a timing issue/race condition, or
whether the bad behavior is unrelated to timing.

Tim

On Thu, Sep 21, 2017 at 6:44 AM, khandelwalanuj <
anuj.cool.khandelwal@gmail.com> wrote:

> I can only see warning message in broker logs:
>
> [2017-09-16 14:15:59.540-0400] [ActiveMQ NIO Worker 3]
> [org.apache.activemq.broker.TransportConnection] [WARN]
> [TransportConnection.java:org.apache.activemq.broker.TransportConnection:
> processAddConnection()::858]
> [Failed to add Connection
> id=ID:app1a.prod.c.ia55.net-46199-1505585017494-3:2,
> clientId=integratedapp1a.prod.c.ia55.net_conn_listener due to
> javax.jms.InvalidClientIDException: Broker: mqbrokerbaly - Client:
> integratedapp1a.prod.c.ia55.net_conn_listener already *connected from
> null]
> *
>
> On the client side the exception received is :
> [ 2017-09-16 18:01:59,897 [EngineTriggerSide2MessageListenerContainer-1]
> []
> listener.DefaultMessageListenerContainer.refreshConnectionUntilSuccessf
> ul():907
> ERROR]: Could
> not refresh JMS Connection for destination 'topic://RecNotification' -
> retrying in 5000 ms. Cause: Broker: mqbroker- Client:
> integratedapp1a.prod.c.ia55.net_conn_listener already connected from null
> [ 2017-09-16 18:01:59,974 [ActiveMQ Transport:
> tcp://mqbroker/*:*:*:*:61616@57384] []
> failover.FailoverTransport.handleTransportFailure():260  WARN ]: Transport
> (tcp://mqbrok
> er/10.12.19.62:61616@57384) failed, reason:  , attempting to automatically
> reconnect
> java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(
> OpenWireFormat.java:258)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(
> TcpTransport.java:221)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.doRun(
> TcpTransport.java:213)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> > When you restart a client, how do you you stop the client (hard crash,
> > kill
> > -9, graceful shutdown, etc.) and how long do you wait before starting it
> > again? Could this be a situation where the broker simply hasn't
> discovered
> > that the previous TCP connection is dead?
> It's a graceful shutdown and client gets restarted quickly (~5 to 10
> seconds) after the stop.
>
>
>
>
> Thanks,
> Anuj
>
>
>
> --
> Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-
> f2341805.html
>

Re: "allowLinkStealing" as a client configuration

Posted by khandelwalanuj <an...@gmail.com>.
I can only see warning message in broker logs: 

[2017-09-16 14:15:59.540-0400] [ActiveMQ NIO Worker 3] 
[org.apache.activemq.broker.TransportConnection] [WARN] 
[TransportConnection.java:org.apache.activemq.broker.TransportConnection:processAddConnection()::858] 
[Failed to add Connection 
id=ID:app1a.prod.c.ia55.net-46199-1505585017494-3:2, 
clientId=integratedapp1a.prod.c.ia55.net_conn_listener due to 
javax.jms.InvalidClientIDException: Broker: mqbrokerbaly - Client: 
integratedapp1a.prod.c.ia55.net_conn_listener already *connected from null] 
* 

On the client side the exception received is : 
[ 2017-09-16 18:01:59,897 [EngineTriggerSide2MessageListenerContainer-1] []
listener.DefaultMessageListenerContainer.refreshConnectionUntilSuccessful():907 
ERROR]: Could 
not refresh JMS Connection for destination 'topic://RecNotification' -
retrying in 5000 ms. Cause: Broker: mqbroker- Client:
integratedapp1a.prod.c.ia55.net_conn_listener already connected from null
[ 2017-09-16 18:01:59,974 [ActiveMQ Transport:
tcp://mqbroker/*:*:*:*:61616@57384] []
failover.FailoverTransport.handleTransportFailure():260  WARN ]: Transport
(tcp://mqbrok
er/10.12.19.62:61616@57384) failed, reason:  , attempting to automatically
reconnect
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at
org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:258)
        at
org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:221)
        at
org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:213)
        at
org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
        at java.lang.Thread.run(Thread.java:745)



> When you restart a client, how do you you stop the client (hard crash,
> kill 
> -9, graceful shutdown, etc.) and how long do you wait before starting it 
> again? Could this be a situation where the broker simply hasn't discovered 
> that the previous TCP connection is dead?
It's a graceful shutdown and client gets restarted quickly (~5 to 10
seconds) after the stop.




Thanks,
Anuj



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by Tim Bain <tb...@alumni.duke.edu>.
Do you get a stack trace after that line in the logs? I spent a small
amount of time trying to figure out where that InvalidClientIDException is
thrown so I could look at why that null might be getting in there, but it
wasn't immediately obvious so a stack trace would help.

Note that the JRE will stop putting stack traces on the exception (because
computing them is expensive) after it encounters enough instances of the
same exception, so if there's no stack trace you may need to go back in the
logs to shortly after the broker was started (before the JRE had seen
enough instances of this exception to optimize away the stack trace), or to
restart the broker and trigger the exception by reconnecting a client.

When you restart a client, how do you you stop the client (hard crash, kill
-9, graceful shutdown, etc.) and how long do you wait before starting it
again? Could this be a situation where the broker simply hasn't discovered
that the previous TCP connection is dead?

Tim

On Sep 18, 2017 11:34 PM, "khandelwalanuj" <an...@gmail.com>
wrote:

Any reason why this is happening? It says connected from "NULL" ? Sounds
like
a bug.

Thanks,
Anuj



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by khandelwalanuj <an...@gmail.com>.
Any reason why this is happening? It says connected from "NULL" ? Sounds like
a bug. 

Thanks,
Anuj



--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html

Re: "allowLinkStealing" as a client configuration

Posted by Timothy Bish <ta...@gmail.com>.
On 09/18/2017 10:25 AM, khandelwalanuj wrote:
> Hi,
>
> I am using ActiveMQv5.14.
>
> We observed issues with durable java subscribers using TCP connector that
> lot of time after client/broker restart clients were not able to connect
> back to the broker and see below exception *"connected from null"*
>
> Lot of time after client restart, we get below exception in logs :
> [2017-09-16 14:15:59.540-0400] [ActiveMQ NIO Worker 3]
> [org.apache.activemq.broker.TransportConnection] [WARN]
> [TransportConnection.java:org.apache.activemq.broker.TransportConnection:processAddConnection()::858]
> [Failed to add Connection
> id=ID:app1a.prod.c.ia55.net-46199-1505585017494-3:2,
> clientId=integratedapp1a.prod.c.ia55.net_conn_listener due to
> javax.jms.InvalidClientIDException: Broker: mqbrokerbaly - Client:
> integratedapp1a.prod.c.ia55.net_conn_listener already *connected from null]
> *
>
> When we check in Jconsole, we don;t find any clients on the topic with the
> same client id.
>
> *To avoid this I wanted to use "allowLinkStealing" configuration in
> ActiveMQ, I found it working when I specify this in broker transport TCP
> configuration. But we don't want to enable it for all, Is there a way we can
> specify this from the client side instead of add in the broker connector
> configuration ?
> *
>
>
No, the broker needs to be configured to do that, client's cannot alter 
the link stealing configuration.


-- 
Tim Bish
twitter: @tabish121
blog: http://timbish.blogspot.com/