Posted to users@activemq.apache.org by Tim Bain <tb...@alumni.duke.edu> on 2014/10/24 22:14:45 UTC

Re: Static: network connectors and maxReconnectAttempts

Gary,

I was able to get failover working between brokers in a network of brokers
using the maxReconnectAttempts=0 URI option.  However, when I tried adding
priorityBackup=true, I ran into problems.
http://activemq.2283324.n4.nabble.com/priorityBackup-not-supported-with-masterslave-td4677626.html#a4677945
and https://issues.apache.org/jira/browse/AMQ-4720 indicate that
priorityBackup won't work with broker-to-broker failover connections, for
the same reasons you and I went through last month.
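
For anyone following along, the kind of networkConnector URI I'm describing can be sketched as below. This assumes the failover: URI is nested inside static:, and the host names, port, and the NetworkUriSketch helper are made-up placeholders; only the static:, failover:, maxReconnectAttempts=0 and priorityBackup=true pieces come from this thread.

```java
import java.util.List;

// Illustrative sketch only: builds the URI strings discussed in this thread.
// Host names and ports are hypothetical placeholders.
class NetworkUriSketch {

    // failover:(a,b)?maxReconnectAttempts=0, so that a dropped connection is
    // reported upward instead of being silently retried by failover.
    static String failoverUri(List<String> brokers, boolean priorityBackup) {
        StringBuilder sb = new StringBuilder("failover:(");
        sb.append(String.join(",", brokers));
        sb.append(")?maxReconnectAttempts=0");
        if (priorityBackup) {
            // The option that AMQ-4720 reports as a problem on broker-to-broker links.
            sb.append("&priorityBackup=true");
        }
        return sb.toString();
    }

    // Wrap the failover URI in static: for use as a networkConnector uri.
    static String networkConnectorUri(List<String> brokers, boolean priorityBackup) {
        return "static:(" + failoverUri(brokers, priorityBackup) + ")";
    }

    public static void main(String[] args) {
        List<String> brokers = List.of("tcp://primary:61616", "tcp://backup:61616");
        System.out.println(networkConnectorUri(brokers, false));
        System.out.println(networkConnectorUri(brokers, true));
    }
}
```

If a string like this is pasted into an XML broker configuration, the "&" has to be written as "&amp;".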

For us, this makes failover networkConnectors (including ones using the
masterslave transport, since it's just chrome on top of failover) useless.
Our goal is to minimize the number of hops a message has to take, and the
lack of fail-back behavior on the broker-to-broker connections introduces an
extra hop: messages keep going to the backup and then have to be forwarded
to the primary, where all the clients now are.  For us, a static mesh is
better than a failover network with this sub-optimal routing, so we'll stay
with that until the static transport is able to handle failover transports
with the priorityBackup option enabled.

I looked in JIRA and couldn't find any enhancement request for making the
static transport handle this gracefully (your comments on
https://issues.apache.org/jira/browse/AMQ-4720 do indicate that that's what
would need to happen to fix that bug, but I think it's better as a
stand-alone request), so I submitted
https://issues.apache.org/jira/browse/AMQ-5411 to capture it.

But I think there's also a workaround that could be implemented: if
maxReconnectAttempts=0, when the priority connection is established following a
failover, the failover transport can kill both connections (the old one to
the backup broker and the new one to the priority broker), let the failure
bubble up to the static transport, and let it use the failover transport to
reconnect (to the priority URI, since it's now up).  I've submitted
https://issues.apache.org/jira/browse/AMQ-5412 to capture that workaround
request, in case doing the full rewrite described in AMQ-5411 isn't an
option in the near term.
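
To make the proposed workaround concrete, here is a toy model of the behavior I have in mind. It is not ActiveMQ code: ToyFailover, FailureListener and onPriorityAvailable are hypothetical names invented for this sketch, and the broker URLs are placeholders. It only shows the intended sequence: the priority broker comes back, both connections are dropped, and a failure is reported to the layer above so that it can reconnect.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the workaround proposed above; this is NOT ActiveMQ code.
class FailbackSketch {

    // Stand-in for whatever the layer above (e.g. static:) uses to learn of failures.
    interface FailureListener {
        void onTransportFailure(String reason);
    }

    static class ToyFailover {
        final int maxReconnectAttempts;
        final FailureListener listener;
        final List<String> open = new ArrayList<>(); // currently open connections

        ToyFailover(int maxReconnectAttempts, FailureListener listener) {
            this.maxReconnectAttempts = maxReconnectAttempts;
            this.listener = listener;
        }

        void connectTo(String broker) {
            open.add(broker);
        }

        // Called when the priority broker is reachable again while we are on the backup.
        void onPriorityAvailable(String priorityBroker) {
            open.add(priorityBroker); // new connection to the priority broker
            if (maxReconnectAttempts == 0) {
                // Proposed behavior: drop both connections and report a failure,
                // so the layer above restarts the bridge and reconnects.
                open.clear();
                listener.onTransportFailure("priority broker available: " + priorityBroker);
            }
        }
    }

    public static void main(String[] args) {
        List<String> failures = new ArrayList<>();
        ToyFailover t = new ToyFailover(0, failures::add);
        t.connectTo("tcp://backup:61616");
        t.onPriorityAvailable("tcp://primary:61616");
        System.out.println("open connections: " + t.open);
        System.out.println("failures reported: " + failures);
    }
}
```

The point of routing this through a reported failure, rather than switching connections quietly, is that the network bridge gets stopped and restarted, which is what Gary said it needs.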

Tim

On Mon, Sep 29, 2014 at 9:51 AM, Tim Bain <tb...@alumni.duke.edu> wrote:

> Sounds good; thanks for the explanation.
>
> On Mon, Sep 29, 2014 at 4:17 AM, Gary Tully <ga...@gmail.com> wrote:
>
>> everything is possible! but they evolved independently, hence the overlap
>> in functionality
>>
>> On 26 September 2014 16:02, Tim Bain <tb...@alumni.duke.edu> wrote:
>>
>> > Would it be possible for the failover transport to use the same
>> > DiscoveryListener mechanism that the static transport uses, but that's just
>> > not how it's been implemented?  Or is there something fundamental about why
>> > static is allowed to do its own reconnections (notifying the bridge via the
>> > event handlers on the bridge's DiscoveryListener interface) but failover
>> > has to let connection failures bubble up to the bridge?
>> >
>> > Thanks for taking the time to clarify this, by the way.
>> >
>> > On Fri, Sep 26, 2014 at 4:14 AM, Gary Tully <ga...@gmail.com> wrote:
>> >
>> > > the failover transport maintains a bunch of state -
>> > > connections/sessions/producers/consumers/transactions/messages/acks so that
>> > > it can replay those to maintain and recreate the jms client view.
>> > > However, a network bridge is not a standard jms client - specifically in
>> > > the duplex case, but I think there are potential issues in the non duplex
>> > > case also.
>> > > So a failover reconnect will not guarantee that the network bridge is fully
>> > > functional. The bridge needs to be stopped and restarted to successfully
>> > > clean up and resume.
>> > > In other words, the network bridge needs to be aware of transport failures
>> > > as they occur. The intent of the failover: transport is to hide those.
>> > >
>> > >
>> > >
>> > > On 25 September 2014 19:37, Tim Bain <tb...@alumni.duke.edu> wrote:
>> > >
>> > > > Based on the comments that you and Torsten made in the links from my first
>> > > > message, I had understood that for networkConnectors between brokers, you
>> > > > should not allow the discovery transport to perform reconnects, because it
>> > > > was important for the network bridge to be notified of the disconnection
>> > > > and reconnection.  You said that that happens automatically for static
>> > > > discovery transports (and I see the onServiceAdd() and onServiceRemove()
>> > > > methods in DiscoveryNetworkConnector that would handle those events), but
>> > > > what's different about failover that makes the same DiscoveryListener
>> > > > mechanism not work?
>> > > >
>> > > > On Thu, Sep 25, 2014 at 9:21 AM, Gary Tully <ga...@gmail.com> wrote:
>> > > >
>> > > > > maxReconnectAttempts=0 relates to the use of failover only, where you use
>> > > > > failover to choose between a list of broker urls (typically a pair for
>> > > > > master slave). masterSlave sets maxReconnectAttempts=0 on the underlying
>> > > > > failover url.
>> > > > > The static discovery, which is implemented by the SimpleDiscoveryAgent, can
>> > > > > do retries and backoff etc.
>> > > > > see:
>> > > > > https://github.com/apache/activemq/blob/d54e0d6ab590b6a6148a5e2629c45b95d3f40eb8/activemq-client/src/main/java/org/apache/activemq/transport/discovery/simple/SimpleDiscoveryAgent.java#L42
>> > > > >
>> > > > > The network bridge is a discovery listener, it gets told to add/remove
>> > > > > services (urls) that are discovered/retried.
>> > > > >
>> > > > >
>> > > > > On 24 September 2014 20:20, Tim Bain <tb...@alumni.duke.edu> wrote:
>> > > > >
>> > > > > > Gary, Torsten, and others have said in various places that broker-to-broker
>> > > > > > networkConnectors should set maxReconnectAttempts=0 to allow reconnection
>> > > > > > to be handled by the network bridge.  Sources:
>> > > > > > 1 <http://tmielke.blogspot.com/2011/09/activemq-network-bridge-to-masterslave.html>
>> > > > > > 2 <http://activemq.2283324.n4.nabble.com/Persistent-messages-disappearing-td4681353.html>
>> > > > > > 3 <http://grokbase.com/t/activemq/users/1427v9eqkf/prioritybackup-not-supported-with-masterslave>
>> > > > > > Torsten (link 1) was talking about static: network connectors, while Gary's
>> > > > > > quotes in the other two links were related to failover: (or masterslave:,
>> > > > > > which is just chrome on top of failover:), but if it's a requirement of the
>> > > > > > network bridge that it be the one to re-establish the connection, it
>> > > > > > shouldn't matter what the underlying transport is.
>> > > > > >
>> > > > > > It's obvious in FailoverTransport
>> > > > > > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-client/5.10.0/org/apache/activemq/transport/failover/FailoverTransport.java#FailoverTransport>
>> > > > > > how maxReconnectAttempts=0 gets processed to mean "don't try to reconnect",
>> > > > > > allowing the network bridge to re-establish the connection, and there are
>> > > > > > notes in http://activemq.apache.org/failover-transport-reference.html
>> > > > > > explaining that this interpretation of the value "0" was implemented in
>> > > > > > 5.6.0 (https://issues.apache.org/jira/browse/AMQ-3542).  There's no similar
>> > > > > > code in SimpleDiscoveryAgent
>> > > > > > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-all/5.10.0/org/apache/activemq/transport/discovery/simple/SimpleDiscoveryAgent.java#SimpleDiscoveryAgent>
>> > > > > > (which handles connection attempts for the static: transport
>> > > > > > <http://activemq.apache.org/static-transport-reference.html>, as I
>> > > > > > understand it) to interpret "-1" as "reconnect forever" and "0" as "don't
>> > > > > > reconnect".
>> > > > > >
>> > > > > > Is Gary's and Torsten's advice about maxReconnectAttempts not applicable to
>> > > > > > static: network connectors for some reason that I'm not understanding?  Or
>> > > > > > should the changes Gary made in AMQ-3542 have been applied to all protocols
>> > > > > > that include reconnection attempts?  (Do I need to open a JIRA for this?)
>> > > > > >
>> > > > > > And a related question: when using the static: transport to establish a
>> > > > > > broker mesh, if we set maxReconnectAttempts=0, is there a way to perform
>> > > > > > exponential backoff at the network bridge, so it doesn't continually try to
>> > > > > > reconnect (and spam the logs) when one broker in the mesh is offline for a
>> > > > > > while?  The only way I see to control exponential backoff is within the
>> > > > > > static: transport via the useExponentialBackOff=true option; searching the
>> > > > > > source code (I'm looking at 5.8.0), I don't see any references to
>> > > > > > exponential backoff in any code that seems to be related to network
>> > > > > > bridges...
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Tim
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > http://redhat.com
>> > > > > http://blog.garytully.com

Re: Static: network connectors and maxReconnectAttempts

Posted by Gary Tully <ga...@gmail.com>.

In this case static: will do what you want.

On 30 October 2014 16:52, Tim Bain <tb...@alumni.duke.edu> wrote:

> What you're suggesting sounds like it would behave the same as if I use the
> static: transport with two nested URIs, except that I'd have more
> configuration boilerplate since I'd have an additional networkConnector on
> which I'd have to set all of my config options.  Do I get something from
> using the two networkConnectors that I don't get from using a static:
> transport with two nested URIs?  If not, I'll stay with the simpler config,
> until priorityBackup works with networkConnectors.
>
> On Thu, Oct 30, 2014 at 5:53 AM, Gary Tully <ga...@gmail.com> wrote:
>
> > you may be able to achieve what you want with two network connectors. one
> > to the primary and one to the backup. When both are alive and you use
> > decreaseNetworkConsumerPriority and priority dispatch, messages should take
> > the shortest path (priority decreases with number of hops).

Re: Static: network connectors and maxReconnectAttempts

Posted by Tim Bain <tb...@alumni.duke.edu>.

What you're suggesting sounds like it would behave the same as if I use the
static: transport with two nested URIs, except that I'd have more
configuration boilerplate since I'd have an additional networkConnector on
which I'd have to set all of my config options.  Do I get something from
using the two networkConnectors that I don't get from using a static:
transport with two nested URIs?  If not, I'll stay with the simpler config,
until priorityBackup works with networkConnectors.
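
To show what I mean by extra boilerplate, the sketch below renders the two layouts side by side as text. It assumes the networkConnector element takes uri and decreaseNetworkConsumerPriority attributes as in the usual XML broker config; the broker URLs and the ConnectorLayoutSketch helper are made up for illustration, and nothing here talks to a broker.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: renders the two configuration layouts as text.
// Broker URLs are hypothetical; this does not talk to ActiveMQ.
class ConnectorLayoutSketch {

    // Render one <networkConnector/> element with the given attributes.
    static String render(String uri, Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder("<networkConnector uri=\"" + uri + "\"");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        return sb.append("/>").toString();
    }

    // Layout A: one connector, static: with every broker nested in it.
    static List<String> singleStatic(List<String> brokers, Map<String, String> attrs) {
        return List.of(render("static:(" + String.join(",", brokers) + ")", attrs));
    }

    // Layout B: one connector per broker, each repeating the shared attributes.
    static List<String> onePerBroker(List<String> brokers, Map<String, String> attrs) {
        List<String> out = new ArrayList<>();
        for (String b : brokers) {
            out.add(render("static:(" + b + ")", attrs));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> brokers = List.of("tcp://primary:61616", "tcp://backup:61616");
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("decreaseNetworkConsumerPriority", "true");
        singleStatic(brokers, attrs).forEach(System.out::println);
        onePerBroker(brokers, attrs).forEach(System.out::println);
    }
}
```

With layout B, every shared option has to be repeated on each connector, which is the duplication I'd like to avoid unless it buys me something.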

On Thu, Oct 30, 2014 at 5:53 AM, Gary Tully <ga...@gmail.com> wrote:

> you may be able to achieve what you want with two network connectors. one
> to the primary and one to the backup. When both are alive and you use
> decreaseNetworkConsumerPriority and priority dispatch, messages should take
> the shortest path (priority decreases with number of hops).
>
> On 24 October 2014 21:14, Tim Bain <tb...@alumni.duke.edu> wrote:
>
> > Gary,
> >
> > I was able to get failover working between brokers in a network of
> brokers
> > using the maxReconnectAttempts=0 URI option.  However, when I tried
> adding
> > priorityBackup=true, I ran into problems.
> >
> >
> http://activemq.2283324.n4.nabble.com/priorityBackup-not-supported-with-masterslave-td4677626.html#a4677945
> > and https://issues.apache.org/jira/browse/AMQ-4720 indicate that
> > priorityBackup won't work with broker-to-broker failover connections, for
> > the same reasons you and I went through last month.
> >
> > For us, this makes failover networkConnectors (including ones using the
> > masterslave transport, since it's just chrome on top of failover)
> useless,
> > since our goal is to minimize the number of hops a message has to take
> and
> > the lack of fail-back behavior on the broker-to-broker connections
> > introduces an extra hop when messages continue going to the backup and
> then
> > have to be forwarded to the primary where all the clients now are.  For
> us,
> > a static mesh is better than a failover network that will have this
> > sub-optimal routing, so we'll stay with that until the static transport
> is
> > able to handle failover transports with the priorityBackup option
> enabled.
> >
> > I looked in JIRA and couldn't find any enhancement request for making the
> > static transport handle this gracefully (your comments on
> > https://issues.apache.org/jira/browse/AMQ-4720 do indicate that that's
> > what
> > would need to happen to fix that bug, but I think it's better as a
> > stand-alone request), so I submitted
> > https://issues.apache.org/jira/browse/AMQ-5411 to capture it.
> >
> > But I think there's also a workaround that could be implemented: if
> > maxReconnects=0, when the priority connection is established following a
> > failover, the failover transport can kill both connections (the old one
> to
> > the backup broker and the new one to the priority broker), let the
> failure
> > bubble up to the static transport, and let it use the failover transport
> to
> > reconnect (to the priority URI, since it's now up).  I've submitted
> > https://issues.apache.org/jira/browse/AMQ-5412 to capture that
> workaround
> > request, in case doing the full rewrite described in AMQ-5411 isn't an
> > option in the near term.
> >
> > Tim
> >
> > On Mon, Sep 29, 2014 at 9:51 AM, Tim Bain <tb...@alumni.duke.edu> wrote:
> >
> > > Sounds good; thanks for the explanation.
> > >
> > > On Mon, Sep 29, 2014 at 4:17 AM, Gary Tully <ga...@gmail.com>
> > wrote:
> > >
> > >> everything is possible! but they evolved independently, hence the
> > overlap
> > >> in functionality
> > >>
> > >> On 26 September 2014 16:02, Tim Bain <tb...@alumni.duke.edu> wrote:
> > >>
> > >> > Would it be possible for the failover transport to use the same
> > >> > DiscoveryListener mechanism that the static transport uses, but
> that's
> > >> just
> > >> > not how it's been implemented?  Or is there something fundamental
> > about
> > >> why
> > >> > static is allowed to do its own reconnections (notifying the bridge
> > via
> > >> the
> > >> > event handlers on the bridge's DiscoveryListener interface) but
> > failover
> > >> > has to let connection failures bubble up to the bridge?
> > >> >
> > >> > Thanks for taking the time to clarify this, by the way.
> > >> >
> > >> > On Fri, Sep 26, 2014 at 4:14 AM, Gary Tully <ga...@gmail.com>
> > >> wrote:
> > >> >
> > >> > > the failover transport maintains a bunch of state -
> > >> > > connections/sessons/producers/consumers/transactions/messags/acks
> so
> > >> that
> > >> > > it can replay those to maintain and recreate the jms client view.
> > >> > > However, a netwok bridge is not a standard jms client -
> specifically
> > >> in
> > >> > the
> > >> > > duplex case but I think there potential issues in the non duplex
> > case
> > >> > also.
> > >> > > So a failover reconnect will not guarantee that the network bridge
> > is
> > >> > fully
> > >> > > functional. The bridge needs to be stopped and restarted to
> > >> successfully
> > >> > > cleanup and resume.
> > >> > > In other words, the network bridge needs to be aware of transport
> > >> > failures
> > >> > > as they occur. The intent of the failover: transport is to hide
> > those.
> > >> > >
> > >> > >
> > >> > >
> > >> > > On 25 September 2014 19:37, Tim Bain <tb...@alumni.duke.edu>
> wrote:
> > >> > >
> > >> > > > Based on the comments that you and Torsten made in the links
> from
> > my
> > >> > > first
> > >> > > > message, I had understood that for networkConnectors between
> > >> brokers,
> > >> > you
> > >> > > > should not allow the discovery transport to perform reconnects,
> > >> because
> > >> > > it
> > >> > > > was important for the network bridge to be notified of the
> > >> > disconnection
> > >> > > > and reconnection.  You said that that happens automatically for
> > >> static
> > >> > > > discovery transports (and I see the onServiceAdd() and
> > >> > onServiceRemove()
> > >> > > > methods in NetworkDiscoveryConnector that would handle those
> > >> events),
> > >> > but
> > >> > > > what's different about failover that makes the same
> > >> DiscoveryListener
> > >> > > > mechanism not work?
> > >> > > >
> > >> > > > On Thu, Sep 25, 2014 at 9:21 AM, Gary Tully <
> gary.tully@gmail.com
> > >
> > >> > > wrote:
> > >> > > >
> > >> > > > > maxReconnectAttempts=0 relates to the use of failover only,
> > where
> > >> you
> > >> > > use
> > >> > > > > failover to choose between a list of broker urls (typically a
> > pair
> > >> > for
> > >> > > > > master slave). masterSlave sets maxReconnectAttempts=0 on the
> > >> > > underlying
> > >> > > > > failover url.
> > >> > > > > The static discovery, which is implemented by the
> > >> > SimpleDiscoveryAgent
> > >> > > > can
> > >> > > > > do retries and backoff etc.
> > >> > > > > see:
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://github.com/apache/activemq/blob/d54e0d6ab590b6a6148a5e2629c45b95d3f40eb8/activemq-client/src/main/java/org/apache/activemq/transport/discovery/simple/SimpleDiscoveryAgent.java#L42
> > >> > > > >
> > >> > > > > The network bridge is a discovery listener, it gets told to
> > >> > add/remove
> > >> > > > > services (urls) that are discovered/retried.
> > >> > > > >
> > >> > > > >
> > >> > > > > On 24 September 2014 20:20, Tim Bain <tb...@alumni.duke.edu>
> > >> wrote:
> > >> > > > >
> > >> > > > > > Gary, Torsten, and others have said in various places that
> > >> > > > > broker-to-broker
> > >> > > > > > networkConnectors should set maxReconnectAttempts=0 to allow
> > >> > > > reconnection
> > >> > > > > > to be handled by the network bridge.  (Sources: 1
> > >> > > > > > <
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> http://tmielke.blogspot.com/2011/09/activemq-network-bridge-to-masterslave.html
> > >> > > > > > >,
> > >> > > > > > 2
> > >> > > > > > <
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> http://activemq.2283324.n4.nabble.com/Persistent-messages-disappearing-td4681353.html
> > >> > > > > > >,
> > >> > > > > > 3
> > >> > > > > > <
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> http://grokbase.com/t/activemq/users/1427v9eqkf/prioritybackup-not-supported-with-masterslave
> > >> > > > > > >)
> > >> > > > > > Torsten (link 1) was talking about static: network
> connectors,
> > >> > while
> > >> > > > > Gary's
> > >> > > > > > quotes in the other two links were related to failover: (or
> > >> > > > masterslave:,
> > >> > > > > > which is just chrome on top of failover:), but if it's a
> > >> > requirement
> > >> > > of
> > >> > > > > the
> > >> > > > > > network bridge that it be the one to re-establish the
> > connection,
> > >> it
> > >> > > > > > shouldn't matter what the underlying transport is.
> > >> > > > > >
> > >> > > > > > It's obvious in FailoverTransport
> > >> > > > > > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-client/5.10.0/org/apache/activemq/transport/failover/FailoverTransport.java#FailoverTransport>
> > >> > > > > > how maxReconnectAttempts=0 gets processed to mean "don't try
> > to
> > >> > > > > reconnect",
> > >> > > > > > allowing the network bridge to re-establish the connection,
> > and
> > >> > there
> > >> > > > are
> > >> > > > > > notes in
> > >> > > http://activemq.apache.org/failover-transport-reference.html
> > >> > > > > > explaining that this interpretation of the value "0" was
> > >> > implemented
> > >> > > in
> > >> > > > > > 5.6.0 (https://issues.apache.org/jira/browse/AMQ-3542).
> > >> There's
> > >> > no
> > >> > > > > > similar
> > >> > > > > > code in SimpleDiscoveryAgent
> > >> > > > > > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-all/5.10.0/org/apache/activemq/transport/discovery/simple/SimpleDiscoveryAgent.java#SimpleDiscoveryAgent>
> > >> > > > > > (which handles connection attempts for the static: transport
> > >> > > > > > <http://activemq.apache.org/static-transport-reference.html>,
> > >> as I
> > >> > > > > > understand it) to interpret "-1" as "reconnect forever" and
> > "0"
> > >> as
> > >> > > > "don't
> > >> > > > > > reconnect".
> > >> > > > > >
> > >> > > > > > Is Gary's and Torsten's advice about maxReconnectAttempts
> not
> > >> > > > applicable
> > >> > > > > to
> > >> > > > > > static: network connectors for some reason that I'm not
> > >> > > understanding?
> > >> > > > > Or
> > >> > > > > > should the changes Gary made in AMQ-3542 have been applied
> to
> > >> all
> > >> > > > > protocols
> > >> > > > > > that include reconnection attempts?  (Do I need to open a
> JIRA
> > >> for
> > >> > > > this?)
> > >> > > > > >
> > >> > > > > > And a related question: when using the static: transport to
> > >> > > establish a
> > >> > > > > > broker mesh, if we set maxReconnectAttempts=0, is there a
> way
> > to
> > >> > > > perform
> > >> > > > > > exponential backoff at the network bridge, so it doesn't
> > >> > continually
> > >> > > > try
> > >> > > > > to
> > >> > > > > > reconnect (and spam the logs) when one broker in the mesh is
> > >> > offline
> > >> > > > for
> > >> > > > > a
> > >> > > > > > while?  The only way I see to control exponential backoff is
> > >> within
> > >> > > the
> > >> > > > > > static: transport via the useExponentialBackOff=true option;
> > >> > > searching
> > >> > > > > the
> > >> > > > > > source code (I'm looking at 5.8.0), I don't see any
> references
> > >> to
> > >> > > > > > exponential backoff in any code that seems to be related to
> > >> network
> > >> > > > > > bridges...
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > Tim
> > >> > > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > --
> > >> > > > > http://redhat.com
> > >> > > > > http://blog.garytully.com
> > >> > > > >
> > >> > > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > http://redhat.com
> > >> > > http://blog.garytully.com
> > >> > >
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >> http://redhat.com
> > >> http://blog.garytully.com
> > >>
> > >
> > >
> >
>

Re: Static: network connectors and maxReconnectAttempts

Posted by Gary Tully <ga...@gmail.com>.

You may be able to achieve what you want with two network connectors, one
to the primary and one to the backup. When both are alive and you use
decreaseNetworkConsumerPriority and priority dispatch, messages should take
the shortest path (priority decreases with the number of hops).
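
As a rough illustration of the two-connector layout (a sketch under
assumptions, not a tested configuration): the broker name, connector names
and host names below are placeholders, and only the
decreaseNetworkConsumerPriority attribute is shown, because the dispatch
policy to pair it with is not spelled out in this message.

```xml
<!-- Hypothetical broker with one static network connector per remote broker.
     primary.example and backup.example are placeholder host names. -->
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="edge">
  <networkConnectors>
    <!-- Bridge to the primary broker -->
    <networkConnector name="to-primary"
                      uri="static:(tcp://primary.example:61616)"
                      decreaseNetworkConsumerPriority="true"/>
    <!-- Separate bridge to the backup broker, so neither depends on failover: -->
    <networkConnector name="to-backup"
                      uri="static:(tcp://backup.example:61616)"
                      decreaseNetworkConsumerPriority="true"/>
  </networkConnectors>
</broker>
```

With both bridges up, each remote consumer is seen at a priority that drops
with its distance in hops, so the directly attached route should be
preferred over the one that goes through the other broker.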

On 24 October 2014 21:14, Tim Bain <tb...@alumni.duke.edu> wrote:

> Gary,
>
> I was able to get failover working between brokers in a network of brokers
> using the maxReconnectAttempts=0 URI option.  However, when I tried adding
> priorityBackup=true, I ran into problems.
>
> http://activemq.2283324.n4.nabble.com/priorityBackup-not-supported-with-masterslave-td4677626.html#a4677945
> and https://issues.apache.org/jira/browse/AMQ-4720 indicate that
> priorityBackup won't work with broker-to-broker failover connections, for
> the same reasons you and I went through last month.
>
> For us, this makes failover networkConnectors (including ones using the
> masterslave transport, since it's just chrome on top of failover) useless,
> since our goal is to minimize the number of hops a message has to take and
> the lack of fail-back behavior on the broker-to-broker connections
> introduces an extra hop when messages continue going to the backup and then
> have to be forwarded to the primary where all the clients now are.  For us,
> a static mesh is better than a failover network that will have this
> sub-optimal routing, so we'll stay with that until the static transport is
> able to handle failover transports with the priorityBackup option enabled.
>
> I looked in JIRA and couldn't find any enhancement request for making the
> static transport handle this gracefully (your comments on
> https://issues.apache.org/jira/browse/AMQ-4720 do indicate that that's
> what
> would need to happen to fix that bug, but I think it's better as a
> stand-alone request), so I submitted
> https://issues.apache.org/jira/browse/AMQ-5411 to capture it.
>
> But I think there's also a workaround that could be implemented: if
> maxReconnects=0, when the priority connection is established following a
> failover, the failover transport can kill both connections (the old one to
> the backup broker and the new one to the priority broker), let the failure
> bubble up to the static transport, and let it use the failover transport to
> reconnect (to the priority URI, since it's now up).  I've submitted
> https://issues.apache.org/jira/browse/AMQ-5412 to capture that workaround
> request, in case doing the full rewrite described in AMQ-5411 isn't an
> option in the near term.
>
> Tim
>
> On Mon, Sep 29, 2014 at 9:51 AM, Tim Bain <tb...@alumni.duke.edu> wrote:
>
> > Sounds good; thanks for the explanation.
> >
> > On Mon, Sep 29, 2014 at 4:17 AM, Gary Tully <ga...@gmail.com>
> wrote:
> >
> >> everything is possible! but they evolved independently, hence the
> overlap
> >> in functionality
> >>
> >> On 26 September 2014 16:02, Tim Bain <tb...@alumni.duke.edu> wrote:
> >>
> >> > Would it be possible for the failover transport to use the same
> >> > DiscoveryListener mechanism that the static transport uses, but that's
> >> just
> >> > not how it's been implemented?  Or is there something fundamental
> about
> >> why
> >> > static is allowed to do its own reconnections (notifying the bridge
> via
> >> the
> >> > event handlers on the bridge's DiscoveryListener interface) but
> failover
> >> > has to let connection failures bubble up to the bridge?
> >> >
> >> > Thanks for taking the time to clarify this, by the way.
> >> >
> >> > On Fri, Sep 26, 2014 at 4:14 AM, Gary Tully <ga...@gmail.com>
> >> wrote:
> >> >
> >> > > the failover transport maintains a bunch of state -
> >> > > connections/sessions/producers/consumers/transactions/messages/acks so
> >> that
> >> > > it can replay those to maintain and recreate the JMS client view.
> >> > > However, a network bridge is not a standard JMS client - specifically
> >> in
> >> > the
> >> > > duplex case but I think there are potential issues in the non-duplex
> case
> >> > also.
> >> > > So a failover reconnect will not guarantee that the network bridge
> is
> >> > fully
> >> > > functional. The bridge needs to be stopped and restarted to
> >> successfully
> >> > > cleanup and resume.
> >> > > In other words, the network bridge needs to be aware of transport
> >> > failures
> >> > > as they occur. The intent of the failover: transport is to hide
> those.
> >> > >
> >> > >
> >> > >
> >> > > On 25 September 2014 19:37, Tim Bain <tb...@alumni.duke.edu> wrote:
> >> > >
> >> > > > Based on the comments that you and Torsten made in the links from
> my
> >> > > first
> >> > > > message, I had understood that for networkConnectors between
> >> brokers,
> >> > you
> >> > > > should not allow the discovery transport to perform reconnects,
> >> because
> >> > > it
> >> > > > was important for the network bridge to be notified of the
> >> > disconnection
> >> > > > and reconnection.  You said that that happens automatically for
> >> static
> >> > > > discovery transports (and I see the onServiceAdd() and
> >> > onServiceRemove()
> >> > > > methods in NetworkDiscoveryConnector that would handle those
> >> events),
> >> > but
> >> > > > what's different about failover that makes the same
> >> DiscoveryListener
> >> > > > mechanism not work?
> >> > > >
> >> > > > On Thu, Sep 25, 2014 at 9:21 AM, Gary Tully <gary.tully@gmail.com
> >
> >> > > wrote:
> >> > > >
> >> > > > > maxReconnectAttempts=0 relates to the use of failover only,
> where
> >> you
> >> > > use
> >> > > > > failover to choose between a list of broker urls (typically a
> pair
> >> > for
> >> > > > > master slave). masterSlave sets maxReconnectAttempts=0 on the
> >> > > underlying
> >> > > > > failover url.
> >> > > > > The static discovery, which is implemented by the
> >> > SimpleDiscoveryAgent
> >> > > > can
> >> > > > > do retries and backoff etc.
> >> > > > > see:
> >> > > > > https://github.com/apache/activemq/blob/d54e0d6ab590b6a6148a5e2629c45b95d3f40eb8/activemq-client/src/main/java/org/apache/activemq/transport/discovery/simple/SimpleDiscoveryAgent.java#L42
> >> > > > >
> >> > > > > The network bridge is a discovery listener, it gets told to
> >> > add/remove
> >> > > > > services (urls) that are discovered/retried.
> >> > > > >
> >> > > > >
> >> > > > > On 24 September 2014 20:20, Tim Bain <tb...@alumni.duke.edu>
> >> wrote:
> >> > > > >
> >> > > > > > Gary, Torsten, and others have said in various places that
> >> > > > > broker-to-broker
> >> > > > > > networkConnectors should set maxReconnectAttempts=0 to allow
> >> > > > reconnection
> >> > > > > > to be handled by the network bridge.  (Sources: 1
> >> > > > > > <http://tmielke.blogspot.com/2011/09/activemq-network-bridge-to-masterslave.html>,
> >> > > > > > 2
> >> > > > > > <http://activemq.2283324.n4.nabble.com/Persistent-messages-disappearing-td4681353.html>,
> >> > > > > > 3
> >> > > > > > <http://grokbase.com/t/activemq/users/1427v9eqkf/prioritybackup-not-supported-with-masterslave>)
> >> > > > > > Torsten (link 1) was talking about static: network connectors,
> >> > while
> >> > > > > Gary's
> >> > > > > > quotes in the other two links were related to failover: (or
> >> > > > masterslave:,
> >> > > > > > which is just chrome on top of failover:), but if it's a
> >> > requirement
> >> > > of
> >> > > > > the
> >> > > > > > network bridge that it be the one to re-establish the
> connection,
> >> it
> >> > > > > > shouldn't matter what the underlying transport is.
> >> > > > > >
> >> > > > > > It's obvious in FailoverTransport
> >> > > > > > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-client/5.10.0/org/apache/activemq/transport/failover/FailoverTransport.java#FailoverTransport>
> >> > > > > > how maxReconnectAttempts=0 gets processed to mean "don't try
> to
> >> > > > > reconnect",
> >> > > > > > allowing the network bridge to re-establish the connection,
> and
> >> > there
> >> > > > are
> >> > > > > > notes in
> >> > > http://activemq.apache.org/failover-transport-reference.html
> >> > > > > > explaining that this interpretation of the value "0" was
> >> > implemented
> >> > > in
> >> > > > > > 5.6.0 (https://issues.apache.org/jira/browse/AMQ-3542).
> >> There's
> >> > no
> >> > > > > > similar
> >> > > > > > code in SimpleDiscoveryAgent
> >> > > > > > <http://grepcode.com/file/repo1.maven.org/maven2/org.apache.activemq/activemq-all/5.10.0/org/apache/activemq/transport/discovery/simple/SimpleDiscoveryAgent.java#SimpleDiscoveryAgent>
> >> > > > > > (which handles connection attempts for the static: transport
> >> > > > > > <http://activemq.apache.org/static-transport-reference.html>,
> >> as I
> >> > > > > > understand it) to interpret "-1" as "reconnect forever" and
> "0"
> >> as
> >> > > > "don't
> >> > > > > > reconnect".
> >> > > > > >
> >> > > > > > Is Gary's and Torsten's advice about maxReconnectAttempts not
> >> > > > applicable
> >> > > > > to
> >> > > > > > static: network connectors for some reason that I'm not
> >> > > understanding?
> >> > > > > Or
> >> > > > > > should the changes Gary made in AMQ-3542 have been applied to
> >> all
> >> > > > > protocols
> >> > > > > > that include reconnection attempts?  (Do I need to open a JIRA
> >> for
> >> > > > this?)
> >> > > > > >
> >> > > > > > And a related question: when using the static: transport to
> >> > > establish a
> >> > > > > > broker mesh, if we set maxReconnectAttempts=0, is there a way
> to
> >> > > > perform
> >> > > > > > exponential backoff at the network bridge, so it doesn't
> >> > continually
> >> > > > try
> >> > > > > to
> >> > > > > > reconnect (and spam the logs) when one broker in the mesh is
> >> > offline
> >> > > > for
> >> > > > > a
> >> > > > > > while?  The only way I see to control exponential backoff is
> >> within
> >> > > the
> >> > > > > > static: transport via the useExponentialBackOff=true option;
> >> > > searching
> >> > > > > the
> >> > > > > > source code (I'm looking at 5.8.0), I don't see any references
> >> to
> >> > > > > > exponential backoff in any code that seems to be related to
> >> network
> >> > > > > > bridges...
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > Tim
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > http://redhat.com
> >> > > > > http://blog.garytully.com
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > http://redhat.com
> >> > > http://blog.garytully.com
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> http://redhat.com
> >> http://blog.garytully.com
> >>
> >
> >
>