Posted to dev@qpid.apache.org by Rajith Attapattu <ra...@gmail.com> on 2009/01/20 05:18:39 UTC

Some questions on Java client failover

During testing I noticed the following. I would appreciate if folks who are
interested in this area comment about these observations.

a) If a single broker is defined in the URL, by default FailoverSingleServer
is chosen.
In this case, I believe we should default to no failover unless explicitly
specified.

b) The FailoverSingleServer method retries immediately upon a connection
error.
   A connection error may be due to the broker crashing or to a temporary
network error.
   In either case, IMO there is little value in retrying immediately. It
would be best to retry after a delay.
   Rob pointed out that in the case of a fast DNS switch this may come in
handy, but in other cases I believe a configurable timeout would be better.

c) The actual number of retries == the number of retries configured + 1.
Is there a reason for this logic?

Thoughts/comments are greatly appreciated.

Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

Re: Some questions on Java client failover

Posted by Rajith Attapattu <ra...@gmail.com>.
Hey Martin,

Thanks for the reply, appreciate it.
Comments inline.

On Tue, Jan 20, 2009 at 3:42 AM, Martin Ritchie <ri...@apache.org> wrote:

> 2009/1/20 Rajith Attapattu <ra...@gmail.com>:
> > During testing I noticed the following. I would appreciate if folks who
> > are interested in this area comment about these observations.
> >
> > a) If a single broker is defined in the URL, by default
> > FailoverSingleServer is chosen.
> > In this case, I believe we should default to no failover unless
> > explicitly specified.
>
> This is due to the historic merging of retry and failover. So
> FailoverSingleServer is not really failover, as it doesn't 'fail over'
> to anywhere; it is simply a retry mechanism, which I think most users
> would prefer. I know that most of the users I deal with would prefer
> the default of automatically reconnecting.


Agreed. However, as noted below, we need to make it more robust so that
retrying is done only when it makes sense.


>
> There is still work to be done here, as retrying or failing over after
> a certain set of exceptions makes no sense, Authentication Exception
> being the most obvious. If the password was wrong once, retrying
> isn't going to make it work. :)


Agreed. Yes, this area needs a bit more work.


>
>
> > b) The FailoverSingleServer method retries immediately upon a connection
> > error.
> >    A connection error may be due to the broker crashing or to a temporary
> > network error.
> >    In either case, IMO there is little value in retrying immediately. It
> > would be best to retry after a delay.
> >    Rob pointed out that in the case of a fast DNS switch this may come in
> > handy, but in other cases I believe a configurable timeout would be
> > better.
>
> There is a 'connectdelay' property you can specify on a broker URL to
> prevent immediate retry. It is documented on the Connection URL page:
> http://cwiki.apache.org/confluence/display/qpid/Connection+URL+Format
> There is also a 'connecttimeout' value that is configurable if you
> believe the network is going to be a bit ropey.


> > c) The actual number of retries == the number of retries configured + 1.
> > Is there a reason for this logic?
>
> This is just a bug, but it may be due to the historic merging of retry
> and failover or a limitation of the failover design.
>

Yes it does seem like a bug.


>
> So what is actually happening is that when the connection fails we
> always retry establishing a connection to the dropped broker, assuming a
> transient networking issue has caused the problem. If that fails, then
> we start failing over and the specified retry values for failover are
> used. Hence the number of 'retries' you are seeing.
>
> Hope that is helpful.
>
> Martin
>
> > Thoughts/comments are greatly appreciated.
> >
> > Regards,
> >
> > Rajith Attapattu
> > Red Hat
> > http://rajith.2rlabs.com/
> >
>
>
>
> --
> Martin Ritchie
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:dev-subscribe@qpid.apache.org
>
>


-- 
Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

Re: Some questions on Java client failover

Posted by Martin Ritchie <ri...@apache.org>.
2009/1/20 Rajith Attapattu <ra...@gmail.com>:
> During testing I noticed the following. I would appreciate if folks who are
> interested in this area comment about these observations.
>
> a) If a single broker is defined in the URL, by default FailoverSingleServer
> is chosen.
> In this case, I believe we should default to no failover unless explicitly
> specified.

This is due to the historic merging of retry and failover. So
FailoverSingleServer is not really failover, as it doesn't 'fail over'
to anywhere; it is simply a retry mechanism, which I think most users
would prefer. I know that most of the users I deal with would prefer
the default of automatically reconnecting.

There is still work to be done here, as retrying or failing over after
a certain set of exceptions makes no sense, Authentication Exception
being the most obvious. If the password was wrong once, retrying
isn't going to make it work. :)
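
One way to picture the point about exceptions is a small classifier that decides whether a reconnect attempt is worth making. This is only a sketch under stated assumptions: `RetryPolicySketch` and `isWorthRetrying` are hypothetical names, not part of the Qpid client, and the standard `javax.security.sasl.AuthenticationException` stands in for whatever exception the client actually raises on bad credentials.

```java
import java.io.IOException;
import javax.security.sasl.AuthenticationException;

/** Hypothetical helper for illustration; not part of the Qpid client. */
final class RetryPolicySketch {
    private RetryPolicySketch() {}

    /** Returns true when a reconnect attempt could plausibly succeed. */
    static boolean isWorthRetrying(Throwable error) {
        boolean sawIoFailure = false;
        // Walk the cause chain, since the root failure is often wrapped.
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof AuthenticationException) {
                return false; // the same credentials will fail again
            }
            if (t instanceof IOException) {
                sawIoFailure = true; // network-level failure, may be transient
            }
        }
        return sawIoFailure;
    }
}
```

With this shape, a refused connection would be retried, while a wrapped authentication failure or an unrelated runtime error would not.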

> b) The FailoverSingleServer method retries immediately upon a connection
> error.
>    A connection error may be due to the broker crashing or to a temporary
> network error.
>    In either case, IMO there is little value in retrying immediately. It
> would be best to retry after a delay.
>    Rob pointed out that in the case of a fast DNS switch this may come in
> handy, but in other cases I believe a configurable timeout would be better.

There is a 'connectdelay' property you can specify on a broker URL to
prevent immediate retry. It is documented on the Connection URL page:
http://cwiki.apache.org/confluence/display/qpid/Connection+URL+Format
There is also a 'connecttimeout' value that is configurable if you
believe the network is going to be a bit ropey.
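
As a rough illustration of where these two broker options sit, the sketch below only builds a connection URL string; it does not open a connection. Only 'connectdelay' and 'connecttimeout' come from this thread. The overall URL shape, the 'brokerlist' and 'retries' option names, the guest credentials, the client id, the virtual host and the millisecond units are assumptions that should be checked against the Connection URL Format page. `ConnectionUrlSketch` is a hypothetical name.

```java
/** Illustrative only: builds a URL string and never contacts a broker. */
final class ConnectionUrlSketch {
    private ConnectionUrlSketch() {}

    static String singleBrokerUrl(String host, int port, int retries,
                                  long connectDelayMillis,
                                  long connectTimeoutMillis) {
        // Per-broker options follow the broker address (assumed syntax).
        String broker = "tcp://" + host + ":" + port
                + "?retries='" + retries + "'"
                + "&connectdelay='" + connectDelayMillis + "'"
                + "&connecttimeout='" + connectTimeoutMillis + "'";
        // Credentials, client id and virtual host are placeholder values.
        return "amqp://guest:guest@clientid/test?brokerlist='" + broker + "'";
    }
}
```

Calling `ConnectionUrlSketch.singleBrokerUrl("localhost", 5672, 3, 5000, 10000)` would produce a single-broker URL that asks for a pause between reconnect attempts instead of an immediate retry.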

> c) The actual number of retries == the number of retries configured + 1.
> Is there a reason for this logic?

This is just a bug, but it may be due to the historic merging of retry
and failover or a limitation of the failover design.

So what is actually happening is that when the connection fails we
always retry establishing a connection to the dropped broker, assuming a
transient networking issue has caused the problem. If that fails, then
we start failing over and the specified retry values for failover are
used. Hence the number of 'retries' you are seeing.
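
The arithmetic behind the extra attempt can be sketched as follows. This is a simplified model of the behaviour described above, not the client's actual code: `AttemptCountSketch` and `attemptsMade` are hypothetical names, and the `tryConnect` callback stands in for a real connection attempt.

```java
import java.util.function.BooleanSupplier;

/** Simplified model of the described behaviour; not the client's code. */
final class AttemptCountSketch {
    private AttemptCountSketch() {}

    /** Returns how many connection attempts are made before success or giving up. */
    static int attemptsMade(int configuredRetries, BooleanSupplier tryConnect) {
        // Unconditional reconnect to the dropped broker, made before the
        // configured retry count is consulted at all.
        int attempts = 1;
        if (tryConnect.getAsBoolean()) {
            return attempts;
        }
        // Only now do the configured retries start counting.
        for (int i = 0; i < configuredRetries; i++) {
            attempts++;
            if (tryConnect.getAsBoolean()) {
                break;
            }
        }
        return attempts;
    }
}
```

With a broker that never comes back, `attemptsMade(3, () -> false)` makes the initial reconnect plus three configured retries, which is the "configured + 1" count observed in testing.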

Hope that is helpful.

Martin

> Thoughts/comments are greatly appreciated.
>
> Regards,
>
> Rajith Attapattu
> Red Hat
> http://rajith.2rlabs.com/
>



-- 
Martin Ritchie



Re: Some questions on Java client failover

Posted by Carl Trieloff <cc...@redhat.com>.
>
> Here I was referring to the case of a non-clustered broker, that is, when
> there is only a single broker defined in the connection URL.
> Even in the case of a clustered broker, currently the Java client only
> iterates through the list provided in its connection URL.
> It does not subscribe to the failover exchange to get an updated list of
> brokers. This feature will be added in the near future.

OK, makes sense. We should also update the logic to subscribe to the
failover exchange for clustered brokers.

Carl.

Re: Some questions on Java client failover

Posted by Rajith Attapattu <ra...@gmail.com>.
On Tue, Jan 20, 2009 at 9:21 PM, Carl Trieloff <cc...@redhat.com> wrote:

> Rajith Attapattu wrote:
>
>> During testing I noticed the following. I would appreciate if folks who
>> are interested in this area comment about these observations.
>>
>> a) If a single broker is defined in the URL, by default
>> FailoverSingleServer is chosen.
>> In this case, I believe we should default to no failover unless explicitly
>> specified.
>>
>
> No, that is not correct. The URLs provided in this list are the
> brokers to try to connect to the cluster with. Once
> the client has connected to a broker cluster, the cluster notifies the
> client with the updated list of URLs in the cluster
> as nodes get added or removed.
>

Here I was referring to the case of a non-clustered broker, that is, when
there is only a single broker defined in the connection URL.
Even in the case of a clustered broker, currently the Java client only
iterates through the list provided in its connection URL.
It does not subscribe to the failover exchange to get an updated list of
brokers. This feature will be added in the near future.


> So on initial connection the client should iterate the list. Once
> connected, it should get an updated (full) list from the
> broker, which replaces this list every time the broker sends the client a
> new list of URLs.
>
> Carl.
>
>
>
>


-- 
Regards,

Rajith Attapattu
Red Hat
http://rajith.2rlabs.com/

Re: Some questions on Java client failover

Posted by Carl Trieloff <cc...@redhat.com>.
Rajith Attapattu wrote:
> During testing I noticed the following. I would appreciate if folks who are
> interested in this area comment about these observations.
>
> a) If a single broker is defined in the URL, by default FailoverSingleServer
> is chosen.
> In this case, I believe we should default to no failover unless explicitly
> specified.

No, that is not correct. The URLs provided in this list are
the brokers to try to connect to the cluster with. Once
the client has connected to a broker cluster, the cluster notifies the
client with the updated list of URLs in the cluster
as nodes get added or removed.

So on initial connection the client should iterate the list. Once
connected, it should get an updated (full) list from the
broker, which replaces this list every time the broker sends the client a
new list of URLs.
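
The list-replacement behaviour described here could look roughly like the sketch below. It is an assumption-laden illustration: `BrokerListSketch`, `onClusterUpdate` and `candidates` are hypothetical names, broker addresses are kept as plain strings, and how the update actually arrives from the cluster is left out entirely.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for the brokers a client may try to connect to. */
final class BrokerListSketch {
    private final AtomicReference<List<String>> brokers;

    /** Starts with the brokers named in the connection URL. */
    BrokerListSketch(List<String> initialFromUrl) {
        brokers = new AtomicReference<>(List.copyOf(initialFromUrl));
    }

    /** Replaces the whole list each time the cluster sends a new one. */
    void onClusterUpdate(List<String> updated) {
        brokers.set(List.copyOf(updated));
    }

    /** Brokers to iterate over on the next connect or failover. */
    List<String> candidates() {
        return brokers.get();
    }
}
```

Swapping an immutable list through an `AtomicReference` means a failover in progress keeps iterating a consistent snapshot while a newer list is installed for the next attempt.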

Carl.
