Posted to user@zookeeper.apache.org by "Ray Chaudhuri, Shirsha (Nokia - IN/Bangalore)" <sh...@nokia.com> on 2017/03/20 22:22:51 UTC

How to modify Client Connection timer

Hi,

We are using ZKFC as a ZooKeeper client that tries to connect to the ZooKeeper server at bring-up, and we sometimes encounter the following issue:

1. The client retrieves the ZK ensemble address (the ensemble consists of 3 nodes).

2. The client tries to connect to one of the ZK server nodes.

3. However, because of other ongoing processing required at bring-up, the ZK server node fails to respond to the client in time. The response is sent after 3 seconds.

4. By then the client side has timed out, typically at around 1.6 seconds.

Is there a way to reconfigure this client-side timer to a higher value?

Regards
Shirsha

Re: How to modify Client Connection timer

Posted by Patrick Hunt <ph...@apache.org>.
On Wed, Mar 22, 2017 at 2:19 PM, Ray Chaudhuri, Shirsha (Nokia -
IN/Bangalore) <sh...@nokia.com> wrote:

>
> Thanks Patrick.
>
> Yes, we did eventually figure out that the Hadoop ZKFC code initialises
> this timeout from "ha.zookeeper.session-timeout.ms", falling back to a
> default of 5000 ms if the property is not set. Since the ClientCnxn class
> sets this timer to (session timeout) / (number of ZK servers), the timer
> for the initial connection with our 3 servers is one third of 5000 ms.
> Hence the timeout at around 1.6 s each time.
>
> Which brought us to the question: wouldn't this piece of code from the
> file zookeeper/ClientCnxn.java (note the connectTimeout assignment)
>     public ClientCnxn(String chrootPath, HostProvider hostProvider,
>             int sessionTimeout, ZooKeeper zooKeeper,
>             ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
>             long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) {
>         this.zooKeeper = zooKeeper;
>         this.watcher = watcher;
>         this.sessionId = sessionId;
>         this.sessionPasswd = sessionPasswd;
>         this.sessionTimeout = sessionTimeout;
>         this.hostProvider = hostProvider;
>         this.chrootPath = chrootPath;
>
>         connectTimeout = sessionTimeout / hostProvider.size();
>         readTimeout = sessionTimeout * 2 / 3;
>         readOnly = canBeReadOnly;
>
>         sendThread = new SendThread(clientCnxnSocket);
>         eventThread = new EventThread();
>         this.clientConfig = zooKeeper.getClientConfig();
>     }
>
> lead to trouble if the number of ZK servers is larger?
>
> This timeout is used while the client is waiting to connect to the
> server, before it has even negotiated a session timeout.
>
> So the larger the number of ZK servers, the smaller the timer value.
> Shouldn’t there be a lower bound, so that at least some minimum value is
> used for this timer?
>
>
>
Probably a good idea. I've seen some terrible thundering herd problems that
result from this very basic logic. Some form of exponential backoff
(configurable?) would probably be even better. That said, I've never seen
the size of the ensemble be that much of an issue given the other issue.
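
As a rough illustration of the backoff idea, here is a minimal Java sketch of capped exponential backoff with random jitter between connection attempts. It is not ZooKeeper code: the class ConnectBackoff, its method names, and the base and cap values are all hypothetical, chosen only to show the shape of the technique.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not part of the ZooKeeper client.
final class ConnectBackoff {
    private ConnectBackoff() {}

    // Upper bound on the wait before retry number `attempt` (0-based):
    // baseMs doubled once per attempt, capped at maxMs.
    static long ceilingMs(long baseMs, long maxMs, int attempt) {
        if (baseMs <= 0 || maxMs < baseMs || maxMs == Long.MAX_VALUE || attempt < 0) {
            throw new IllegalArgumentException(
                    "need 0 < baseMs <= maxMs < Long.MAX_VALUE and attempt >= 0");
        }
        long ceiling = baseMs;
        // Stop doubling once the cap is reached; this also avoids overflow.
        for (int i = 0; i < attempt && ceiling < maxMs; i++) {
            ceiling = (ceiling > maxMs / 2) ? maxMs : ceiling * 2;
        }
        return ceiling;
    }

    // Actual wait: a random value in [0, ceiling] ("full jitter"), so that
    // many clients retrying at once do not all reconnect at the same moment.
    static long delayMs(long baseMs, long maxMs, int attempt) {
        long ceiling = ceilingMs(baseMs, maxMs, attempt);
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }

    public static void main(String[] args) {
        long baseMs = 100;   // assumed starting delay
        long maxMs = 5000;   // assumed cap
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.printf("attempt=%d ceiling=%d ms delay=%d ms%n",
                    attempt, ceilingMs(baseMs, maxMs, attempt),
                    delayMs(baseMs, maxMs, attempt));
        }
    }
}
```

The jitter is what addresses the thundering herd: without it, clients that failed at the same moment would all retry on the same schedule. Making the base and cap configurable would be a natural extension.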

You might also check the latest trunk; I can't remember whether someone
has been looking at this recently.

Regards,

Patrick



RE: How to modify Client Connection timer

Posted by "Ray Chaudhuri, Shirsha (Nokia - IN/Bangalore)" <sh...@nokia.com>.

Thanks Patrick.

Yes, we did eventually figure out that the Hadoop ZKFC code initialises this timeout from "ha.zookeeper.session-timeout.ms", falling back to a default of 5000 ms if the property is not set. Since the ClientCnxn class sets this timer to (session timeout) / (number of ZK servers), the timer for the initial connection with our 3 servers is one third of 5000 ms. Hence the timeout at around 1.6 s each time.
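
The arithmetic can be checked with a small standalone Java sketch. It only mirrors the two assignments from the ClientCnxn constructor quoted in this thread; the class and method names (TimeoutArithmetic, connectTimeout, readTimeout) are made up for the illustration, and the server counts other than 3 are just sample inputs.

```java
// Standalone illustration; not ZooKeeper code.
final class TimeoutArithmetic {
    private TimeoutArithmetic() {}

    // Mirrors: connectTimeout = sessionTimeout / hostProvider.size();
    static int connectTimeout(int sessionTimeoutMs, int serverCount) {
        return sessionTimeoutMs / serverCount; // integer division, as in the original
    }

    // Mirrors: readTimeout = sessionTimeout * 2 / 3;
    static int readTimeout(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 5000; // the ZKFC default mentioned above
        for (int servers : new int[] {1, 3, 5, 7}) {
            System.out.printf("servers=%d connectTimeout=%d ms readTimeout=%d ms%n",
                    servers,
                    connectTimeout(sessionTimeoutMs, servers),
                    readTimeout(sessionTimeoutMs));
        }
    }
}
```

With 3 servers and a 5000 ms session timeout this gives a connect timeout of 1666 ms, which matches the roughly 1.6 s we observe.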

Which brought us to the question: wouldn't this piece of code from the file zookeeper/ClientCnxn.java (note the connectTimeout assignment)
    public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
            long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) {
        this.zooKeeper = zooKeeper;
        this.watcher = watcher;
        this.sessionId = sessionId;
        this.sessionPasswd = sessionPasswd;
        this.sessionTimeout = sessionTimeout;
        this.hostProvider = hostProvider;
        this.chrootPath = chrootPath;

        connectTimeout = sessionTimeout / hostProvider.size();
        readTimeout = sessionTimeout * 2 / 3;
        readOnly = canBeReadOnly;

        sendThread = new SendThread(clientCnxnSocket);
        eventThread = new EventThread();
        this.clientConfig = zooKeeper.getClientConfig();
    }

lead to trouble if the number of ZK servers is larger?

This timeout is used while the client is waiting to connect to the server, before it has even negotiated a session timeout.

So the larger the number of ZK servers, the smaller the timer value. Shouldn’t there be a lower bound, so that at least some minimum value is used for this timer?
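
To make the suggestion concrete, here is a minimal Java sketch of what such a lower bound could look like. This is only an assumption about one possible fix, not existing ZooKeeper behaviour: the class GuardedConnectTimeout, the method connectTimeoutMs, and the minConnectTimeoutMs parameter are all hypothetical.

```java
// Hypothetical sketch of a floored connect timeout; not ZooKeeper code.
final class GuardedConnectTimeout {
    private GuardedConnectTimeout() {}

    static int connectTimeoutMs(int sessionTimeoutMs, int serverCount, int minConnectTimeoutMs) {
        if (sessionTimeoutMs <= 0 || serverCount <= 0 || minConnectTimeoutMs < 0) {
            throw new IllegalArgumentException("timeouts and server count must be positive");
        }
        // Same division as the quoted constructor.
        int perServer = sessionTimeoutMs / serverCount;
        // Never go below the floor, however many servers there are.
        int floored = Math.max(perServer, minConnectTimeoutMs);
        // But never wait longer for one server than the whole session timeout.
        return Math.min(floored, sessionTimeoutMs);
    }

    public static void main(String[] args) {
        int sessionTimeoutMs = 5000;    // ZKFC default from this thread
        int minConnectTimeoutMs = 3000; // assumed floor, chosen to cover a 3 s server response
        for (int servers : new int[] {3, 5, 9}) {
            System.out.printf("servers=%d connectTimeout=%d ms%n",
                    servers, connectTimeoutMs(sessionTimeoutMs, servers, minConnectTimeoutMs));
        }
    }
}
```

The cap at the session timeout is a design choice in this sketch: without it, a floor larger than the session timeout would let a single connection attempt outlast the session itself.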



Regards

Shirsha




Re: How to modify Client Connection timer

Posted by Patrick Hunt <ph...@apache.org>.
You should be able to control that by increasing the session timeout.
However, I'm not familiar with the client code you are using. Additionally,
the ZK client should retry its connection continually (as long as you don't
close the ZK object) until it is able to reconnect.
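
As a rough sizing aid, and assuming the relationship from the ClientCnxn constructor quoted elsewhere in this thread (connectTimeout = sessionTimeout / number of servers), the session timeout needed for a given connect timeout can be estimated as below. The class SessionTimeoutSizing and its method are hypothetical, and the 4000 ms target is only an example that leaves some margin over the reported 3 second server response.

```java
// Hypothetical sizing helper; not part of ZooKeeper or Hadoop.
final class SessionTimeoutSizing {
    private SessionTimeoutSizing() {}

    // Smallest session timeout for which sessionTimeout / serverCount
    // is at least the desired connect timeout.
    static long minSessionTimeoutMs(long desiredConnectTimeoutMs, int serverCount) {
        if (desiredConnectTimeoutMs <= 0 || serverCount <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return Math.multiplyExact(desiredConnectTimeoutMs, (long) serverCount);
    }

    public static void main(String[] args) {
        long desiredConnectTimeoutMs = 4000; // example target, above the 3 s response time
        int serverCount = 3;                 // ensemble size from the original post
        long sessionTimeoutMs = minSessionTimeoutMs(desiredConnectTimeoutMs, serverCount);
        System.out.printf("session timeout >= %d ms gives connect timeout %d ms%n",
                sessionTimeoutMs, sessionTimeoutMs / serverCount);
    }
}
```

Two caveats apply. A longer session timeout also means a dead client takes longer to be detected, and the server may negotiate a different session timeout within its configured minimum and maximum, so the value requested is not necessarily the one in effect after connecting.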

Patrick
