Posted to issues@hbase.apache.org by "Liyin Tang (Created) (JIRA)" <ji...@apache.org> on 2011/10/10 21:52:30 UTC

[jira] [Created] (HBASE-4568) Make zk dump jsp response more quickly

Make zk dump jsp response more quickly
--------------------------------------

                 Key: HBASE-4568
                 URL: https://issues.apache.org/jira/browse/HBASE-4568
             Project: HBase
          Issue Type: Improvement
            Reporter: Liyin Tang
            Assignee: Liyin Tang


Currently, every zk dump request makes hbase create a new zk client instance.
This is quite slow when any machine in the quorum is dead, because the new client connects to each machine in the zk quorum again.

  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
  Configuration conf = master.getConfiguration();
  HBaseAdmin hbadmin = new HBaseAdmin(conf);
  HConnection connection = hbadmin.getConnection();
  ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();


So we can simplify this to:
  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
  ZooKeeperWatcher watcher = master.getZooKeeperWatcher();


Also, when hbase calls getServerStats() for each machine in the zk quorum, the timeout is hard-coded to a default of 1 min.
It would be nice to make this configurable and set it to a lower timeout.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Liyin Tang (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-4568:
------------------------------

    Description: 
Currently, every zk dump request makes hbase create a new zk client instance.
This is quite slow when any machine in the quorum is dead, because the new client connects to each machine in the zk quorum again.

  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
  Configuration conf = master.getConfiguration();
  HBaseAdmin hbadmin = new HBaseAdmin(conf);
  HConnection connection = hbadmin.getConnection();
  ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();


So we can simplify this to:
  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
  ZooKeeperWatcher watcher = master.getZooKeeperWatcher();


Also, when hbase calls getServerStats() for each machine in the zk quorum, the timeout is hard-coded to a default of 1 min.
It would be nice to make this configurable and set it to a lower timeout.

When hbase connects to each machine in the zk quorum, it creates the socket first, then sets the socket timeout, and then reads from the socket with this timeout.
This means the initial connect to the zk server is made with a timeout of 0, which can take a long time,
because a timeout of zero is interpreted as an infinite timeout: the connection blocks until it is established or an error occurs.
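
To illustrate the timeout point, here is a minimal sketch (not the code from the attached patch) of a stat probe that bounds both the connect and the read. The class name ZkStatProbe and the timeoutMs parameter are made up for this example; the only ZooKeeper-specific assumption is that the server answers the four-letter "stat" command and then closes the connection.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not an HBase class.
public class ZkStatProbe {

    /** Sends the "stat" command to host:port and returns the reply lines. */
    public static List<String> getServerStats(String host, int port, int timeoutMs)
            throws IOException {
        // Create the socket unconnected, so the connect itself can be bounded.
        // new Socket(host, port) would connect immediately with no timeout.
        try (Socket socket = new Socket()) {
            // Note: a value of 0 here would still mean "wait indefinitely".
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bound every subsequent blocking read as well.
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write("stat".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            List<String> lines = new ArrayList<>();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            // Read until the server closes the connection.
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        }
    }
}
```

With this shape, a dead quorum member costs at most roughly timeoutMs for the connect plus timeoutMs per read, and the value could come from the configuration instead of being hard-coded.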




    


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128377#comment-13128377 ] 

Hudson commented on HBASE-4568:
-------------------------------

Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/])
    HBASE-4568 Make zk dump jsp response faster

nspiegelberg : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/trunk/src/main/resources/hbase-webapps/master/zk.jsp

                
> Make zk dump jsp response more quickly
> --------------------------------------
>
>                 Key: HBASE-4568
>                 URL: https://issues.apache.org/jira/browse/HBASE-4568
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>             Fix For: 0.92.0, 0.94.0
>
>         Attachments: HBASE-4568.patch


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127252#comment-13127252 ] 

Hudson commented on HBASE-4568:
-------------------------------

Integrated in HBase-0.92 #64 (See [https://builds.apache.org/job/HBase-0.92/64/])
    HBASE-4568  Make zk dump jsp response faster

nspiegelberg : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java
* /hbase/branches/0.92/src/main/resources/hbase-webapps/master/zk.jsp

                


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127060#comment-13127060 ] 

jiraposter@reviews.apache.org commented on HBASE-4568:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2385/
-----------------------------------------------------------

(Updated 2011-10-13 22:36:46.694728)


Review request for hbase, Dhruba Borthakur, Michael Stack, Jonathan Gray, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Karthik Ranganathan, and Nicolas Spiegelberg.


Summary
-------

1) Currently, every zk dump request makes hbase create a new zk client instance.
This is quite slow when any machine in the quorum is dead, because the new client connects to each machine in the zk quorum again.

<code>
HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
Configuration conf = master.getConfiguration();
HBaseAdmin hbadmin = new HBaseAdmin(conf);
HConnection connection = hbadmin.getConnection();
ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();
</code>

So we can simplify this to:
<code>
HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
ZooKeeperWatcher watcher = master.getZooKeeperWatcher();
</code>

2) Also, when hbase calls getServerStats() for each machine in the zk quorum, the timeout is hard-coded to a default of 1 min.
It would be nice to make this configurable and set it to a lower timeout.

When hbase connects to each machine in the zk quorum, it creates the socket first, then sets the socket timeout, and then reads from the socket with this timeout.
This means the initial connect to the zk server is made with a timeout of 0, which can take a long time,
because a timeout of zero is interpreted as an infinite timeout: the connection blocks until it is established or an error occurs.

3) The recoverable zookeeper should use real exponential backoff when there is a connection loss exception, which will give hbase a much longer time window to recover from zk machine failures.
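
As a rough illustration of point 3, the sketch below shows one way to do exponential backoff between retries. It is not the RetryCounter or RecoverableZooKeeper code from the patch: the class BackoffRetry, its method names, and the baseMs/maxMs parameters are all hypothetical, and it retries on any exception, whereas real code would presumably retry only on connection loss.

```java
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not an HBase class.
public class BackoffRetry {

    /** Sleep time before retry number attempt (0-based): baseMs * 2^attempt, capped at maxMs. */
    public static long backoffMillis(long baseMs, int attempt, long maxMs) {
        if (attempt >= 62) {
            return maxMs;                 // avoid overflowing the shift below
        }
        long factor = 1L << attempt;      // 1, 2, 4, 8, ...
        if (baseMs > maxMs / factor) {
            return maxMs;                 // overflow-safe cap
        }
        return baseMs * factor;
    }

    /** Runs op, retrying up to maxRetries times with exponentially growing sleeps. */
    public static <T> T callWithRetries(Callable<T> op, int maxRetries,
                                        long baseMs, long maxMs) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Generic stand-in: a real client would only retry connection loss.
                if (attempt >= maxRetries) {
                    throw e;              // retries exhausted, surface the last error
                }
                Thread.sleep(backoffMillis(baseMs, attempt, maxMs));
            }
        }
    }
}
```

With baseMs = 100 and maxMs = 10000, the sleeps grow as 100, 200, 400, 800 ms and so on up to the cap, so the total retry window is much longer than with a fixed sleep of the same base value.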


This addresses bug HBASE-4568.
    https://issues.apache.org/jira/browse/HBASE-4568


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java 61ea552 
  src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java b8c4f61 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 699a5f5 
  src/main/resources/hbase-webapps/master/zk.jsp b31d94c 

Diff: https://reviews.apache.org/r/2385/diff


Testing
-------

Running all the unit tests


Thanks,

Liyin


                


[jira] [Resolved] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Nicolas Spiegelberg (Resolved) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg resolved HBASE-4568.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.94.0
                   0.92.0
     Hadoop Flags: Reviewed
    


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126117#comment-13126117 ] 

stack commented on HBASE-4568:
------------------------------

+1 Liyin.
                


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127058#comment-13127058 ] 

jiraposter@reviews.apache.org commented on HBASE-4568:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2385/
-----------------------------------------------------------

(Updated 2011-10-13 22:33:54.109200)


Review request for hbase, Dhruba Borthakur, Michael Stack, Jonathan Gray, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Karthik Ranganathan, and Nicolas Spiegelberg.


Changes
-------

Remove trailing space.



This addresses bug HBASE-4568.
    https://issues.apache.org/jira/browse/HBASE-4568


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java 61ea552 
  src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java b8c4f61 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 699a5f5 
  src/main/resources/hbase-webapps/master/zk.jsp b31d94c 

Diff: https://reviews.apache.org/r/2385/diff


Testing
-------

Running all the unit tests


Thanks,

Liyin


                


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127048#comment-13127048 ] 

jiraposter@reviews.apache.org commented on HBASE-4568:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2385/
-----------------------------------------------------------

Review request for hbase, Dhruba Borthakur, Michael Stack, Jonathan Gray, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Karthik Ranganathan, and Nicolas Spiegelberg.



This addresses bug HBASE-4568.
    https://issues.apache.org/jira/browse/HBASE-4568


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java 61ea552 
  src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java b8c4f61 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 699a5f5 
  src/main/resources/hbase-webapps/master/zk.jsp b31d94c 

Diff: https://reviews.apache.org/r/2385/diff


Testing
-------

Running all the unit tests


Thanks,

Liyin


                


[jira] [Updated] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Liyin Tang (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-4568:
------------------------------

    Description: 
1) Currently, every zk dump request makes hbase create a new zk client instance.
This is quite slow when any machine in the quorum is dead, because the new client connects to each machine in the zk quorum again.

<code>
HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
Configuration conf = master.getConfiguration();
HBaseAdmin hbadmin = new HBaseAdmin(conf);
HConnection connection = hbadmin.getConnection();
ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();
</code>

We can simplify this by reusing the ZooKeeperWatcher the master already holds:
<code>
HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
ZooKeeperWatcher watcher = master.getZooKeeperWatcher();
</code>

2) Also, when hbase calls getServerStats() for each machine in the zk quorum, the default timeout is hard-coded to 1 minute.
It would be nice to make this configurable and to set it to a low value.

When hbase connects to a machine in the zk quorum, it creates the socket first and only then sets the socket timeout, which applies to the reads that follow.
This means the initial connect to the zk server runs with a timeout of 0, which can take a long time,
because a timeout of zero is interpreted as an infinite timeout: the connection blocks until it is established or an error occurs.

3) The recoverable zookeeper should use real exponential backoff when it hits a connection loss exception, which gives hbase a much longer time window to recover from zk machine failures.
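
Items 2) and 3) above could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patch: the class name ZkProbeSketch, the helper names openWithTimeout and backoffMillis, and the sample values in main are all made up here. The first helper passes the timeout to connect() itself, so the connect is bounded as well as the reads; the second doubles the wait after each failed attempt, up to a cap.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch; none of these names come from the HBase source.
class ZkProbeSketch {

    /** Opens a socket whose connect and subsequent reads are both bounded by timeoutMs. */
    static Socket openWithTimeout(String host, int port, int timeoutMs) throws IOException {
        Socket socket = new Socket(); // unconnected, so nothing blocks yet
        try {
            // Bound the connect itself. new Socket(host, port) would connect
            // immediately with no timeout, before setSoTimeout() can be called.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            // Bound every read that follows.
            socket.setSoTimeout(timeoutMs);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /**
     * Exponential backoff: baseMs doubled once per attempt (zero-based), capped at maxMs.
     * Assumes baseMs > 0 and maxMs > 0.
     */
    static long backoffMillis(long baseMs, int attempt, long maxMs) {
        long delay = baseMs;
        for (int i = 0; i < attempt; i++) {
            // Jump to the cap instead of doubling past it, which also avoids overflow.
            delay = (delay > maxMs / 2) ? maxMs : delay * 2;
        }
        return Math.min(delay, maxMs);
    }

    public static void main(String[] args) {
        // Print the wait before each of the first few retries (sample values only).
        for (int attempt = 0; attempt < 6; attempt++) {
            System.out.println("attempt " + attempt + ": wait "
                + backoffMillis(1000, attempt, 60000) + " ms");
        }
    }
}
```

With a low timeout, a dead quorum member costs at most roughly timeoutMs per probe instead of blocking until the OS gives up on the connect.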

  was:
For each zk dump, currently hbase will create a zk client instance every time. 
This is quite slow when any machines in the quorum is dead. Because it will connect to each machine in the zk quorum again.

  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
  Configuration conf = master.getConfiguration();
  HBaseAdmin hbadmin = new HBaseAdmin(conf);
  HConnection connection = hbadmin.getConnection();
  ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();


So we can simplify this:
  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
  ZooKeeperWatcher watcher = master.getZooKeeperWatcher();


Also when hbase call getServerStats() for each machine in the zk quorum, it hard coded the default time out as 1 min. 
It would be nice to make this configurable and set it to a low time out.

When hbase tries to connect to each machine in the zk quorum, it will create the socket, and then set the socket time out, and read it with this time out.
It means hbase will create a socket and connect to the zk server with 0 time out at first, which will take a long time. 
Because a timeout of zero is interpreted as an infinite timeout. The connection will then block until established or an error occurs.




    


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127067#comment-13127067 ] 

jiraposter@reviews.apache.org commented on HBASE-4568:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2385/#review2574
-----------------------------------------------------------

+1. Changes look good, Liyin.

- Kannan

On 2011-10-13 22:36:46, Liyin wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2385/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-13 22:36:46)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, Jonathan Gray, Mikhail Bautin, Pritam Damania, Prakash Khemani, Amitanand Aiyer, Kannan Muthukkaruppan, Jerry Chen, Karthik Ranganathan, and Nicolas Spiegelberg.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  1) For each zk dump, currently hbase will create a zk client instance every time. 
bq.  This is quite slow when any machines in the quorum is dead. Because it will connect to each machine in the zk quorum again.
bq.  
bq.  <code>
bq.  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
bq.  Configuration conf = master.getConfiguration();
bq.  HBaseAdmin hbadmin = new HBaseAdmin(conf);
bq.  HConnection connection = hbadmin.getConnection();
bq.  ZooKeeperWatcher watcher = connection.getZooKeeperWatcher();
bq.  </code>
bq.  
bq.  So we can simplify this:
bq.  <code>
bq.  HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER);
bq.  ZooKeeperWatcher watcher = master.getZooKeeperWatcher();
bq.  </code>
bq.  
bq.  2) Also when hbase call getServerStats() for each machine in the zk quorum, it hard coded the default time out as 1 min. 
bq.  It would be nice to make this configurable and set it to a low time out.
bq.  
bq.  When hbase tries to connect to each machine in the zk quorum, it will create the socket, and then set the socket time out, and read it with this time out.
bq.  It means hbase will create a socket and connect to the zk server with 0 time out at first, which will take a long time. 
bq.  Because a timeout of zero is interpreted as an infinite timeout. The connection will then block until established or an error occurs.
bq.  
bq.  3) The recoverable zookeeper should be real exponentially backoff when there is connection loss exception, which will give hbase much longer time window to recover from zk machine failures.
bq.  
bq.  
bq.  This addresses bug HBASE-4568.
bq.      https://issues.apache.org/jira/browse/HBASE-4568
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/util/RetryCounter.java 61ea552 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java b8c4f61 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 699a5f5 
bq.    src/main/resources/hbase-webapps/master/zk.jsp b31d94c 
bq.  
bq.  Diff: https://reviews.apache.org/r/2385/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Running all the unit tests
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Liyin
bq.  
bq.



[jira] [Updated] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Nicolas Spiegelberg (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-4568:
---------------------------------------

    Attachment: HBASE-4568.patch

Please remember to attach the patch to JIRA in addition to posting it to Review Board. Hopefully, we'll have RB or Phabricator do this for us automatically in the future.
                


[jira] [Commented] (HBASE-4568) Make zk dump jsp response more quickly

Posted by "Nicolas Spiegelberg (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127150#comment-13127150 ] 

Nicolas Spiegelberg commented on HBASE-4568:
--------------------------------------------

+1.  awesome job!
                
