Posted to issues@hbase.apache.org by "nkeywal (JIRA)" <ji...@apache.org> on 2012/05/19 10:47:02 UTC

[jira] [Created] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

nkeywal created HBASE-6058:
------------------------------

             Summary: Use ZK 3.4 API 'multi' in bulk assignment
                 Key: HBASE-6058
                 URL: https://issues.apache.org/jira/browse/HBASE-6058
             Project: HBase
          Issue Type: Improvement
          Components: master, zookeeper
    Affects Versions: 0.96.0
            Reporter: nkeywal
            Assignee: nkeywal
            Priority: Minor


We use the async API today, which is already much faster than the sync API. Still, it makes sense to use the 'multi' function: it will decrease the network and ZooKeeper load at startup and during rolling restarts.

On a 500-node cluster, we see that 3 seconds are spent updating ZK per bulk assignment. This should cut it in half (plus the benefits on the network/ZK load).
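
To make the load argument concrete, here is a rough, self-contained model of the request count only. It is not HBase or ZooKeeper code: the class name, the znode paths and the batch size of 100 are all made up for illustration. It assumes one client request per znode with the async API, and one request per batch with 'multi' (in ZK 3.4, ZooKeeper#multi takes a list of Op and applies them as a single atomic request).

```java
import java.util.ArrayList;
import java.util.List;

/** Back-of-the-envelope model of request counts; hypothetical, not HBase code. */
public class MultiBatchModel {

    /** Splits znode paths into batches of at most batchSize; each batch stands for one request. */
    public static List<List<String>> batch(List<String> paths, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < paths.size(); i += batchSize) {
            int end = Math.min(i + batchSize, paths.size());
            batches.add(new ArrayList<>(paths.subList(i, end)));
        }
        return batches;
    }

    /** Number of requests needed for the given number of regions: ceil(regions / batchSize). */
    public static int requestCount(int regions, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (regions + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            paths.add("/hbase/unassigned/region-" + i); // illustrative path, an assumption
        }
        // Batch size 1 models today's behaviour: one async request per znode.
        System.out.println("one request per znode: " + batch(paths, 1).size());
        // Batch size 100 models grouping the same updates into multi calls.
        System.out.println("batches of 100: " + batch(paths, 100).size());
    }
}
```

This only counts requests. It says nothing about latency, which depends on where the time is really spent.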

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

Posted by "Jesse Yates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397917#comment-13397917 ] 

Jesse Yates commented on HBASE-6058:
------------------------------------

I think the consensus is that we don't move until 3.4 is considered stable. I've got a couple of other patches up for various other zkutil things, for when we do move. Hopefully this happens around 0.96, especially since Patrick Hunt was saying that there haven't been any major issues around 3.4 (particularly around the multi stuff).
                


        

[jira] [Commented] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

Posted by "Jesse Yates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398549#comment-13398549 ] 

Jesse Yates commented on HBASE-6058:
------------------------------------

bq. supporting both ZK versions (with and without multi) would be asking for trouble, imho

Totally agree.

bq. Anyway, I will redo some performance tests to see where we stand with the current implementation

Great! I'd love to see how big of a difference it makes. 
                


        

[jira] [Commented] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

Posted by "nkeywal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508963#comment-13508963 ] 

nkeywal commented on HBASE-6058:
--------------------------------

I've tested ZK#multi on assignment in the master: there is no real improvement, because we're actually spending our time in the region server. We lower the load on ZK, but that would be visible only on a large cluster. As using multi would require ZK 3.4, it's not compelling enough to make the move.

Note that this is because:
- the master part I've changed is doing asynchronous writes, which are faster than synchronous writes
- ZooKeeper is doing nothing else in this test. On a large cluster, the gain would be more interesting.
- there is no real bulk assignment in the region server (i.e. a regionserver receiving 20 regions simultaneously). So we don't need multi there today.

                


[jira] [Commented] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

Posted by "nkeywal (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398282#comment-13398282 ] 

nkeywal commented on HBASE-6058:
--------------------------------

All the tests I made with multi were successful. But they were only tests :-).
Note that it's not totally trivial to use it in bulk assignment, because we have two levels of asynchronous calls (the callback calls another asynchronous function). So supporting both ZK versions (with and without multi) would be asking for trouble, imho.
And fixing ZOOKEEPER-1381 would help with deployment: today we hang if we call multi on a 3.3 ZK server...
Anyway, I will redo some performance tests to see where we stand with the current implementation.
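
For readers unfamiliar with the "two levels of asynchronous calls" shape, here is a minimal sketch using only the JDK. Everything in it is hypothetical: FakeAsyncStore is an in-memory stand-in rather than the ZooKeeper client, and using an existence check as the second-level call is an assumption made for illustration. The point is just that the follow-up call is issued from inside the first callback, once per region, which is what makes grouping the operations into batches less straightforward.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Sketch of nested async callbacks; hypothetical, not the HBase implementation. */
public class TwoLevelAsyncSketch {

    /** In-memory stand-in for an asynchronous client (an assumption, not the ZK API). */
    static class FakeAsyncStore {
        private final Map<String, Boolean> nodes = new ConcurrentHashMap<>();
        private final ExecutorService pool = Executors.newSingleThreadExecutor();

        /** First level: create the node, then invoke the callback with its path. */
        void asyncCreate(String path, Consumer<String> callback) {
            pool.execute(() -> {
                nodes.put(path, Boolean.TRUE);
                callback.accept(path);
            });
        }

        /** Second level: check the node exists, then invoke the callback with the result. */
        void asyncExists(String path, Consumer<Boolean> callback) {
            pool.execute(() -> callback.accept(nodes.containsKey(path)));
        }

        void shutdown() {
            pool.shutdown();
        }
    }

    /** Runs create-then-exists for each region and returns how many were confirmed. */
    public static int assignAll(int regions) {
        FakeAsyncStore store = new FakeAsyncStore();
        CountDownLatch done = new CountDownLatch(regions);
        AtomicInteger confirmed = new AtomicInteger();
        for (int i = 0; i < regions; i++) {
            String path = "/unassigned/region-" + i; // illustrative path
            // The first callback issues the second asynchronous call.
            store.asyncCreate(path, created ->
                store.asyncExists(created, exists -> {
                    if (exists) {
                        confirmed.incrementAndGet();
                    }
                    done.countDown();
                }));
        }
        try {
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for callbacks");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            store.shutdown();
        }
        return confirmed.get();
    }

    public static void main(String[] args) {
        System.out.println("confirmed regions: " + assignAll(20));
    }
}
```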

                


        

[jira] [Commented] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279559#comment-13279559 ] 

Zhihong Yu commented on HBASE-6058:
-----------------------------------

There hasn't been consensus as to which HBase version would require ZK 3.4 as the minimum supported version.
                


        

[jira] [Commented] (HBASE-6058) Use ZK 3.4 API 'multi' in bulk assignment

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397923#comment-13397923 ] 

Andrew Purtell commented on HBASE-6058:
---------------------------------------

We run 3.4 in production and have had no issues.

Multi has not had much use though, relative to the remainder of the ZK API. IMO, 0.96 is a good target for use of it, and we should get such usage in soon enough so we can have plenty of experience with it before cutting a release that uses it.
                
