Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2010/09/21 08:43:34 UTC

[jira] Created: (HBASE-3019) Make bulk assignment on cluster startup run faster

Make bulk assignment on cluster startup run faster
--------------------------------------------------

                 Key: HBASE-3019
                 URL: https://issues.apache.org/jira/browse/HBASE-3019
             Project: HBase
          Issue Type: Improvement
            Reporter: stack


Currently, as of HBASE-3018, we come up with a bulk assignment plan that is sorted by server.  We then spawn a thread per server to assign out its regions, so we are assigning in parallel.  This works but is still slow (it looks to be slower than the old assignment, where we'd do lumps of N regions at a time).  We should be able to pass a regionserver all the regions to open in one RPC.  We need to figure out how to keep zk state up to date while a regionserver is processing a big lot of regions.  This looks a little awkward to do since currently the open handler just opens the region -- there is no notion of doing a ping while waiting to run.
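
As a rough illustration of a plan "sorted by server", the sketch below groups a region-to-server mapping into one list of regions per server, so that each list could be handed to its regionserver in a single call. It is not the HBase implementation: the class and method names are hypothetical, and plain strings stand in for region and server types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not code from the HBase AssignmentManager.
public class BulkPlanSketch {

    /**
     * Turns a region -> server mapping into a server -> regions plan.
     * A TreeMap keeps the plan ordered by server name.
     */
    static Map<String, List<String>> groupByServer(Map<String, String> regionToServer) {
        Map<String, List<String>> plan = new TreeMap<>();
        for (Map.Entry<String, String> e : regionToServer.entrySet()) {
            // One list per server; each list is a candidate for a single bulk-open call.
            plan.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return plan;
    }
}
```

With a plan in this shape, the number of calls to regionservers depends on the number of servers rather than the number of regions.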

Being able to start the cluster fast is important for those times we take it down to do a major upgrade; the longer it takes to spin up, the longer our 'downtime'.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-3019) Make bulk assignment on cluster startup run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3019:
-------------------------

    Attachment: bulk-v10.txt

What was committed (what was on Review Board plus the changes Jon and Ted suggested).

[jira] Updated: (HBASE-3019) Make bulk assignment on cluster startup run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3019:
-------------------------

    Attachment: bulk-v4.txt

Patch that adds a bulk open region to the regionserver and that then has the assignment manager do bulk operations per server.

Currently, this patch does not make assignments faster than what we have now.  We are talking about 3 minutes to assign 2k regions across 9 servers currently vs 4 to 5 minutes with this patch.

Patch has a concurrency issue and I will play with it some more, but it seems like zk is the bottleneck -- all the state changes that happen for a region assignment.

[jira] Updated: (HBASE-3019) Make bulk assignment on cluster startup run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3019:
-------------------------

    Attachment: bulk-v7.txt

So, I'm giving up for now on this tactic of trying to assign in bulk.  It's slower than what's in place currently, mostly because we bulk set state in zk first, before we proceed to send the bulk region open to the regionserver.  The bulk setting of state in zk takes time, and parts of it need to be done under a synchronization block so regionsInTransition can be updated atomically.  In effect, we proceed serially through the servers.  Also, there's a problem transitioning states; I've put a note in the patch.  Before moving region state to PENDING_OPEN, we need to wait on the zk callback that confirms setting state to OFFLINE.  Without this, PENDING_OPEN can be set before OFFLINE has finished and we'll get ourselves into an unwanted state.  To go further with this patch, we would need to change our zking to be async.
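
The ordering constraint described above (do not move to PENDING_OPEN until the OFFLINE write has been confirmed) can be sketched with chained futures. This is only an illustration under stated assumptions: setStateAsync is a hypothetical stand-in for an asynchronous zk write whose future completes when the confirming callback fires, and nothing here uses the real ZooKeeper or HBase APIs.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; the state store is simulated with an in-memory log.
public class StateOrderingSketch {
    enum State { OFFLINE, PENDING_OPEN }

    // Records state changes in the order they are applied.
    final List<String> log = new CopyOnWriteArrayList<>();

    // Stand-in for an async state write; completion plays the role of the callback.
    CompletableFuture<Void> setStateAsync(String region, State state) {
        return CompletableFuture.runAsync(() -> log.add(region + ":" + state));
    }

    // PENDING_OPEN is only requested after the OFFLINE write has completed.
    CompletableFuture<Void> assign(String region) {
        return setStateAsync(region, State.OFFLINE)
            .thenCompose(ignored -> setStateAsync(region, State.PENDING_OPEN));
    }
}
```

Because the second write is composed onto the first, many regions can be in flight at once without any single region skipping ahead of its own OFFLINE confirmation.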

Though I'm giving up on this bulk assign, I will reuse most of this patch in a new issue, HBASE-3055, as it improves general bulk assign.





[jira] Commented: (HBASE-3019) Make bulk assignment on cluster startup run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916627#action_12916627 ] 

stack commented on HBASE-3019:
------------------------------

Duh.  So I was doing assignments twice in the patch; once in the Executor and once inside the loop that added the Executor Runnable.

So, after more messing on the cluster, this patch runs 10-20% faster than the old way of bulk assigning (almost 4 minutes for the old way vs just under 3 1/2 minutes for this bulk load patch, loading 2k regions over 9 servers).  There isn't much in it, but this patch should be a bit more robust than what was there previously and will do better on a bigger cluster since it has a bounded ExecutorService rather than a thread per RS.
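
For readers unfamiliar with the distinction, here is a minimal sketch of a bounded ExecutorService driving one bulk-open task per server, as opposed to starting a thread per RS. The class name, method name, and the bulkOpen callback are assumptions made for illustration; they are not taken from the committed patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

// Hypothetical sketch; not the committed HBase code.
public class BoundedBulkAssignSketch {

    /**
     * Submits one task per server to a fixed-size pool and returns the
     * total number of regions handed out. bulkOpen stands in for the
     * per-server bulk open call.
     */
    static int assignAll(Map<String, List<String>> plan, int maxThreads,
                         BiConsumer<String, List<String>> bulkOpen) {
        // The pool size caps concurrency no matter how many servers there are.
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Map.Entry<String, List<String>> e : plan.entrySet()) {
                // The work happens only inside the submitted task, never in this loop.
                results.add(pool.submit(() -> {
                    bulkOpen.accept(e.getKey(), e.getValue());
                    return e.getValue().size();
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("bulk assign failed", ex);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a fixed pool, a cluster of hundreds of servers still uses at most maxThreads threads on the master side; the remaining per-server tasks wait in the pool's queue.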

I tried various things, like assigning in bulk ten regions at a time (doing the zk update and open rpc ten at a time), but it seemed to make no difference.  Also tried waiting on one server updating all in zk and doing its bulk open before moving to the next, but that seemed slower.

Putting patch up for review.

[jira] Resolved: (HBASE-3019) Make bulk assignment on cluster startup run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-3019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3019.
--------------------------

     Hadoop Flags: [Reviewed]
         Assignee: stack
    Fix Version/s: 0.90.0
       Resolution: Fixed

Committed.  Thanks for the review, Jon and Ted.
