Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/10/07 22:07:31 UTC

[jira] Created: (HBASE-1892) [performance] make hbase splits run faster

[performance] make hbase splits run faster
------------------------------------------

                 Key: HBASE-1892
                 URL: https://issues.apache.org/jira/browse/HBASE-1892
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: stack
             Fix For: 0.21.0


hbase-1506 tried and failed to make splits faster in the 0.20 context.  This issue is about doing it in 0.21, where we'll have the tools to do it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1892) [performance] make hbase splits run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836903#action_12836903 ] 

stack commented on HBASE-1892:
------------------------------

One hump in the way of faster splitting is that we do a flush while the close flag is up.  On a loaded hdfs just now, I saw this flush take 10 seconds to write out an hfile of 40M, which ain't the worst, but during those ten seconds the close flag on the region was up, so anyone trying to write to this region -- and probably read from it -- was barred access and was probably off wandering in the wilderness looking for where the region now resides.


[jira] Commented: (HBASE-1892) [performance] make hbase splits run faster

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791710#action_12791710 ] 

Andrew Purtell commented on HBASE-1892:
---------------------------------------

bq. Maybe we should add to the RETRY_BACKOFF a bunch more 1s at the start... and maybe down the hbase.client.pause default from 2 seconds to 1.

And up the default number of retries. 

Sounds good to me. 


[jira] Commented: (HBASE-1892) [performance] make hbase splits run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791719#action_12791719 ] 

stack commented on HBASE-1892:
------------------------------

And maybe change the default period to 1 second instead of 3, or 2 even?  On small clusters it's probably not a problem having them check in a lot?


[jira] Commented: (HBASE-1892) [performance] make hbase splits run faster

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791721#action_12791721 ] 

Jean-Daniel Cryans commented on HBASE-1892:
-------------------------------------------

Been thinking about exactly that a lot. +1


[jira] Resolved: (HBASE-1892) [performance] make hbase splits run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1892.
--------------------------

    Hadoop Flags: [Reviewed]
      Resolution: Fixed

Committed to two branches and to trunk.  Thanks for the review, Jon.  In my testing this patch definitely seems to help with getting regions online faster.  I did as you suggested, lowering the threshold to 5MB and adding the configuration to hbase-default.xml.
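
A minimal sketch of the pre-flush idea, going only by the attachment name (1892-preflush.patch), the 5MB threshold mentioned above, and the earlier observation that the flush ran while the close flag was up. The class, field, and method names below are hypothetical and are not taken from the patch; the sketch assumes the large flush is simply moved ahead of raising the close flag.

```java
public class PreflushSketch {
    // 5MB threshold, the value mentioned in the resolution comment.
    static final long PREFLUSH_THRESHOLD_BYTES = 5L * 1024 * 1024;

    /** Hypothetical stand-in for a region; this is not the real HRegion class. */
    static class Region {
        long memstoreBytes;
        boolean closing;               // the "close flag": clients are turned away while true
        long bytesFlushedWhileClosing; // flush work done while clients are locked out

        Region(long memstoreBytes) {
            this.memstoreBytes = memstoreBytes;
        }

        void flush() {
            if (closing) {
                bytesFlushedWhileClosing += memstoreBytes;
            }
            memstoreBytes = 0;
        }

        void close(boolean preflush) {
            if (preflush && memstoreBytes >= PREFLUSH_THRESHOLD_BYTES) {
                flush(); // flag is still down, so the region keeps serving
            }
            closing = true;
            // Final flush picks up whatever is left. A real region could take
            // new writes during the pre-flush; this sketch ignores that.
            flush();
        }
    }

    /** Bytes flushed with the close flag up, for a region of the given memstore size. */
    static long flushedWhileClosing(long memstoreBytes, boolean preflush) {
        Region region = new Region(memstoreBytes);
        region.close(preflush);
        return region.bytesFlushedWhileClosing;
    }

    public static void main(String[] args) {
        long fortyMb = 40L * 1024 * 1024;
        System.out.println(flushedWhileClosing(fortyMb, false)); // 41943040
        System.out.println(flushedWhileClosing(fortyMb, true));  // 0
    }
}
```

With a 40M memstore like the one in the earlier comment, the whole flush happens under the close flag without the pre-flush, and none of it does with the pre-flush (in this simplified model).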

> [performance] make hbase splits run faster
> ------------------------------------------
>
>                 Key: HBASE-1892
>                 URL: https://issues.apache.org/jira/browse/HBASE-1892
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.20.4, 0.20.5, 0.21.0
>
>         Attachments: 1892-preflush.patch, HBASE-1892.patch
>
>
> hbase-1506 tried and failed to make splits faster in the 0.20 context.  This issue is about doing it in 0.21, where we'll have the tools to do it.


[jira] Commented: (HBASE-1892) [performance] make hbase splits run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791697#action_12791697 ] 

stack commented on HBASE-1892:
------------------------------

Here are some observations by J-D from a recent (offlist) mail:

{code}
So, the way it currently works, you should expect pauses of 6 to 10
seconds every time there's a split. If the system has more load, it
can take a bit longer. This is "how it works" in the current master
architecture... [to be changed in 0.21 hbase -- St.Ack]. This configuration:

 <property>
   <name>hbase.regionserver.msginterval</name>
   <value>3000</value>
   <description>Interval between messages from the RegionServer to HMaster
   in milliseconds.  Default is 3 seconds.
   </description>
 </property>

means that every split takes up to 3 seconds to report to master and then
3 more seconds for the master to assign the new regions. Coupled with
that is the client retries backoff. In HConnectionManager we wait
hbase.client.pause (default 2 secs) times

public static int RETRY_BACKOFF[] = { 1, 1, 1, 2, 2, 4, 4, 8, 16, 32 };

in order. So for a split you wait 2 + 2 + 2 seconds in the client. If
for some reason it took longer to split, you will wait 6+2*2=10
seconds (or more).

{code}

A couple of notes on the above.

Not so long ago I changed the regionserver so that as soon as it has a message, it sends it to the master rather than waiting for hbase.regionserver.msginterval to elapse.

My guess is that with more than one regionserver, it would be rare to have to wait the full 3 seconds to send out the assignment of the daughters; a regionserver would likely check in well before then.

But then there is the open of the region and informing master.

Maybe we should add to the RETRY_BACKOFF a bunch more 1s at the start... and maybe down the hbase.client.pause default from 2 seconds to 1.
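
A rough sketch of the client-side arithmetic from the quoted mail, and of what the suggested change would do to it. RETRY_BACKOFF is copied from the mail; the MORE_ONES table, the class name, and the totalWaitMillis helper are hypothetical, and the number of extra 1s is an assumption since the comment only says "a bunch more".

```java
public class BackoffEstimate {
    // Copied from the quoted mail.
    static final int[] RETRY_BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    // Hypothetical table with "a bunch more 1s at the start";
    // the exact count of extra 1s is an assumption.
    static final int[] MORE_ONES = {1, 1, 1, 1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    /** Total milliseconds a client sleeps across the given number of failed tries. */
    static long totalWaitMillis(long pauseMillis, int[] backoff, int tries) {
        long total = 0;
        for (int i = 0; i < tries; i++) {
            // Past the end of the table, keep using the last multiplier (assumption).
            int idx = Math.min(i, backoff.length - 1);
            total += pauseMillis * backoff[idx];
        }
        return total;
    }

    public static void main(String[] args) {
        // hbase.client.pause = 2 seconds, as in the mail: 2 + 2 + 2
        System.out.println(totalWaitMillis(2000, RETRY_BACKOFF, 3)); // 6000
        // One more retry: 6 + 2*2
        System.out.println(totalWaitMillis(2000, RETRY_BACKOFF, 4)); // 10000
        // Proposed: 1 second pause and more 1s up front
        System.out.println(totalWaitMillis(1000, MORE_ONES, 4));     // 4000
    }
}
```

Under these assumptions, four failed tries cost the client 4 seconds instead of 10.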


[jira] Commented: (HBASE-1892) [performance] make hbase splits run faster

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791720#action_12791720 ] 

stack commented on HBASE-1892:
------------------------------

Sorry, the above changes would be for 0.20.3.  In 0.21, it should all be queued up in zk, so it should be more 'live'.  If that sounds good, I'll make a new issue for 0.20.3 to make splits faster.


[jira] Updated: (HBASE-1892) [performance] make hbase splits run faster

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-1892:
--------------------------------------

    Attachment: HBASE-1892.patch

A little experiment trying to improve the splitting process even further. Basically it opens the region right away and then tells the master. I removed the HLog thing because the debug output was filling my disk, sorry.

Unfortunately, it's not faster. It takes a second to split; in the meantime the client first notices that the region (under split) isn't served, then refreshes once while the region server still hasn't updated .META., then waits one second and finally finds the new region.
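
One way to read the result above: the client only looks again after each sleep, so a region that comes online sooner is not seen sooner unless it beats the next retry. A minimal sketch of that effect, assuming a 1 second pause and the RETRY_BACKOFF table quoted elsewhere in this thread. The class and method names are hypothetical, and the immediate refresh described above is folded into the first attempt at time 0.

```java
public class SplitLatencySketch {
    static final int[] RETRY_BACKOFF = {1, 1, 1, 2, 2, 4, 4, 8, 16, 32};

    /**
     * Time (ms since the first failed attempt) at which the client first finds
     * the daughter region, given when the region actually became available.
     * Returns -1 if the retries run out first.
     */
    static long firstSuccessMillis(long readyAtMillis, long pauseMillis, int[] backoff) {
        long now = 0;
        for (int multiplier : backoff) {
            if (now >= readyAtMillis) {
                return now;
            }
            now += pauseMillis * multiplier; // sleep before the next attempt
        }
        return now >= readyAtMillis ? now : -1;
    }

    public static void main(String[] args) {
        // Region online after 1000 ms vs. after 100 ms: the client sees the same delay.
        System.out.println(firstSuccessMillis(1000, 1000, RETRY_BACKOFF)); // 1000
        System.out.println(firstSuccessMillis(100, 1000, RETRY_BACKOFF));  // 1000
        // Slightly slower than one pause costs a whole extra retry.
        System.out.println(firstSuccessMillis(1100, 1000, RETRY_BACKOFF)); // 2000
    }
}
```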
