
Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2012/10/18 01:48:03 UTC

[jira] [Created] (HBASE-7006) [MTTR] Study distributed log splitting to see how we can make it faster

stack created HBASE-7006:
----------------------------

             Summary: [MTTR] Study distributed log splitting to see how we can make it faster
                 Key: HBASE-7006
                 URL: https://issues.apache.org/jira/browse/HBASE-7006
             Project: HBase
          Issue Type: Bug
            Reporter: stack
            Priority: Critical
             Fix For: 0.96.0


Just saw an interesting issue where a cluster went down hard and 30 nodes had 1700 WALs to replay.  Replay took almost an hour.  It looks like it could run faster, since much of the time is spent zk'ing and nn'ing (ZooKeeper and NameNode calls).

Putting in 0.96 so it gets a look at least.  Can always punt.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-7006) [MTTR] Study distributed log splitting to see how we can make it faster

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479513#comment-13479513 ] 

stack commented on HBASE-7006:
------------------------------

[~nkeywal] No sir.  The limit was 8 WALs, but the write rate overran the limit, so there were almost 40 WALs each.
                

[jira] [Commented] (HBASE-7006) [MTTR] Study distributed log splitting to see how we can make it faster

Posted by "nkeywal (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478770#comment-13478770 ] 

nkeywal commented on HBASE-7006:
--------------------------------

Nothing related to HBASE-6738?
Isn't there a limit of 32 WALs per node (hence 900 WALs)? Or have you lost more nodes?
                