Posted to issues@hbase.apache.org by "ramkrishna.s.vasudevan (Created) (JIRA)" <ji...@apache.org> on 2012/04/17 13:40:17 UTC

[jira] [Created] (HBASE-5806) Handle split region related failures on master restart and RS restart

Handle split region related failures on master restart and RS restart
---------------------------------------------------------------------

                 Key: HBASE-5806
                 URL: https://issues.apache.org/jira/browse/HBASE-5806
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.92.1
            Reporter: ramkrishna.s.vasudevan
             Fix For: 0.92.2, 0.96.0, 0.94.1


This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
This also tries to address HBASE-5615.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HBASE-5806:
------------------------------------

    Status: Patch Available  (was: Open)
    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_trunk.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264432#comment-13264432 ] 

stack commented on HBASE-5806:
------------------------------

Chinna, this is good work.  Can you provide more detail on the scenarios you are fixing?  I'm trying to make sure I understand.  For example, for #1 above, we are talking about the regionserver crashing somewhere here (from SplitTransaction):

{code}
    if (!testing) {
      services.removeFromOnlineRegions(this.parent.getRegionInfo().getEncodedName());
    }
    this.journal.add(JournalEntry.OFFLINED_PARENT);
{code}

Is that right?

The patch seems reasonable.  Could we add a test for this?  There are already some tests around splitting that use mocks.  Could we add to these?

Good stuff.
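
For readers following along, the role of the journal in the snippet above can be sketched as below. This is a minimal illustration and not the SplitTransaction code: the class, its methods and the first two journal entries are assumptions made for the example; only OFFLINED_PARENT appears in the snippet quoted in this comment. Each completed step is recorded, and a rollback walks the recorded steps in reverse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical journal-and-rollback sketch; not the HBase implementation.
final class JournalSketch {
  // Only OFFLINED_PARENT is taken from the thread; the other entries are illustrative.
  enum JournalEntry { CREATE_SPLIT_DIR, CLOSED_PARENT_REGION, OFFLINED_PARENT }

  private final List<JournalEntry> journal = new ArrayList<>();

  // Runs the steps in order and stops right after 'last', simulating a failure there.
  void runUntil(JournalEntry last) {
    for (JournalEntry step : JournalEntry.values()) {
      journal.add(step); // record that this step completed
      if (step == last) {
        return;
      }
    }
  }

  // Undoes the recorded steps, most recent first, and reports what was undone.
  List<String> rollback() {
    List<String> undone = new ArrayList<>();
    ListIterator<JournalEntry> it = journal.listIterator(journal.size());
    while (it.hasPrevious()) {
      undone.add("undo " + it.previous());
    }
    journal.clear();
    return undone;
  }

  public static void main(String[] args) {
    JournalSketch split = new JournalSketch();
    split.runUntil(JournalEntry.CLOSED_PARENT_REGION);
    System.out.println(split.rollback());
  }
}
```

A journal held in memory like this one only helps while the process is still alive to run the rollback. If the process itself dies part-way through, the recorded steps are gone with it, so recovery has to rely on whatever state was persisted elsewhere.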
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270682#comment-13270682 ] 

stack commented on HBASE-5806:
------------------------------

bq. So can we add cases for the RS_ZK_SPLIT* conditions with a log message at DEBUG level, since this is expected? If we log at WARN, it may attract unnecessary attention. And then have a default case with a log message at WARN level?

Yes.  Add cases for RS_ZK_SPLIT* and log at DEBUG level that these callbacks are just passing through (or do not log at all if this is 'normal' operation).  Yes, I agree, logging at WARN level will make users think this is an abnormal state, which is not what we want them to think.

I would then restore the default case throwing IllegalStateException rather than logging a WARN.  That way your change is more focused on addressing the issue found by Chinna.  Previously, the patch could also catch illegal states other than RS_ZK_SPLIT*.
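
To make the suggestion concrete, here is a minimal, self-contained sketch of the switch shape described above. It is not the code from the patch: the class name, the process method, the returned strings, the RS_ZK_REGION_OPENED and UNEXPECTED_STATE constants, and the use of java.util.logging (with FINE standing in for DEBUG) are all assumptions made for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the handler under discussion; not HBase code.
final class SplitStateSketch {
  private static final Logger LOG = Logger.getLogger(SplitStateSketch.class.getName());

  // RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT are the states named in this issue.
  // The other two constants are placeholders invented for this example.
  enum EventType { RS_ZK_REGION_OPENED, RS_ZK_REGION_SPLITTING, RS_ZK_REGION_SPLIT, UNEXPECTED_STATE }

  // Returns a short description of what was done, so the behaviour is easy to check.
  static String process(EventType et) {
    switch (et) {
      case RS_ZK_REGION_OPENED:
        return "handled";
      case RS_ZK_REGION_SPLITTING:
      case RS_ZK_REGION_SPLIT:
        // Expected while a split is in flight: log quietly and pass through.
        LOG.log(Level.FINE, "Processed region in state : {0}", et);
        return "passed through";
      default:
        // Anything else is still treated as a genuine error.
        throw new IllegalStateException("Unexpected state : " + et);
    }
  }

  public static void main(String[] args) {
    System.out.println(process(EventType.RS_ZK_REGION_SPLIT));
    try {
      process(EventType.UNEXPECTED_STATE);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The two split states share one branch through case fall-through, so the quiet logging is written once, while the default branch keeps failing loudly for states nobody anticipated.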
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255487#comment-13255487 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

We will come up with a patch soon.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270630#comment-13270630 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

@Stack
So can we add cases for the RS_ZK_SPLIT* conditions with a log message at DEBUG level, since this is expected? If we log at WARN, it may attract unnecessary attention.
And then have a default case with a log message at WARN level?
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270750#comment-13270750 ] 

stack commented on HBASE-5806:
------------------------------

bq. But for other states throwing the IllegalStateException will change the behaviour from the previous.

How is that?  Isn't it this patch that changes the IllegalStateException to a LOG.warn when we get to the 'default' case?  So I'd say all you are changing is the handling of RS_ZK_SPLIT*, which is good since that is all this issue is concerned with.

Good stuff
                

[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HBASE-5806:
------------------------------------

    Attachment: HBASE-5806_trunk_1.patch
                HBASE-5806_0.94_1.patch
    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Closed] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl closed HBASE-5806.
--------------------------------

    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.94.1, 0.96.0
>
>         Attachments: HBASE-5806_0.92.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_0.94.patch, HBASE-5806.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch, HBASE-5806_trunk_3.patch, HBASE-5806_trunk.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272519#comment-13272519 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

Please provide your comments on the latest patch.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch, HBASE-5806_trunk_3.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, which is in RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269986#comment-13269986 ] 

Zhihong Yu commented on HBASE-5806:
-----------------------------------

+1 on patch v2.
{code}
+  private MockMasterWithoutCatalogJanitor abortAndWaitForMaster() throws IOException, InterruptedException {
+    cluster.abortMaster(0);
+    cluster.waitOnMaster(0);
+    cluster.getConfiguration().setClass(HConstants.MASTER_IMPL, MockMasterWithoutCatalogJanitor.class, HMaster.class);
{code}
Please wrap the two lines whose length exceeds 100 chars at time of integration.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274082#comment-13274082 ] 

stack commented on HBASE-5806:
------------------------------

+1 on v3.

For the next time, not meant for this patch, Chinna, you could write the below...

{code}
+      case RS_ZK_REGION_SPLITTING:
+        LOG.debug("Processed region in state : " + et);
+        break;
+      case RS_ZK_REGION_SPLIT:
+        LOG.debug("Processed region in state : " + et);
+        break;
{code}

as

{code}
+      case RS_ZK_REGION_SPLITTING:
+      case RS_ZK_REGION_SPLIT:
+        LOG.debug("Processed region in state : " + et);
+        break;
{code}

...but let's get v3 in.  Good stuff.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269709#comment-13269709 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

Updated patch for 0.94 & trunk.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261732#comment-13261732 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

I uploaded an initial patch; it is not yet tested. I am analyzing this further.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272546#comment-13272546 ] 

Zhihong Yu commented on HBASE-5806:
-----------------------------------

The 3 tests reported by Hadoop QA passed locally with latest patch.

+1 from me.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274836#comment-13274836 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

Committed to trunk, 0.94 and 0.92.
Thanks for the patch, Chinna.
Thanks for the review, Stack and Ted.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274439#comment-13274439 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

Thanks Stack.  I will commit it tonight. (India time).
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270703#comment-13270703 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

@Stack
I feel adding cases for RS_ZK_SPLIT* and logging at DEBUG level is ok.

But for other states, throwing the IllegalStateException will change the behaviour from the previous. We need to check whether that is ok for the states other than RS_ZK_SPLIT*. Previously, throwing the exception is what broke regions in state RS_ZK_SPLIT*, so I need to check the other states thoroughly.
                

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271613#comment-13271613 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

Updated the patch to log a DEBUG-level message for the RS_ZK_SPLIT* cases and to throw an exception in the default case.
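
The shape of that change can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the patch: the class RitEventSwitchSketch, its EventType enum (including the made-up UNKNOWN_EVENT value), and the in-memory debugLog list are all hypothetical stand-ins. Only the two RS_ZK_REGION_SPLIT* state names come from this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a simplified model of the switch, not AssignmentManager itself. */
final class RitEventSwitchSketch {

  /** Hypothetical subset of event types; UNKNOWN_EVENT exists only for this sketch. */
  enum EventType {
    M_ZK_REGION_OFFLINE,
    RS_ZK_REGION_OPENED,
    RS_ZK_REGION_SPLITTING,
    RS_ZK_REGION_SPLIT,
    UNKNOWN_EVENT
  }

  /** Stand-in for LOG.debug(...) so the behaviour can be observed without a logger. */
  final List<String> debugLog = new ArrayList<String>();

  void process(String region, EventType type) {
    switch (type) {
      case M_ZK_REGION_OFFLINE:
      case RS_ZK_REGION_OPENED:
        // Normal handling of these states is elided in this sketch.
        break;
      case RS_ZK_REGION_SPLITTING:
      case RS_ZK_REGION_SPLIT:
        // Split states are expected on master restart: note them and move on.
        debugLog.add("Not processing region " + region + " in state " + type);
        break;
      default:
        // Anything else is still treated as a programming error.
        throw new IllegalStateException("Received event is not valid: " + type);
    }
  }
}
```

With this layout a split-related znode no longer aborts processing, while a genuinely unexpected state still fails loudly.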
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch, HBASE-5806_trunk_3.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270277#comment-13270277 ] 

Hadoop QA commented on HBASE-5806:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12525962/HBASE-5806_trunk_2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1794//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1794//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1794//console

This message is automatically generated.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13264950#comment-13264950 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------


For #1 above, the RegionServer crashed in SplitTransaction.createDaughters(Server, RegionServerServices) while removing the parent region from the online regions:
{code}
    if (!testing) {
      services.removeFromOnlineRegions(this.parent.getRegionInfo().getEncodedName());
    }
{code}

Wherever the regionserver crashes, the ephemeral node will be deleted and the master will get a nodeDeleted() notification, at which point the region is cleared from RIT.

But the ServerShutdownHandler ran before the nodeDeleted() event for the region node was processed.
You can see that in the logs below:

{noformat}
2012-04-06 14:35:08,841 DEBUG org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Removed test,,1333702991530.cdfa837563e75ac5f4dc128680cc8da8. from list of regions to assign because in RIT; region state: SPLITTING

2012-04-06 14:35:12,981 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Ephemeral node deleted, regionserver crashed?, clearing from RIT; rs=test,,1333702991530.cdfa837563e75ac5f4dc128680cc8da8. state=SPLITTING, ts=1333703059260, server=HOST-10-18-40-25,60020,1333695183392
{noformat}

In this situation the code below returned that region among the regions in transition:

{code}
  List<RegionState> regionsInTransition =
        this.services.getAssignmentManager().
          processServerShutdown(this.serverName);
{code}

and it satisfies !rit.isClosing() && !rit.isPendingClose(), so the region is removed from hris:

{code}
      for (RegionState rit : regionsInTransition) {
        if (!rit.isClosing() && !rit.isPendingClose()) {
          LOG.debug("Removed " + rit.getRegion().getRegionNameAsString() +
          " from list of regions to assign because in RIT; region state: " +
          rit.getState());
          if (hris != null) hris.remove(rit.getRegion());
        }
      }
{code}
The fix in ServerShutdownHandler (SSH) addresses #1.
#2 arose because of HBASE-5615; however, HBASE-5615 was reverted.
#3 occurs when the master restarts after splitting is done and before the CatalogJanitor (CJ) has cleared the region from META. So while rebuilding the user regions we ensure that the offlined parent region is not taken into account again.

#2 and #3 are handled together in this patch, so the fix solves both problems.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272556#comment-13272556 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

Thanks Ted.
@Stack 
Your comments on the patch?

                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch, HBASE-5806_trunk_3.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Assigned] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam reassigned HBASE-5806:
---------------------------------------

    Assignee: Chinna Rao Lalam
    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269742#comment-13269742 ] 

Zhihong Yu commented on HBASE-5806:
-----------------------------------

Overall, patch looks good.
{code}
       default:
-        throw new IllegalStateException("Received event is not valid.");
+        break;
{code}
Should the event be logged in the default case?
{code}
+          // If Znode not exists dont consider this region
+          if (data == null) {
{code}
'not exists' -> 'does not exist'
{code}
+  public static class MockedMaster extends HMaster {
{code}
I think the class should be private. A better name would be MockMasterWithoutCatalogJanitor.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_trunk.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270542#comment-13270542 ] 

stack commented on HBASE-5806:
------------------------------

On the IllegalStateException, should we make a case that catches the RS_ZK_SPLIT* conditions instead of changing the default so that ANY unexpected condition results in a warn log rather than an IllegalStateException?

I like your change to make it so master can be mocked.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269812#comment-13269812 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

Thanks for the review Ted. Updated the patch with the review comments.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270242#comment-13270242 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

Thanks for the review Stack. Patch updated with the debug message.

{code}
-    JVMClusterUtil.MasterThread mt =
-      JVMClusterUtil.createMasterThread(c,
-        this.masterClass, index);
+    JVMClusterUtil.MasterThread mt = JVMClusterUtil.createMasterThread(c,
+        (Class<? extends HMaster>) c.getClass(HConstants.MASTER_IMPL, HMaster.class), index);
     this.masterThreads.add(mt);
{code}

Added the above change so that the MockMaster class is used when restarting the master. Without this change, a master class set through the HConstants.MASTER_IMPL property is not picked up, because masterClass is already loaded.
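
To illustrate why the lookup has to happen at restart time, here is a small self-contained sketch. Everything in it is hypothetical: the Map-based configuration, the key name "example.master.impl", and the BaseMaster/MockMaster classes stand in for the HBase Configuration, HConstants.MASTER_IMPL, and HMaster, and none of this is the real LocalHBaseCluster code.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: contrasts a class cached at construction with a fresh lookup. */
final class MasterClassLookupSketch {

  /** Hypothetical configuration key, standing in for HConstants.MASTER_IMPL. */
  static final String MASTER_IMPL_KEY = "example.master.impl";

  /** Hypothetical stand-ins for HMaster and a test-only subclass. */
  static class BaseMaster {}
  static class MockMaster extends BaseMaster {}

  private final Map<String, String> conf;
  private final Class<? extends BaseMaster> cachedClass;

  MasterClassLookupSketch(Map<String, String> conf) {
    this.conf = conf;
    // Resolved once here; later changes to conf are not reflected in this field.
    this.cachedClass = lookup();
  }

  /** Class captured when the cluster object was created. */
  Class<? extends BaseMaster> cached() {
    return cachedClass;
  }

  /** Re-reads the configuration, so a class set after construction is honoured. */
  Class<? extends BaseMaster> lookup() {
    String name = conf.get(MASTER_IMPL_KEY);
    if (name == null) {
      return BaseMaster.class;
    }
    try {
      return Class.forName(name).asSubclass(BaseMaster.class);
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException("Unknown master class: " + name, e);
    }
  }
}
```

If a test sets the key after the cluster object exists, cached() still returns the base class while lookup() returns the mock, which is the difference the change above relies on.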


{code}
default:
-        throw new IllegalStateException("Received event is not valid.");
+        break;
       }
{code}

While the master is restarting, regions in RS_ZK_REGION_SPLIT or RS_ZK_REGION_SPLITTING state should not be handled in processRIT, but here an exception was thrown for them, which caused the failure. No exception should be thrown here.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HBASE-5806:
------------------------------------

    Attachment: HBASE-5806.patch
    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269707#comment-13269707 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

Updated the patch with test cases. In the trunk patch, added this change in AssignmentManager.java to address the HBASE-5654 issue:
{code}
       default:
-        throw new IllegalStateException("Received event is not valid.");
+        break;
       }
{code}
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_trunk.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269850#comment-13269850 ] 

Hadoop QA commented on HBASE-5806:
----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12525884/HBASE-5806_trunk_1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1790//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1790//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1790//console

This message is automatically generated.
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261730#comment-13261730 ] 

Chinna Rao Lalam commented on HBASE-5806:
-----------------------------------------

The following scenarios are considered here:

1. The regionserver is restarted after removing the parent region from the online regions.
2. The master is restarted while the RS is splitting the region and it is in the flow of tickling from SPLIT-SPLIT or SPLITTING-SPLIT.
3. The master is restarted after splitting is completely done and before the region is deleted from META by the CatalogJanitor.

In the first scenario the problem is that ServerShutdownHandler, while constructing hris, checks whether each region is in RIT and is !isClosing and !isPendingClose, and if so removes it from hris. It then tries to assign only the remaining hris, so no one attempts to assign that region.

{code}
      // Skip regions that were in transition unless CLOSING or PENDING_CLOSE
      for (RegionState rit : regionsInTransition) {
        if (!rit.isClosing() && !rit.isPendingClose() && !rit.isSplitting()) {
          LOG.debug("Removed " + rit.getRegion().getRegionNameAsString() +
          " from list of regions to assign because in RIT; region state: " +
          rit.getState());
          if (hris != null) hris.remove(rit.getRegion());
        }
      }
{code}
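
The effect of the extra isSplitting() check in the snippet above can be tried out with a small stand-alone sketch. The types here (ShutdownFilterSketch, its State enum, and the Rit holder) are hypothetical simplifications with region names as plain strings; this is not the ServerShutdownHandler code.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a simplified model of the ServerShutdownHandler filtering step. */
final class ShutdownFilterSketch {

  /** Hypothetical subset of region-in-transition states. */
  enum State { OPENING, CLOSING, PENDING_CLOSE, SPLITTING }

  /** Hypothetical region-in-transition entry: a region name plus its state. */
  static final class Rit {
    final String region;
    final State state;

    Rit(String region, State state) {
      this.region = region;
      this.state = state;
    }
  }

  /** Returns the regions of the dead server that should still be assigned. */
  static List<String> regionsToAssign(List<String> hris, List<Rit> regionsInTransition) {
    List<String> result = new ArrayList<String>(hris);
    for (Rit rit : regionsInTransition) {
      boolean keep = rit.state == State.CLOSING
          || rit.state == State.PENDING_CLOSE
          || rit.state == State.SPLITTING; // the additional condition
      if (!keep) {
        // Some other handler owns this transition, so do not assign the region here.
        result.remove(rit.region);
      }
    }
    return result;
  }
}
```

In this model, a parent region left in SPLITTING by a crashed regionserver stays in the list and gets assigned, instead of being dropped and left unassigned.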


In the second scenario the problem is in AssignmentManager: rebuildUserRegions() should not consider a region whose znode is in the Split or Splitting state, because the region split might be complete or only partially done.

In the third scenario the region split is completely done but the parent has not yet been deleted from META by the CatalogJanitor, so AssignmentManager.rebuildUserRegions() should not consider this region.
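
A rough sketch of the two skip rules for scenarios 2 and 3 follows. It is a simplified model under assumptions, not the patched rebuildUserRegions(): the MetaRow holder, the string-valued znode state map, and the method name regionsToConsider are hypothetical. Only the RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT state names come from this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: a simplified model of the skip rules, not AssignmentManager itself. */
final class RebuildUserRegionsSketch {

  /** Hypothetical view of one META row: region name plus its offline and split flags. */
  static final class MetaRow {
    final String region;
    final boolean offline;
    final boolean split;

    MetaRow(String region, boolean offline, boolean split) {
      this.region = region;
      this.offline = offline;
      this.split = split;
    }
  }

  /**
   * Returns the regions a restarted master would keep tracking.
   * znodeStates maps a region name to the state in its znode (no entry if no znode).
   */
  static List<String> regionsToConsider(List<MetaRow> metaRows, Map<String, String> znodeStates) {
    List<String> result = new ArrayList<String>();
    for (MetaRow row : metaRows) {
      // Scenario 3: parent is split and offlined, but not yet removed from META.
      if (row.offline && row.split) {
        continue;
      }
      // Scenario 2: the znode says a split is in flight or has just finished.
      String state = znodeStates.get(row.region);
      if ("RS_ZK_REGION_SPLITTING".equals(state) || "RS_ZK_REGION_SPLIT".equals(state)) {
        continue;
      }
      result.add(row.region);
    }
    return result;
  }
}
```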
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HBASE-5806:
------------------------------------

    Attachment: HBASE-5806_trunk_2.patch
                HBASE-5806_0.94_2.patch
                HBASE-5806_0.92.patch
    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch
>
>
> This issue is raised to solve issues that comes out of partial region split happened and the region node in the ZK which is in RS_ZK_REGION_SPLITTING and RS_ZK_REGION_SPLIT is not yet processed.
> This also tries to address HBASE-5615.


        

[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279416#comment-13279416 ] 

Hudson commented on HBASE-5806:
-------------------------------

Integrated in HBase-0.94-security #27 (See [https://builds.apache.org/job/HBase-0.94-security/27/])
    HBASE-5806 Handle split region related failures on master restart and RS restart(Chinna rao) (Revision 1338331)

     Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java

                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.92.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_0.94_2.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch, HBASE-5806_trunk_2.patch, HBASE-5806_trunk_3.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, in state RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HBASE-5806:
------------------------------------

    Attachment: HBASE-5806_trunk.patch
                HBASE-5806_0.94.patch
    
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_trunk.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, in state RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269759#comment-13269759 ] 

Hadoop QA commented on HBASE-5806:
----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12525862/HBASE-5806_trunk.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1787//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1787//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1787//console

This message is automatically generated.
                


[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HBASE-5806:
------------------------------------

    Attachment: HBASE-5806_trunk_3.patch
    


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278526#comment-13278526 ] 

Hudson commented on HBASE-5806:
-------------------------------

Integrated in HBase-0.92-security #107 (See [https://builds.apache.org/job/HBase-0.92-security/107/])
    HBASE-5806 Handle split region related failures on master restart and RS restart(Chinna rao) (Revision 1338335)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransactionOnCluster.java

                


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270020#comment-13270020 ] 

stack commented on HBASE-5806:
------------------------------

@Chinna Very nice work.  Thanks.  I can wrap the long lines on commit.  Here are some questions on the patch, just out of interest:

{code}
-    JVMClusterUtil.MasterThread mt =
-      JVMClusterUtil.createMasterThread(c,
-        this.masterClass, index);
+    JVMClusterUtil.MasterThread mt = JVMClusterUtil.createMasterThread(c,
+        (Class<? extends HMaster>) c.getClass(HConstants.MASTER_IMPL, HMaster.class), index);
     this.masterThreads.add(mt);
{code}

What brought on the above change?  Was this needed so you could add your mocking tests or is it that you fellas are doing a master subclass?
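
For readers following the diff above: the new code appears to resolve the master class from configuration when the thread is created, falling back to HMaster when nothing is set. A minimal sketch of that lookup pattern is below. It is not HBase code; it assumes java.util.Properties in place of the Hadoop Configuration, and BaseMaster, MockedMaster, and the example.master.impl key are hypothetical stand-ins for HMaster, a test subclass, and HConstants.MASTER_IMPL.

```java
import java.util.Properties;

public class MasterClassLookup {
    /** Stand-in for HMaster; not the real HBase class. */
    public static class BaseMaster {}

    /** Stand-in for a test subclass that overrides master behaviour. */
    public static class MockedMaster extends BaseMaster {}

    // Hypothetical key, standing in for HConstants.MASTER_IMPL.
    public static final String MASTER_IMPL_KEY = "example.master.impl";

    /** Returns the configured master class, or BaseMaster when the key is unset. */
    public static Class<? extends BaseMaster> resolve(Properties conf) {
        String name = conf.getProperty(MASTER_IMPL_KEY);
        if (name == null) {
            return BaseMaster.class;
        }
        try {
            // asSubclass rejects classes that do not extend BaseMaster.
            return Class.forName(name).asSubclass(BaseMaster.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown master class: " + name, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(resolve(conf).getSimpleName()); // prints "BaseMaster"

        conf.setProperty(MASTER_IMPL_KEY, MockedMaster.class.getName());
        System.out.println(resolve(conf).getSimpleName()); // prints "MockedMaster"
    }
}
```

Reading the class at creation time rather than caching it in a field means a test can change the setting after the cluster object is built and still have its subclass picked up.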

I don't understand why we need this change (because of hbase-5654 ?):

{code}
default:
-        throw new IllegalStateException("Received event is not valid.");
+        break;
       }
{code}

Log that we are going to pass on a region here?

{code}
+          // If znode does not exist dont consider this region
+          if (data == null) {
+            continue;
+          }
{code}

At debug level?
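
As an illustration of the suggestion above, here is a minimal sketch of skipping a region whose znode data is missing while leaving a debug-level trace. It is not the patch itself: the map of region names to bytes is a hypothetical stand-in for the ZK lookup, and java.util.logging at Level.FINE stands in for whatever debug logger the surrounding code uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

public class SkipMissingZnodes {
    private static final Logger LOG = Logger.getLogger(SkipMissingZnodes.class.getName());

    /** Returns the names of regions whose (stand-in) znode data is present. */
    public static List<String> regionsWithData(Map<String, byte[]> znodeData) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : znodeData.entrySet()) {
            // If the znode does not exist, don't consider this region,
            // but record the decision at debug (FINE) level.
            if (entry.getValue() == null) {
                LOG.log(Level.FINE, "No znode data for region {0}; skipping", entry.getKey());
                continue;
            }
            kept.add(entry.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, byte[]> data = new LinkedHashMap<>();
        data.put("regionA", new byte[] {1});
        data.put("regionB", null); // znode missing
        data.put("regionC", new byte[] {2});
        System.out.println(regionsWithData(data)); // prints "[regionA, regionC]"
    }
}
```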
                
> Handle split region related failures on master restart and RS restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5806
>                 URL: https://issues.apache.org/jira/browse/HBASE-5806
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: Chinna Rao Lalam
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-5806.patch, HBASE-5806_0.94.patch, HBASE-5806_0.94_1.patch, HBASE-5806_trunk.patch, HBASE-5806_trunk_1.patch
>
>
> This issue is raised to address problems that arise when a region split has only partially completed and the region node in ZK, in state RS_ZK_REGION_SPLITTING or RS_ZK_REGION_SPLIT, has not yet been processed.
> This also tries to address HBASE-5615.


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271654#comment-13271654 ] 

Hadoop QA commented on HBASE-5806:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526183/HBASE-5806_trunk_3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestStore
        org.apache.hadoop.hbase.TestDrainingServer
        org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1818//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1818//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1818//console

This message is automatically generated.
                


[jira] [Updated] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5806:
------------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
    


[jira] [Commented] (HBASE-5806) Handle split region related failures on master restart and RS restart

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269746#comment-13269746 ] 

ramkrishna.s.vasudevan commented on HBASE-5806:
-----------------------------------------------

{code}
default:
-        throw new IllegalStateException("Received event is not valid.");
+        break;
{code}
Good that this test case brought it out.
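
The thread does not show which switch statement the diff above belongs to, so the following is only a sketch of the general change: a default branch that tolerates an unexpected event instead of throwing IllegalStateException. The enum is a hypothetical stand-in; only the two split state names are taken from this issue, and OTHER_EVENT is invented for illustration.

```java
public class EventDispatch {
    /** Hypothetical event set; OTHER_EVENT is invented for this sketch. */
    public enum EventType { RS_ZK_REGION_SPLITTING, RS_ZK_REGION_SPLIT, OTHER_EVENT }

    /** Returns a short description of what the handler did with the event. */
    public static String handle(EventType type) {
        switch (type) {
            case RS_ZK_REGION_SPLITTING:
                return "handled splitting";
            case RS_ZK_REGION_SPLIT:
                return "handled split";
            default:
                // Previously this kind of branch threw IllegalStateException;
                // here an unexpected event is simply passed over.
                return "ignored";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(EventType.RS_ZK_REGION_SPLIT)); // prints "handled split"
        System.out.println(handle(EventType.OTHER_EVENT));        // prints "ignored"
    }
}
```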
{code}
+  public static class MockedMaster extends HMaster {
{code}
It has to be public; otherwise LocalHBaseCluster does not allow us to update the HMaster class.
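
One plausible reason for the visibility requirement, sketched below under the assumption that the cluster code instantiates the master reflectively through a public constructor (the thread does not show that code): Class.getConstructor only finds public constructors, and a non-public class gets a non-public default constructor. Base, PublicMaster, and HiddenMaster are hypothetical stand-ins.

```java
import java.lang.reflect.Constructor;

public class PublicClassReflection {
    /** Stand-in for HMaster; not the real HBase class. */
    public static class Base {
        public Base() {}
    }

    /** Public subclass: its implicit default constructor is public too. */
    public static class PublicMaster extends Base {}

    /** Package-private subclass: its implicit default constructor is package-private. */
    static class HiddenMaster extends Base {}

    /** Tries to create an instance through a public no-arg constructor. */
    public static boolean canInstantiate(Class<? extends Base> cls) {
        try {
            // getConstructor() only returns public constructors.
            Constructor<? extends Base> ctor = cls.getConstructor();
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(PublicMaster.class)); // prints "true"
        System.out.println(canInstantiate(HiddenMaster.class)); // prints "false"
    }
}
```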
                
