You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "chunhui shen (JIRA)" <ji...@apache.org> on 2012/06/18 08:11:43 UTC

[jira] [Created] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

chunhui shen created HBASE-6228:
-----------------------------------

             Summary: Fixup daughters twice  cause daughter region assigned twice
                 Key: HBASE-6228
                 URL: https://issues.apache.org/jira/browse/HBASE-6228
             Project: HBase
          Issue Type: Bug
          Components: master
            Reporter: chunhui shen
            Assignee: chunhui shen


First, how fixup daughters twice happen?
1.we will fixupDaughters at the last of HMaster#finishInitialization
2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion

When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.

The detail is as the following:
Suppose region A is a splitted parent region, and its daughter region B is missing

1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.

2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416697#comment-13416697 ] 

Jimmy Xiang commented on HBASE-6228:
------------------------------------

@Chunhui, this is a very rare case. Is it better to synchronize SSH.isDaughterMissing()?  This method is not in critical path, so there should be no performance issue.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396036#comment-13396036 ] 

Hadoop QA commented on HBASE-6228:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532372/HBASE-6228.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2185//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2185//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2185//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2185//console

This message is automatically generated.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401690#comment-13401690 ] 

Jonathan Hsieh commented on HBASE-6228:
---------------------------------------

I'm -0.  (Not going to block if others are ok with it, but am just uncomfortable since there are no tests).

I've only recently started spending time looking at the recovery/bugs/races but my general impression is that I cannot easily tell if this patch (and several similar to it) are just pushing a race from one place to another.  If we add tests then we can detect the fact that this change doesn't re-introduce a previously solved problem.

I haven't thought out the locking idea yet but it seems if we have state races, locks could simplify the reasoning that may eliminate classes of subtle bugs.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6228:
--------------------------------

    Attachment: HBASE-6228v3.patch

Adding test case for this issue in the patch v3
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397365#comment-13397365 ] 

Hadoop QA commented on HBASE-6228:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532668/HBASE-6228v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2200//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2200//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2200//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2200//console

This message is automatically generated.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402091#comment-13402091 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Chunhui
Thanks for the testcase.  I ran the testcase without patch.  It passes many times even with out the fix.
{code}
                RegionTransition rt = RegionTransition.parseFrom(data);
                if (rt.getEventType() == EventHandler.EventType.RS_ZK_REGION_OPENED) {
                  // Make a new region plan
                  master.getAssignmentManager().getRegionPlan(daughterARegionState,
                      daughterARegionState.getServerName(), true);
                  return;
                }
{code}
This part of code may not work right.  Because the node state can be changed before that Thread.sleep is done.
And so if we don't update the region plan we cannot assure that double assignment can be done.  Even i was trying to write a testcase and was caught up in a similar state.  Good on you Chunhui.

                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406753#comment-13406753 ] 

stack commented on HBASE-6228:
------------------------------

For example, on study:

+ RegionState is unreliable figuring state of region in master's memory; you cannot rely on it to answer the bigger question of who a region belongs to: master or regionserver.
+ In assign, we are careful with retries.  We actually need to be more careful especially around things like socket timeout (see Maryann's recent issue).  Bulk assign does none of these checks.  Bulk assign was introduced originally to do assigns on cluster start; if anything failed, contract was we'd just crash out and restart cluster over.  That was how it was originally.  Now bulk assign is used all over -- e.g. in SSH -- in spite of its being loosey-goosey around failures.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397291#comment-13397291 ] 

chunhui shen commented on HBASE-6228:
-------------------------------------

@ram
So this happens when master restart and SSH goes in parallel only?
Yes, as the current logic, we only fixup daughter in the two places.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419691#comment-13419691 ] 

Jimmy Xiang commented on HBASE-6228:
------------------------------------

I'd like to fix this in HBASE-6381 by making sure SSH blocks on AM.processServerShutdown until the master has joined the cluster, and fixed missing daughters.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6228:
--------------------------------

    Attachment: HBASE-6228v2.patch

Reattaching patch v2
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402106#comment-13402106 ] 

chunhui shen commented on HBASE-6228:
-------------------------------------

@ram
Do you run the case in trunk? Maybe case is not good, but I run it and always failed without patch.

bq.update the region plan we cannot assure that double assignment can be done

If we restart master after fixup daughter, master won't contain any plans, so I just simulate it and ensure case could happen.

Another thing, I think I must say:
If we send open region RPC twice for one region, but not assigned twice because we have many protection through version. But we also have risk of data correctness.
Why?:
When region in opening, it will replay log edits and flush; if two regionservers do opening region together, could we ensure data correctness?
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Comment Edited] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406182#comment-13406182 ] 

Jonathan Hsieh edited comment on HBASE-6228 at 7/3/12 11:51 PM:
----------------------------------------------------------------

@Ram, @Chunhui

I think stack may have started looking at some of these things but let me give a list that I've been looking at which I think are related: (there are probably more).

HBASE-6012, HBASE-6060, HBASE-5914, HBASE-5882, HBASE-6147, HBASE-5916, HBASE-5546, HBASE-5816, HBASE-6160, HBASE-5918, HBASE-6016, HBASE-5927.

I've been gathering the list with the intent of seeing if there is a common pattern.  At the highest level, I think that we need somethign to protect meta from changes coming from multiple uncoordinated sources like RS's and HM.   My gut is that the long term solution is something like a region lock (possibly a ZK "regionlock") that is used to isolate and protect hbase metadata modifications.   Something like that would reduce the space of possible problems and hopefully make testing easier along the way.
                
      was (Author: jmhsieh):
    @Ram, @Chunhui

I think stack may have started looking at some of these things but let me give a list that I've been looking at which I think are related: (there are probably more).

HBASE-6012, HBASE-6060, HBASE-5914, HBASE-5882, HBASE-6147, HBASe-5916, HBASE-5546, HBASE-5816, HBASE_6160, HBASE-5918, HBASE-6016, HBASE-5927.

I've been gathering the list with the intent of seeing if there is a common pattern.  At the highest level, I think that we need somethign to protect meta from changes coming from multiple uncoordinated sources like RS's and HM.   My gut is that the long term solution is something like a region lock (possibly a ZK "regionlock") that is used to isolate and protect hbase metadata modifications.   Something like that would reduce the space of possible problems and hopefully make testing easier along the way.
                  
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468204#comment-13468204 ] 

stack commented on HBASE-6228:
------------------------------

[~jxiang] and [~zjushch] What about this issue.  Is it fixed by hbase-6381?
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397286#comment-13397286 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Chunhui
So this happens when master restart and SSH goes in parallel only?
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397301#comment-13397301 ] 

chunhui shen commented on HBASE-6228:
-------------------------------------

bq.So if only SSH is going on and there is no master restart scenario how this patch will work?
In current logic, we will add daughter to META with serverName in the SplitTransaction. And we won't added daughter with serverName==null except fixup daughter.
However, master may die after added daughter but before assign it, and master won't assign it when restart

So I think we should added daughter with parent's serverName.

Thanks ram to find the hole
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403908#comment-13403908 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Chunhui
We are able to reproduce the double assignment scenario between SSH and master restart.
So i think we should have synchronized block or as Jon says we need some lock mechanism but that may be a bigger task.


                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438290#comment-13438290 ] 

Jimmy Xiang commented on HBASE-6228:
------------------------------------

@Chunhui, I posted a patch for HBASE-6381 at RB: https://reviews.apache.org/r/6535/
I could not block SSH till the master fixed missing daughters.  However, I tried to
make sure the master doesn't compete with SSH. Please take a look. Thanks.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6228:
------------------------------------------

    Status: Open  (was: Patch Available)
    
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406182#comment-13406182 ] 

Jonathan Hsieh commented on HBASE-6228:
---------------------------------------

@Ram, @Chunhui

I think stack may have started looking at some of these things but let me give a list that I've been looking at which I think are related: (there are probably more).

HBASE-6012, HBASE-6060, HBASE-5914, HBASE-5882, HBASE-6147, HBASe-5916, HBASE-5546, HBASE-5816, HBASE_6160, HBASE-5918, HBASE-6016, HBASE-5927.

I've been gathering the list with the intent of seeing if there is a common pattern.  At the highest level, I think that we need somethign to protect meta from changes coming from multiple uncoordinated sources like RS's and HM.   My gut is that the long term solution is something like a region lock (possibly a ZK "regionlock") that is used to isolate and protect hbase metadata modifications.   Something like that would reduce the space of possible problems and hopefully make testing easier along the way.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406715#comment-13406715 ] 

stack commented on HBASE-6228:
------------------------------

I've been looking at hbase-6060 as a background task (sorry, its taking me a while Ram and Rajesh to get back to you lot).  When I put together multiple threads (SSH, HMaster joining cluster, single vs bulk assigning/unassign, timeout monitor, zk callbacks etc.) and then try to trace state changes not only across multiple state keepers (RegionState, RegionInTransition, AM#this.regions and AM#this.servers) in the master process but then also too x-process master -> regionserver -> via zk, I want to throw out what we have and start over (smile).  That ain't going to happen though.

Meantime I think we need to identify patterns or practices and broadcast them so all can sign on.  For example, I appreciate stuff like Jimmy's small win simplifying AM breaking out RegionStates into a standalone class apart from AM.  This at least collects a bunch of in-memory state in the one place.

We also need to have more tests I'd say so we can have some confidence stuff still works when we shift things around.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402145#comment-13402145 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Jon 
I have added a testcase to show the problem.
bq.but my general impression is that I cannot easily tell if this patch (and several similar to it) are just pushing a race from one place to another.
Could you let us know what type of issues are like that, if you have some list with you?

@Chunhui
What if the master initialization and SSH both are in fixUpDaughters? Then we are bound to get double assignment right?
The scenario am telling is Master has just completed joinCluster().  By the time a RS went down while splitting .  SSH starts and it comes to processDeadRegion.
Later master also sees there are daughters to fixup.  Now we may end up in same problem? I think here we need some sync mechanism.  What do you feel Chunhui?

                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6228:
--------------------------------

    Attachment: HBASE-6228v2.patch

Updating patch: added daughter with parent region's serverName when fixup daughters
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400251#comment-13400251 ] 

chunhui shen commented on HBASE-6228:
-------------------------------------

@ram
bq.Now when trying to fix daughter it will find the the parent server name associated with it and it will not assign it right. 
I think it won't happen.

If master goes down before the assignment is done and after fixup daughters,  when master restart, it will assign the daughter through rebuildUserRegions because the added parent server name is a dead server.
 
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6228:
------------------------------------------

    Status: Patch Available  (was: Open)

Submitting to HadoopQA.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-6228:
-------------------------------

    Resolution: Implemented
        Status: Resolved  (was: Patch Available)

Yes, I think the issue is not there any more.

Since we are using SSH to handle dead servers in failover mode, this piece of code in HMaster to fixup daughter is not needed any more.  I will remove it in HBASE-6611.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6228:
--------------------------------

    Attachment: HBASE-6228.patch
    
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406273#comment-13406273 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Jon
Thanks for sharing the list.  
bq.(possibly a ZK "regionlock")
Yes Jon. I agree.  Still we keep getting issues in AM, SSH, Master restart etc. 
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402094#comment-13402094 ] 

Hadoop QA commented on HBASE-6228:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533624/HBASE-6228v3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.replication.TestReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2272//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2272//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2272//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2272//console

This message is automatically generated.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6228:
----------------------------------

    Fix Version/s: 0.96.0
     Hadoop Flags: Reviewed

Patch looks good.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401326#comment-13401326 ] 

Jonathan Hsieh commented on HBASE-6228:
---------------------------------------

Is there anyway we can add tests to these subtle recovery fixes?  

Part of me says we should just take a something like lock on the region (in zk, possibly moving it into RIT) before we start fixing them up like this to make this obvious and to eliminate these classes of races.  
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Comment Edited] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468913#comment-13468913 ] 

Jimmy Xiang edited comment on HBASE-6228 at 10/4/12 9:34 AM:
-------------------------------------------------------------

Yes, I think the issue is not there any more.

                
      was (Author: jxiang):
    Yes, I think the issue is not there any more.

Since we are using SSH to handle dead servers in failover mode, this piece of code in HMaster to fixup daughter is not needed any more.  I will remove it in HBASE-6611.
                  
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397297#comment-13397297 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

So as i see the code if in the normal case where SSH tries to fix up, if SSH sees there is no servername i.e the servername is null, it will try to fixup daughter and assign it.
As i see the patch now we are trying to remove code where we get ServerName as null.  So if only SSH is going on and there is no master restart scenario how this patch will work?  Sorry if my question is dumb?
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401539#comment-13401539 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Jon
So you not ok with this fix Jon? 
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401243#comment-13401243 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

+1 Chunhui.  Your explanation is right.  Sorry for making noise here. :)
Thanks.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402167#comment-13402167 ] 

Hadoop QA commented on HBASE-6228:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12533642/HBASE-6228v4.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2273//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2273//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2273//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2273//console

This message is automatically generated.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399370#comment-13399370 ] 

ramkrishna.s.vasudevan commented on HBASE-6228:
-----------------------------------------------

@Chunhui
Going thro the patch again.  
Now you are trying to update the parent region server name into the daughter.  Now assume a region that was split and still not daughters are not added to META.
Now the RS goes down.  SSH will start and find the missing daughter.  Now you try to add the parent server name to the daughter.  Before the assignment is done master goes down.
Now when trying to fix daughter it will find the the parent server name associated with it and it will not assign it right.  
So i think here what we should try is to avoid two assignments.
What do you feel Chunhui?
In the previous patch because the check if the server was null is commented the problem is almost same but in a different way. Correct me if am wrong Chunhui.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Zhihong Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397544#comment-13397544 ] 

Zhihong Ted Yu commented on HBASE-6228:
---------------------------------------

+1 on patch v2.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6228:
------------------------------------------

    Status: Patch Available  (was: Open)
    
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402168#comment-13402168 ] 

chunhui shen commented on HBASE-6228:
-------------------------------------

@ram
Could you talk about sync mechanism?
Based on the current patch, I think we won't assign twice. If need, we could add a synchronized block for the following:
{code}isDaughterMissing(catalogTracker, daughter)) {
LOG.info("Fixup; missing daughter " + daughter.getRegionNameAsString());
MetaEditor.addDaughter(catalogTracker, daughter, null);
{code}

What do you think?
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397339#comment-13397339 ] 

Hadoop QA commented on HBASE-6228:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12532666/HBASE-6228v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2199//console

This message is automatically generated.
                
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-6228) Fixup daughters twice cause daughter region assigned twice

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6228:
------------------------------------------

    Attachment: HBASE-6228v4.patch
    
> Fixup daughters twice  cause daughter region assigned twice
> -----------------------------------------------------------
>
>                 Key: HBASE-6228
>                 URL: https://issues.apache.org/jira/browse/HBASE-6228
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6228.patch, HBASE-6228v2.patch, HBASE-6228v2.patch, HBASE-6228v3.patch, HBASE-6228v4.patch
>
>
> First, how fixup daughters twice happen?
> 1.we will fixupDaughters at the last of HMaster#finishInitialization
> 2.ServerShutdownHandler will fixupDaughters when reassigning region through ServerShutdownHandler#processDeadRegion
> When fixupDaughters, we will added daughters to .META., but it coudn't prevent the above case, because FindDaughterVisitor.
> The detail is as the following:
> Suppose region A is a splitted parent region, and its daughter region B is missing
> 1.First, ServerShutdownHander thread fixup daughter, so add daughter region B to .META. with serverName=null, and assign the daughter.
> 2.Then, Master's initialization thread will also find the daughter region B is missing and assign it. It is because FindDaughterVisitor consider daughter is missing if its serverName=null

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira