Posted to issues@hbase.apache.org by "chunhui shen (Created) (JIRA)" <ji...@apache.org> on 2011/11/30 06:59:39 UTC

[jira] [Created] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Region would be assigned twice easily with continually  killing server and moving region in testing environment
---------------------------------------------------------------------------------------------------------------

                 Key: HBASE-4899
                 URL: https://issues.apache.org/jira/browse/HBASE-4899
             Project: HBase
          Issue Type: Bug
            Reporter: chunhui shen


Before assigning regions in ServerShutdownHandler#process, the master checks whether each region is in transition (RIT).
However, this check does not work as expected in the following case:
1. Move region A from server B to server C
2. Kill server B
3. Start server B immediately

Let's see what happens in the code for the above case:
{code}
for step 1:
1.1 server B closes region A,
1.2 master calls setOffline for region A (AssignmentManager#setOffline: this.regions.remove(regionInfo)),
1.3 server C starts to open region A (not completed).
for step 3:
master runs ServerShutdownHandler#process() for server B
{
..
splitlog()
...
List<RegionState> regionsInTransition =
        this.services.getAssignmentManager()
        .processServerShutdown(this.serverName);
...
Skip regions that were in transition unless CLOSING or PENDING_CLOSE
...
assign region
}

In fact, when ServerShutdownHandler#process() calls this.services.getAssignmentManager().processServerShutdown(this.serverName), region A is in RIT (step 1.3 has not completed), but the returned List<RegionState> regionsInTransition does not contain it, because region A has already been removed from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
Therefore, region A will be assigned twice.
{code}
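
The sequence above can be reproduced with a toy model. This is only a sketch under stated assumptions: the class DoubleAssignModel, its two maps, and regionsToAssign are hypothetical stand-ins, not the real AssignmentManager or ServerShutdownHandler code, and the CLOSING/PENDING_CLOSE exception is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the master's bookkeeping. Hypothetical names; not HBase code.
class DoubleAssignModel {
    // region -> hosting server; stands in for AssignmentManager.regions
    final Map<String, String> regions = new HashMap<>();
    // region -> transition state; stands in for the RIT map
    final Map<String, String> regionsInTransition = new HashMap<>();

    // Step 1.2: the region leaves the assignment map and enters RIT.
    void setOffline(String region) {
        regions.remove(region);
        regionsInTransition.put(region, "OFFLINE");
    }

    // Step 1.3: another server starts opening the region (not finished).
    void startOpen(String region) {
        regionsInTransition.put(region, "OPENING");
    }

    // Reports RIT entries only for regions still mapped to the dead server,
    // so a region already removed by setOffline is never reported.
    List<String> processServerShutdown(String deadServer) {
        List<String> rits = new ArrayList<>();
        for (Map.Entry<String, String> e : regions.entrySet()) {
            boolean onDeadServer = e.getValue().equals(deadServer);
            if (onDeadServer && regionsInTransition.containsKey(e.getKey())) {
                rits.add(e.getKey());
            }
        }
        return rits;
    }

    // Everything meta lists for the dead server, minus the reported RIT regions.
    static List<String> regionsToAssign(List<String> fromMeta, List<String> rits) {
        List<String> out = new ArrayList<>(fromMeta);
        out.removeAll(rits);
        return out;
    }

    public static void main(String[] args) {
        DoubleAssignModel am = new DoubleAssignModel();
        am.regions.put("A", "B");   // region A starts on server B
        am.setOffline("A");         // step 1.2
        am.startOpen("A");          // step 1.3, still in progress on server C
        List<String> rits = am.processServerShutdown("B");
        // Assumption: meta still lists A under B because the open on C is unfinished.
        List<String> toAssign = regionsToAssign(Arrays.asList("A"), rits);
        System.out.println("reported RIT: " + rits + ", to assign: " + toAssign);
    }
}
```

In this model region A is in transition, yet it is missing from the reported list and therefore ends up among the regions to assign, which is the double assignment described in this report.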

Killing and restarting one server twice can also easily cause a region to be assigned twice.
Apart from the cause above, there is another possibility:
when ServerShutdownHandler#process() calls MetaReader.getServerUserRegions, the result includes a region that is in RIT at that moment.
But by the time MetaReader.getServerUserRegions completes, the region has been opened on another server and is no longer in RIT.
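
This second race is a stale-snapshot problem and can be modeled in a few lines. Again a sketch only: StaleSnapshotModel, finishOpen, and passRitOnlyCheck are hypothetical names, and getServerUserRegions here merely imitates a point-in-time read of meta rather than the real MetaReader method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the stale-snapshot race. Hypothetical names; not HBase code.
class StaleSnapshotModel {
    final Map<String, String> meta = new HashMap<>(); // region -> server in meta
    final Set<String> rit = new HashSet<>();          // regions in transition

    // Stands in for MetaReader.getServerUserRegions: a point-in-time copy.
    List<String> getServerUserRegions(String server) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : meta.entrySet()) {
            if (e.getValue().equals(server)) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    // The open completes elsewhere: meta is updated and the region leaves RIT.
    void finishOpen(String region, String server) {
        meta.put(region, server);
        rit.remove(region);
    }

    // A check that consults only RIT, evaluated after the snapshot was taken.
    List<String> passRitOnlyCheck(List<String> snapshot) {
        List<String> out = new ArrayList<>();
        for (String region : snapshot) {
            if (!rit.contains(region)) {
                out.add(region);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        StaleSnapshotModel m = new StaleSnapshotModel();
        m.meta.put("A", "B");   // meta still says region A is on server B
        m.rit.add("A");         // A is being opened on server C
        List<String> snapshot = m.getServerUserRegions("B");
        m.finishOpen("A", "C"); // the open finishes after the snapshot
        System.out.println("would assign: " + m.passRitOnlyCheck(snapshot));
    }
}
```

The snapshot still contains region A, and because A has left RIT by the time the check runs, a RIT-only check lets it through even though it is already open on server C.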

In our testing environment, where balancing, moving, and killing are executed periodically, regions are often assigned twice, which is a nuisance because it affects other test cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159881#comment-13159881 ] 

Hadoop QA commented on HBASE-4899:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505580/hbase-4899.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -162 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.TestFullLogReconstruction
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/406//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/406//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/406//console

This message is automatically generated.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181152#comment-13181152 ] 

ramkrishna.s.vasudevan commented on HBASE-4899:
-----------------------------------------------

Updated: committed only to trunk and 0.92, not to 0.90.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159858#comment-13159858 ] 

Ted Yu commented on HBASE-4899:
-------------------------------

{code}
+                  + " because it has been opened in "
+                  + addressFromAM.getServerName());
{code}
We should use the value of rit (RegionState) in the above log instead of hard coding 'opened'.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>         Attachments: hbase-4899.patch
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162081#comment-13162081 ] 

Hudson commented on HBASE-4899:
-------------------------------

Integrated in HBase-TRUNK-security #19 (See [https://builds.apache.org/job/HBase-TRUNK-security/19/])
    HBASE-4899 Region would be assigned twice easily with continually killing server and moving region in testing environment

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java

                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.1
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4899:
--------------------------------

    Attachment: hbase-4899v2.patch
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch, hbase-4899v2.patch
>

[jira] [Assigned] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Ted Yu (Assigned) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reassigned HBASE-4899:
-----------------------------

    Assignee: chunhui shen
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4899:
-------------------------

    Fix Version/s:     (was: 0.92.1)
                   0.92.0

This will be in 0.92 after all; cutting a new RC.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4899:
-------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.1
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Applied to trunk and 0.92 branch.  I ran all tests, and the first time a TestReplication test failed.  It's failing on trunk and 0.92 at the moment.  The second time I ran it, all tests passed.  Thanks for the patch, Chunhui.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.1
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160437#comment-13160437 ] 

stack commented on HBASE-4899:
------------------------------

Good find.  Here are some comments on the patch.

{code}
+            if(rit!=null&&!rit.isClosing() && !rit.isPendingClose()){
{code}

Our style is to put spaces between operators as in 'if (rit != null &&...' etc.


+              LOG.debug("Skip assgining region "

The above should be assigning.


+                  + rit.getRegion().getRegionNameAsString()
+                  + " because in RIT region state: " + rit.getState());

I think you could just do rit.toString() here?



+              continue;

This is confusing.  Is this code inside a loop, and does this continue take us to the head of the loop again?  It's buried pretty deeply here, so I'd not be surprised if readers missed it.  Shouldn't you rather write the code as if/else?


+            if (addressFromAM != null && !addressFromAM.equals(this.serverName)) {
+              LOG.debug("Skip assgining region "

Again, a misspelling.

I can fix the above small things on commit or maybe you want to have a go at it?

This is a pretty nice find.  Thanks for the patch.
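
For illustration only, the if/else structure suggested above might look roughly like the sketch below. The class `AssignDecisionSketch`, the enum `State`, and the method `shouldAssign` are simplified stand-ins invented for this example; they are not the HBase classes and not the code in the attached patch. Only the two skip conditions are taken from the patch excerpts quoted in this comment, and server names are plain strings here as an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class AssignDecisionSketch {
    /** Hypothetical, simplified stand-in for a region-in-transition state. */
    public enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN, PENDING_CLOSE, CLOSING, CLOSED }

    /**
     * Decide whether the shutdown handler should reassign a region.
     *
     * @param rit           state of the region if it is in transition, otherwise null
     * @param addressFromAM server the assignment manager currently maps the region to, or null
     * @param deadServer    the server whose shutdown is being processed
     */
    public static boolean shouldAssign(State rit, String addressFromAM, String deadServer) {
        if (rit != null && rit != State.CLOSING && rit != State.PENDING_CLOSE) {
            // The region is already in transition (for example, opening elsewhere): skip it.
            return false;
        } else if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
            // The region is already mapped to a different, live server: skip it.
            return false;
        } else {
            // Neither skip condition applies, so the region needs to be assigned.
            return true;
        }
    }

    public static void main(String[] args) {
        // Each row: region name, RIT state (or null), server recorded by the assignment manager (or null).
        List<Object[]> regions = Arrays.asList(
            new Object[] {"regionA", State.OPENING, null},
            new Object[] {"regionB", State.CLOSING, null},
            new Object[] {"regionC", null, "serverC"},
            new Object[] {"regionD", null, "serverB"});
        for (Object[] r : regions) {
            boolean assign = shouldAssign((State) r[1], (String) r[2], "serverB");
            System.out.println(r[0] + (assign ? ": assign" : ": skip"));
        }
    }
}
```

Written this way, every branch ends the decision explicitly, so no continue is hidden inside the loop body.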
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch, hbase-4899v2.patch
>
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4899:
-------------------------

             Priority: Critical  (was: Major)
    Affects Version/s:     (was: 0.92.0)
                       0.92.1

Upping priority and marking against 0.92.1.  Will pull into 0.92.0 if there is another RC.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>         Attachments: hbase-4899.patch, hbase-4899v2.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160025#comment-13160025 ] 

chunhui shen commented on HBASE-4899:
-------------------------------------

In patch v1, I forgot to add {code}continue;{code}.
The results above are from testing with patch v2.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch, hbase-4899v2.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159859#comment-13159859 ] 

Ted Yu commented on HBASE-4899:
-------------------------------

@Chunhui:
Please let us know the test results from your QA environment.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch
>
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4899:
--------------------------------

    Attachment: hbase-4899v3.patch
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-4899:
------------------------------------------

    Affects Version/s:     (was: 0.92.1)
                       0.92.0
        Fix Version/s:     (was: 0.90.0)
                       0.92.0
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159851#comment-13159851 ] 

chunhui shen commented on HBASE-4899:
-------------------------------------

{code}
Oct 20 12:51:36 dw75.kgb.sqa.cm4 2011-10-20 12:02:27,345 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting c70b69079782d421b34ac5e57ef06a35 on serverName=dw80.kgb.sqa.cm4,60020,1319083018615, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:37 dw75.kgb.sqa.cm4 2011-10-20 12:02:27,472 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a07d62ba400f631c781d7232ee397ed3 on serverName=dw81.kgb.sqa.cm4,60020,1319083018636, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:39 dw75.kgb.sqa.cm4 2011-10-20 12:02:27,756 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 05fb66d83cf2e4294a3cbd3f6757ec32 on serverName=dw90.kgb.sqa.cm4,60020,1319083018625, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:40 dw75.kgb.sqa.cm4 2011-10-20 12:02:27,805 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting e53b5cb8fcaaa178270f2dc876ebdd9a on serverName=dw90.kgb.sqa.cm4,60020,1319083018625, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:43 dw75.kgb.sqa.cm4 2011-10-20 12:02:28,825 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting d9ae4de5ba9045f4b060191222f774fd on serverName=dw81.kgb.sqa.cm4,60020,1319083018636, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:28,889 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 589e9269c31cdcd8aa073518c133f286 on serverName=dw90.kgb.sqa.cm4,60020,1319083018625, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:28,892 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting f35352a47f076c33a0f5ea140f76c943 on serverName=dw90.kgb.sqa.cm4,60020,1319083018625, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:28,964 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 5fecc8e3a17eed75ba7bb2de2bb5c762 on serverName=dw81.kgb.sqa.cm4,60020,1319083018636, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:29,009 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 0cdf089748ecdc79123670bf76dc68d5 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:29,071 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 6245b4e8b567755509ba6351ded55c92 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:29,380 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting fd52c86737233950bf509fb9ecf524a9 on serverName=dw80.kgb.sqa.cm4,60020,1319083018615, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:44 dw75.kgb.sqa.cm4 2011-10-20 12:02:29,461 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a30b7035359910a63d34b99402a9ef35 on serverName=dw81.kgb.sqa.cm4,60020,1319083018636, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:45 dw75.kgb.sqa.cm4 2011-10-20 12:02:29,918 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 363e647e5b9c96c2ad1acc6e44471b74 on serverName=dw80.kgb.sqa.cm4,60020,1319083018615, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:46 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,130 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting e996ca8ca0eede508af4e02964ae3de6 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:46 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,155 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting fb9fc31c5baa604fb76afd9df419ca8b on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:46 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,155 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting fb9fc31c5baa604fb76afd9df419ca8b on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:46 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,320 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting e9cbaa732a070c8947eb0603a84634c5 on serverName=dw80.kgb.sqa.cm4,60020,1319083018615, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:47 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,417 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a68d33b7bbbb4bfecee9679e8af6b991 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:47 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,494 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a708d26df7e2c62d09851eb0053cb567 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:47 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,513 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting b8bc0ba5ef9fd610b38afd76891b5506 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:47 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,835 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting adf8c774655437f14a72552eed6566c8 on serverName=dw81.kgb.sqa.cm4,60020,1319083018636, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 12:51:48 dw75.kgb.sqa.cm4 2011-10-20 12:02:30,984 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 555048e18d23e8d88ce34f3b2798d893 on serverName=dw79.kgb.sqa.cm4,60020,1319083018623, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:26 dw75.kgb.sqa.cm4 2011-10-20 13:12:00,731 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 4ff6a541f09be443db27482ac170077b on serverName=dw83.kgb.sqa.cm4,60020,1319087456454, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:26 dw75.kgb.sqa.cm4 2011-10-20 13:12:00,757 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 9203d1b6f17eaec9f62d3252cc0a4728 on serverName=dw90.kgb.sqa.cm4,60020,1319087454876, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:28 dw75.kgb.sqa.cm4 2011-10-20 13:12:00,963 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 05bd751ae0894c2618b431ee03e33536 on serverName=dw79.kgb.sqa.cm4,60020,1319087456067, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:28 dw75.kgb.sqa.cm4 2011-10-20 13:12:00,989 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 9d619820bc07ffbc6260fe8cac3ac8fb on serverName=dw79.kgb.sqa.cm4,60020,1319087456067, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:28 dw75.kgb.sqa.cm4 2011-10-20 13:12:01,192 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting e53b5cb8fcaaa178270f2dc876ebdd9a on serverName=dw79.kgb.sqa.cm4,60020,1319087456067, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:33 dw75.kgb.sqa.cm4 2011-10-20 13:12:02,401 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting aeb0ed71b9866626a79313d6f43e04d2 on serverName=dw90.kgb.sqa.cm4,60020,1319087454876, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:34 dw75.kgb.sqa.cm4 2011-10-20 13:12:02,487 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting c2e4e403709ab6c41d9a2d9bc74e758e on serverName=dw90.kgb.sqa.cm4,60020,1319087454876, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:34 dw75.kgb.sqa.cm4 2011-10-20 13:12:02,614 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 10fc5766fd94fd1e306c2f1b73e1476f on serverName=dw90.kgb.sqa.cm4,60020,1319087454876, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:34 dw75.kgb.sqa.cm4 2011-10-20 13:12:02,614 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 10fc5766fd94fd1e306c2f1b73e1476f on serverName=dw90.kgb.sqa.cm4,60020,1319087454876, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:37 dw75.kgb.sqa.cm4 2011-10-20 13:12:03,836 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 3e5da14300d6af039221c29564c5f3e3 on serverName=dw81.kgb.sqa.cm4,60020,1319087456329, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:37 dw75.kgb.sqa.cm4 2011-10-20 13:12:03,942 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a68d33b7bbbb4bfecee9679e8af6b991 on serverName=dw81.kgb.sqa.cm4,60020,1319087456329, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:37 dw75.kgb.sqa.cm4 2011-10-20 13:12:04,240 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 2d26053bb10898a5f203ce96e93a35b1 on serverName=dw81.kgb.sqa.cm4,60020,1319087456329, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:38 dw75.kgb.sqa.cm4 2011-10-20 13:12:04,271 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a708d26df7e2c62d09851eb0053cb567 on serverName=dw81.kgb.sqa.cm4,60020,1319087456329, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:38 dw75.kgb.sqa.cm4 2011-10-20 13:12:04,292 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting e996ca8ca0eede508af4e02964ae3de6 on serverName=dw81.kgb.sqa.cm4,60020,1319087456329, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:39 dw75.kgb.sqa.cm4 2011-10-20 13:12:04,463 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 901c30cb291925739451ef6997cd9750 on serverName=dw83.kgb.sqa.cm4,60020,1319087456454, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:39 dw75.kgb.sqa.cm4 2011-10-20 13:12:04,794 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting f39019fda9968274f0d6b44cc9281004 on serverName=dw79.kgb.sqa.cm4,60020,1319087456067, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:40 dw75.kgb.sqa.cm4 2011-10-20 13:12:05,019 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting b140350ff7de2b2d2fcd353ef1b565d2 on serverName=dw90.kgb.sqa.cm4,60020,1319087454876, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:40 dw75.kgb.sqa.cm4 2011-10-20 13:12:05,089 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 47b40d219b1dba4b210b00ac7de13a6c on serverName=dw83.kgb.sqa.cm4,60020,1319087456454, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:40 dw75.kgb.sqa.cm4 2011-10-20 13:12:05,306 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 47893f2ac30964f06f8f65219077b60f on serverName=dw79.kgb.sqa.cm4,60020,1319087456067, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 13:44:40 dw75.kgb.sqa.cm4 2011-10-20 13:12:05,351 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting a38ef110e6ef3a06f1fa1c25fba5be23 on serverName=dw79.kgb.sqa.cm4,60020,1319087456067, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 17:11:17 dw75.kgb.sqa.cm4 2011-10-20 17:11:17,048 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 70236052 on serverName=dw79.kgb.sqa.cm4,60020,1319101778268, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)
Oct 20 21:35:24 dw75.kgb.sqa.cm4 2011-10-20 21:35:24,312 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 88bd3816503151926f30916066750e08 on serverName=dw79.kgb.sqa.cm4,60020,1319114721135, load=(requests=0, regions=0, usedHeap=0, maxHeap=0)

{code}
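
As a rough aid for reading warnings like those above, the sketch below counts how many "Overwriting" warnings mention each region. The class `OverwriteWarningSummary` and the method `countByRegion` are hypothetical names made up for this example, and the regular expression assumes only the "Overwriting <region> on serverName=" shape visible in the log lines above; it is not part of HBase.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OverwriteWarningSummary {
    // Captures the region identifier that follows "Overwriting" in the master log.
    private static final Pattern OVERWRITE =
        Pattern.compile("Overwriting (\\S+) on serverName=");

    /** Count "Overwriting" warnings per region, keeping first-seen order. */
    public static Map<String, Integer> countByRegion(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = OVERWRITE.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Two lines in the shape of the log above, plus one line that should be ignored.
        Map<String, Integer> counts = countByRegion(Arrays.asList(
            "WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting "
                + "fb9fc31c5baa604fb76afd9df419ca8b on serverName=dw79.kgb.sqa.cm4,60020,1319083018623",
            "WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting "
                + "fb9fc31c5baa604fb76afd9df419ca8b on serverName=dw79.kgb.sqa.cm4,60020,1319083018623",
            "INFO some unrelated line"));
        counts.forEach((region, n) -> System.out.println(region + " -> " + n));
    }
}
```

Regions with a count above one are the ones worth checking first for double assignment.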
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>
> Before assigning a region, ServerShutdownHandler#process checks whether the region is in RIT (regions in transition).
> However, this check doesn't work as expected in the following case:
> 1. Move region A from server B to server C
> 2. Kill server B
> 3. Start server B immediately
> Let's see what happens in the code for the above case:
> {code}
> for step 1:
> 1.1 server B closes region A,
> 1.2 master calls setOffline for region A (AssignmentManager#setOffline: this.regions.remove(regionInfo)),
> 1.3 server C starts to open region A (not completed).
> for step 3:
> master ServerShutdownHandler#process() for server B
> {
> ...
> splitlog()
> ...
> List<RegionState> regionsInTransition =
>         this.services.getAssignmentManager()
>         .processServerShutdown(this.serverName);
> ...
> Skip regions that were in transition unless CLOSING or PENDING_CLOSE
> ...
> assign region
> }
> {code}
> In fact, when ServerShutdownHandler#process() calls this.services.getAssignmentManager().processServerShutdown(this.serverName), region A is in RIT (step 1.3 has not completed), but the returned List<RegionState> regionsInTransition doesn't contain it, because region A was removed from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
> Therefore, region A will be assigned twice.
> Actually, killing and restarting one server twice will also easily cause a region to be assigned twice.
> Apart from the above, there is another possibility:
> when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, the result includes a region that is in RIT at that moment.
> But by the time MetaReader.getServerUserRegions completes, the region has been opened on another server and is no longer in RIT.
> In our testing environment, where balancing, moving, and killing are executed periodically, regions are often assigned twice, which is a nuisance because it affects other test cases.
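
The first scenario in the description above can be sketched as a small standalone simulation. This is only an illustration under stated assumptions: the class DoubleAssignSketch, its two maps, and the checkRitDirectly guard are hypothetical stand-ins, not HBase's AssignmentManager or the attached patch. It models the report's claim that the shutdown path finds in-transition regions through the regions map, so a region already removed by setOffline is missed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DoubleAssignSketch {
    // Hypothetical stand-ins for the master's bookkeeping (not HBase's real fields).
    final Map<String, String> regions = new HashMap<>();             // region -> hosting server
    final Map<String, String> regionsInTransition = new HashMap<>(); // region -> destination
    final List<String> assignCalls = new ArrayList<>();              // trace of assignments

    // Step 1: start moving a region. Step 1.2 (setOffline) drops it from `regions`;
    // step 1.3 (open on the destination) is still pending, so it stays in RIT.
    void startMove(String region, String to) {
        regions.remove(region);
        regionsInTransition.put(region, to);
        assignCalls.add(region + "->" + to);
    }

    // Models the lookup as the report describes it: only regions still mapped
    // to the dead server are considered, so a removed region is invisible here.
    List<String> ritForServer(String server) {
        List<String> rit = new ArrayList<>();
        for (Map.Entry<String, String> e : regions.entrySet()) {
            if (e.getValue().equals(server) && regionsInTransition.containsKey(e.getKey())) {
                rit.add(e.getKey());
            }
        }
        return rit;
    }

    // Step 3: shutdown handling for the dead server. `metaRegions` stands in for
    // the regions META still lists for it; `fallback` is the new destination.
    void handleShutdown(String dead, List<String> metaRegions, String fallback,
                        boolean checkRitDirectly) {
        List<String> skip = ritForServer(dead);
        for (String region : metaRegions) {
            if (skip.contains(region)) {
                continue; // the existing "skip regions in transition" check
            }
            if (checkRitDirectly && regionsInTransition.containsKey(region)) {
                continue; // hypothetical extra guard, for comparison only
            }
            assignCalls.add(region + "->" + fallback);
        }
    }

    // Replays steps 1 to 3 and returns the assignment trace.
    static List<String> run(boolean checkRitDirectly) {
        DoubleAssignSketch master = new DoubleAssignSketch();
        master.regions.put("A", "B");
        master.startMove("A", "C");
        master.handleShutdown("B", Arrays.asList("A"), "D", checkRitDirectly);
        return master.assignCalls;
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + run(false));
        System.out.println("guarded:   " + run(true));
    }
}
```

In this model the unguarded run records two assignments for region A (A->C and A->D), while the guarded run records only A->C. The guard shows one way the double assignment could be avoided in the sketch; it makes no claim about how the actual patch addresses it.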

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180392#comment-13180392 ] 

ramkrishna.s.vasudevan commented on HBASE-4899:
-----------------------------------------------

@Stack and @Chunhui

Is this defect not present in the 0.90 branch? The fix version says it is in 0.90.
Do we need to add it to 0.90.6?

                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180554#comment-13180554 ] 

stack commented on HBASE-4899:
------------------------------

The fix version may be wrong, Ram. If you look at the source and it's not there, I'd say apply it (and then set the fix version appropriately in here).
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.90.0
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4899:
--------------------------

    Status: Patch Available  (was: Open)
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160018#comment-13160018 ] 

chunhui shen commented on HBASE-4899:
-------------------------------------

{code}
Results :

Tests run: 1176, Failures: 0, Errors: 0, Skipped: 9

[INFO] 
[INFO] --- maven-surefire-plugin:2.11-TRUNK-HBASE-2:test (secondPartTestsExecution) @ hbase ---
[INFO] Tests are skipped.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:43:30.281s
[INFO] Finished at: Wed Nov 30 20:20:20 CST 2011
[INFO] Final Memory: 35M/361M
[INFO] ------------------------------------------------------------------------
{code}
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161514#comment-13161514 ] 

Hudson commented on HBASE-4899:
-------------------------------

Integrated in HBase-TRUNK #2508 (See [https://builds.apache.org/job/HBase-TRUNK/2508/])
    HBASE-4899 Region would be assigned twice easily with continually killing server and moving region in testing environment

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java

                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.1
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4899:
--------------------------------

    Attachment: hbase-4899.patch
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>         Attachments: hbase-4899.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160057#comment-13160057 ] 

Hadoop QA commented on HBASE-4899:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505620/hbase-4899v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -162 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.mapreduce.TestTimeRangeMapRed
                  org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
                  org.apache.hadoop.hbase.TestFullLogReconstruction
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapreduce.TestTableMapReduce

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/407//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/407//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/407//console

This message is automatically generated.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch, hbase-4899v2.patch
>
>

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160611#comment-13160611 ] 

Hadoop QA commented on HBASE-4899:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12505704/hbase-4899v3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated -160 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 71 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.master.TestRollingRestart
                  org.apache.hadoop.hbase.master.TestRestartCluster
                  org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
                  org.apache.hadoop.hbase.master.TestDistributedLogSplitting
                  org.apache.hadoop.hbase.master.TestHMasterRPCException
                  org.apache.hadoop.hbase.mapreduce.TestTimeRangeMapRed
                  org.apache.hadoop.hbase.master.TestMaster
                  org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFilesSplitRecovery
                  org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
                  org.apache.hadoop.hbase.TestDrainingServer
                  org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan
                  org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion
                  org.apache.hadoop.hbase.TestFullLogReconstruction
                  org.apache.hadoop.hbase.avro.TestAvroServer
                  org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles
                  org.apache.hadoop.hbase.master.TestMasterFailover
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.master.TestMasterTransitions
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
                  org.apache.hadoop.hbase.master.TestSplitLogManager
                  org.apache.hadoop.hbase.master.TestOpenedRegionHandler

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/413//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/413//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/413//console

This message is automatically generated.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch
>
>
> Before assigning region in ServerShutdownHandler#process, it will check whether region is in RIT,
> however, this checking doesn't work as the excepted in the following case:
> 1.move region A from server B to server C
> 2.kill server B
> 3.start server B immediately
> Let's see what happen in the code for the above case
> {code}
> for step1:
> 1.1 server B close the region A,
> 1.2 master setOffline for region A,(AssignmentManager#setOffline:this.regions.remove(regionInfo))
> 1.3 server C start to open region A.(Not completed)
> for step3:
> master ServerShutdownHandler#process() for server B
> {
> ..
> splitlog()
> ...
> List<RegionState> regionsInTransition =
>         this.services.getAssignmentManager()
>         .processServerShutdown(this.serverName);
> ...
> Skip regions that were in transition unless CLOSING or PENDING_CLOSE
> ...
> assign region
> }
> {code}
> In fact, when ServerShutdownHandler#process() calls this.services.getAssignmentManager().processServerShutdown(this.serverName), region A is in RIT (step 1.3 has not completed), but the returned List<RegionState> regionsInTransition does not contain it, because region A was already removed from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
> Therefore, region A will be assigned twice.
> In fact, killing and restarting a single server twice can also easily cause a region to be assigned twice.
> Apart from the above cause, there is another possibility:
> when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, the result includes a region that is in RIT at that moment.
> But after MetaReader.getServerUserRegions completes, the region has been opened on another server and is no longer in RIT.
> In our testing environment, where balancing, moving, and killing are executed periodically, regions are often assigned twice, which is troublesome because it affects other test cases.
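
The first scenario in the description above can be replayed with a small toy model. This is only a sketch under stated assumptions, not HBase code: the class RitRaceSketch, its two maps, and the server name "D" are hypothetical, and processServerShutdown here merely imitates the behaviour the report describes (a region already removed from the regions map is not reported as in transition for the dead server).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the master's bookkeeping; hypothetical, not the real AssignmentManager. */
final class RitRaceSketch {
    // region -> hosting server (stands in for AssignmentManager.regions)
    final Map<String, String> regions = new HashMap<>();
    // region -> server it is being opened on (stands in for regions in transition)
    final Map<String, String> regionsInTransition = new HashMap<>();
    // every assignment the master issues, in order
    final List<String> assignments = new ArrayList<>();

    void assign(String region, String server) {
        regionsInTransition.put(region, server);
        assignments.add(region + "->" + server);
    }

    /** Step 1: the region is closed on the source, set offline, then opened on the destination. */
    void startMove(String region, String destination) {
        regions.remove(region);      // step 1.2: setOffline drops the region from the map
        assign(region, destination); // step 1.3: open started, not yet completed
    }

    /** Assumed behaviour: only RIT entries still mapped to the dead server are returned. */
    List<String> processServerShutdown(String deadServer) {
        List<String> rit = new ArrayList<>();
        for (String region : regionsInTransition.keySet()) {
            if (deadServer.equals(regions.get(region))) {
                rit.add(region);
            }
        }
        return rit;
    }

    /** Step 3: reassign the dead server's regions unless they were reported as in transition. */
    void handleShutdown(String deadServer, List<String> regionsFromMeta, String target) {
        List<String> rit = processServerShutdown(deadServer);
        for (String region : regionsFromMeta) {
            if (rit.contains(region)) {
                continue; // skipped because it is known to be in transition
            }
            assign(region, target);
        }
    }

    static List<String> replay() {
        RitRaceSketch master = new RitRaceSketch();
        master.regions.put("A", "B");
        master.startMove("A", "C");
        // The catalog still lists region A under server B when B's shutdown is processed.
        master.handleShutdown("B", Arrays.asList("A"), "D");
        return master.assignments;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [A->C, A->D]
    }
}
```

In this model region A ends up with two assignments, one from the move and one from the shutdown handler, which is the double assignment the report describes.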

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4899:
--------------------------

          Description: 
Before assigning a region, ServerShutdownHandler#process checks whether the region is in transition (RIT).
However, this check does not work as expected in the following case:
1. Move region A from server B to server C.
2. Kill server B.
3. Start server B immediately.

Let's see what happens in the code for the above case:
{code}
For step 1:
1.1 Server B closes region A.
1.2 The master calls setOffline for region A (AssignmentManager#setOffline: this.regions.remove(regionInfo)).
1.3 Server C starts to open region A (not completed).
For step 3:
The master runs ServerShutdownHandler#process() for server B:
{
..
splitlog()
...
List<RegionState> regionsInTransition =
        this.services.getAssignmentManager()
        .processServerShutdown(this.serverName);
...
Skip regions that were in transition unless CLOSING or PENDING_CLOSE
...
assign region
}
{code}
In fact, when ServerShutdownHandler#process() calls this.services.getAssignmentManager().processServerShutdown(this.serverName), region A is in RIT (step 1.3 has not completed), but the returned List<RegionState> regionsInTransition does not contain it, because region A was already removed from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
Therefore, region A will be assigned twice.

In fact, killing and restarting a single server twice can also easily cause a region to be assigned twice.
Apart from the above cause, there is another possibility:
when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, the result includes a region that is in RIT at that moment.
But after MetaReader.getServerUserRegions completes, the region has been opened on another server and is no longer in RIT.

In our testing environment, where balancing, moving, and killing are executed periodically, regions are often assigned twice, which is troublesome because it affects other test cases.

  was:
Before assigning a region, ServerShutdownHandler#process checks whether the region is in transition (RIT).
However, this check does not work as expected in the following case:
1. Move region A from server B to server C.
2. Kill server B.
3. Start server B immediately.

Let's see what happens in the code for the above case:
{code}
For step 1:
1.1 Server B closes region A.
1.2 The master calls setOffline for region A (AssignmentManager#setOffline: this.regions.remove(regionInfo)).
1.3 Server C starts to open region A (not completed).
For step 3:
The master runs ServerShutdownHandler#process() for server B:
{
..
splitlog()
...
List<RegionState> regionsInTransition =
        this.services.getAssignmentManager()
        .processServerShutdown(this.serverName);
...
Skip regions that were in transition unless CLOSING or PENDING_CLOSE
...
assign region
}

In fact, when ServerShutdownHandler#process() calls this.services.getAssignmentManager().processServerShutdown(this.serverName), region A is in RIT (step 1.3 has not completed), but the returned List<RegionState> regionsInTransition does not contain it, because region A was already removed from AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
Therefore, region A will be assigned twice.
{code}

In fact, killing and restarting a single server twice can also easily cause a region to be assigned twice.
Apart from the above cause, there is another possibility:
when ServerShutdownHandler#process() executes MetaReader.getServerUserRegions, the result includes a region that is in RIT at that moment.
But after MetaReader.getServerUserRegions completes, the region has been opened on another server and is no longer in RIT.

In our testing environment, where balancing, moving, and killing are executed periodically, regions are often assigned twice, which is troublesome because it affects other test cases.

    Affects Version/s: 0.92.0
    
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>         Attachments: hbase-4899.patch

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160677#comment-13160677 ] 

chunhui shen commented on HBASE-4899:
-------------------------------------

Test results from my QA environment:
{code}
Results :

Tests run: 1175, Failures: 0, Errors: 0, Skipped: 9

[INFO] 
[INFO] --- maven-surefire-plugin:2.11-TRUNK-HBASE-2:test (secondPartTestsExecution) @ hbase ---
[INFO] Tests are skipped.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:44:10.984s
[INFO] Finished at: Thu Dec 01 14:10:34 CST 2011
[INFO] Final Memory: 35M/380M
[INFO] ------------------------------------------------------------------------
{code}

Please check!
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159921#comment-13159921 ] 

chunhui shen commented on HBASE-4899:
-------------------------------------

@Ted
{code}
+                  + " because it has been opened in "
+                  + addressFromAM.getServerName());
{code}
Regarding the above log message: the region must have been opened, because AssignmentManager#getRegionServerOfRegion(region) returns another server.

Test is running now.
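
For context, the kind of guard that the log line above belongs to can be sketched as follows. This is a minimal sketch under assumptions, not the committed patch: the class ShutdownGuardSketch, the map behind getRegionServerOfRegion, the helper markOpened, and the "Skip assigning region" prefix of the log message are all hypothetical. Only the " because it has been opened in " fragment and the idea of consulting AssignmentManager#getRegionServerOfRegion(region) come from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the check made before reassigning a dead server's regions. */
final class ShutdownGuardSketch {
    // region -> server that currently hosts it, according to the assignment bookkeeping
    private final Map<String, String> regionToServer = new HashMap<>();
    final List<String> log = new ArrayList<>();

    void markOpened(String region, String server) {
        regionToServer.put(region, server);
    }

    /** Plays the role of AssignmentManager#getRegionServerOfRegion; returns null if unknown. */
    String getRegionServerOfRegion(String region) {
        return regionToServer.get(region);
    }

    /** Filters the regions read from the catalog down to those that still need assigning. */
    List<String> regionsToAssign(String deadServer, List<String> regionsFromMeta) {
        List<String> toAssign = new ArrayList<>();
        for (String region : regionsFromMeta) {
            String addressFromAM = getRegionServerOfRegion(region);
            if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
                // The region is already open elsewhere, so assigning it again would double-assign it.
                log.add("Skip assigning region " + region
                        + " because it has been opened in " + addressFromAM);
                continue;
            }
            toAssign.add(region);
        }
        return toAssign;
    }

    static List<String> demo() {
        ShutdownGuardSketch guard = new ShutdownGuardSketch();
        guard.markOpened("A", "C"); // region A finished opening on server C
        // The catalog snapshot for dead server B still lists A, plus an unopened region E.
        return guard.regionsToAssign("B", Arrays.asList("A", "E"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [E]
    }
}
```

The check runs after the region list has been read, so it also covers the second case in the description, where a region finishes opening on another server between the catalog read and the assignment.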
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4899.patch

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161473#comment-13161473 ] 

Hudson commented on HBASE-4899:
-------------------------------

Integrated in HBase-0.92 #168 (See [https://builds.apache.org/job/HBase-0.92/168/])
    HBASE-4899 Region would be assigned twice easily with continually killing server and moving region in testing environment

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java

                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.1
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160630#comment-13160630 ] 

stack commented on HBASE-4899:
------------------------------

+1 on patch.

Running tests locally before commit.  The above failures shouldn't be because of this patch.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162059#comment-13162059 ] 

Hudson commented on HBASE-4899:
-------------------------------

Integrated in HBase-0.92-security #29 (See [https://builds.apache.org/job/HBase-0.92-security/29/])
    HBASE-4899 Region would be assigned twice easily with continually killing server and moving region in testing environment

stack : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java

                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.92.1
>
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch

[jira] [Commented] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160574#comment-13160574 ] 

chunhui shen commented on HBASE-4899:
-------------------------------------

@stack
I have amended it in patch v3.
Thanks.
                
> Region would be assigned twice easily with continually  killing server and moving region in testing environment
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4899
>                 URL: https://issues.apache.org/jira/browse/HBASE-4899
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>         Attachments: hbase-4899.patch, hbase-4899v2.patch, hbase-4899v3.patch