You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "chunhui shen (Created) (JIRA)" <ji...@apache.org> on 2011/11/28 06:18:39 UTC

[jira] [Created] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Region is on service before completing openRegionHanlder, may cause data loss
-----------------------------------------------------------------------------

                 Key: HBASE-4880
                 URL: https://issues.apache.org/jira/browse/HBASE-4880
             Project: HBase
          Issue Type: Bug
            Reporter: chunhui shen


OpenRegionHandler in regionserver is processed as the following steps:
{code}
1.openregion()(Through it, closed = false, closing = false)
2.addToOnlineRegions(region)
3.update .meta. table 
4.update ZK's node state to RS_ZK_REGION_OPEND
{code}
We can find that region is on service before Step 4.
It means client could put data to this region after step 3.
What will happen if step 4 is failed processing?
It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166180#comment-13166180 ] 

Hudson commented on HBASE-4880:
-------------------------------

Integrated in HBase-TRUNK-security #26 (See [https://builds.apache.org/job/HBase-TRUNK-security/26/])
    HBASE-4880 Region is on service before openRegionHandler completes, may cause data loss

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestEndToEndSplitTransaction.java

                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0
>
>         Attachments: 4880-0.92.txt, 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163928#comment-13163928 ] 

stack commented on HBASE-4880:
------------------------------

I took a look at the patch.  Yeah, its a bit messy but nice first cut at the problem.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163333#comment-13163333 ] 

Lars Hofhansl commented on HBASE-4880:
--------------------------------------

I would agree with Ram. Adding the region to onlineRegions, then removing it, just to add it again seems a bit hack'ish. We should avoid adding it in the first place. How much bigger would the patch be?
(Disclaimer: I do not know this part of the code very well).

                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164412#comment-13164412 ] 

ramkrishna.s.vasudevan commented on HBASE-4880:
-----------------------------------------------

The patch looks fine to me.. Checking the test failures. 
@Chenhui
Have you done some testing after this patch?
Nice work
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164973#comment-13164973 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@ram
I think so.
It is not needed in cleanupFailedOpen() for the current openRegionHanlder transaction
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158224#comment-13158224 ] 

ramkrishna.s.vasudevan commented on HBASE-4880:
-----------------------------------------------

If we are going to make this change, why don't we avoid the step of adding the region into online region before step 3.
And add it only at the end?
Let's see what others have to say on this.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164953#comment-13164953 ] 

ramkrishna.s.vasudevan commented on HBASE-4880:
-----------------------------------------------

@Chenuhui
A minor comment.  As we are not adding it into online regions and we are adding it only after the openRegionHandler is processed
The line
{code}
this.rsServices.removeFromOnlineRegions(regionInfo.getEncodedName());
{code}
may not be needed in cleanupFailedOpen(). What do you feel ? 
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164773#comment-13164773 ] 

Hadoop QA commented on HBASE-4880:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506518/4880.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -160 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/466//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/466//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/466//console

This message is automatically generated.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164083#comment-13164083 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@all
what about the patchv2?
I remove {code}
addToOnlineRegions(r);
{code} from HReionserver#postOpenDeployTasks.
It means that step2 and step3 are independent now, 
After calling postOpenDeployTasks, we should add region to onlineregions timely by the transaction
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4880:
--------------------------------

    Attachment: hbase-4880v2.patch
    
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165044#comment-13165044 ] 

stack commented on HBASE-4880:
------------------------------

@Chunhui Nice patch.  If you are going to make a new version to address Ram comment, here are some other comments to consider:

I think your patch can be smaller.  Much of your patch is trying to explain that postopendeploy now no longer onlines a region as part of its opening.  I think you do not have to explain this change in multiple places in the code; just change the code to no longer do it.   For example, this part of diff looks unneeded to me:

{code}
@@ -110,10 +109,14 @@
         tryTransitionToFailedOpen(regionInfo);
         return;
       }
-
       boolean failed = true;
       if (tickleOpening("post_region_open")) {
-        if (updateMeta(region)) failed = false;
+        // Just update .META. table ,but not add it to online regions.
+        // We shouldn't serve this region before master receive region opened
+        // message.(see HBASE-4880)
+        if (updateMeta(region)) {
+          failed = false;
+        }
       }
       if (failed || this.server.isStopped() ||
           this.rsServices.isStopping()) {
{code}

Its just comments but I think that to a reader who comes along later, they won't be able to make much sense of what the comment is about -- not without hbase-4880 (in other words, the comment will have a short shelf-life; it makes sense to the few of us who have been looking at this bit of code but soon we'll all forget that postopendeploy used online the region too.

Similar with this change:

{code}
@@ -212,9 +217,11 @@
   }
 
   /**
-   * Thread to run region post open tasks.  Call {@link #getException()} after
-   * the thread finishes to check for exceptions running
-   * {@link RegionServerServices#postOpenDeployTasks(HRegion, org.apache.hadoop.hbase.catalog.CatalogTracker, boolean)}.
+   * Thread to run region post open tasks, which would update meta, but would
+   * not add region to OnlineRegions . Call {@link #getException()} after the
+   * thread finishes to check for exceptions running
+   * {@link RegionServerServices#postOpenDeployTasks(HRegion, org.apache.hadoop.hbase.catalog.CatalogTracker, boolean)}
+   * .
    */
   static class PostOpenDeployTasksThread extends Thread {
     private Exception exception = null;
{code}

and the change to the class RegionServerServices.

Otherwise, patch is great. +1.  Would be nice to have a test for it but its a bit too ornery of a context to conjure in tests so I'd be fine w/ it going into trunk and 0.92.

                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163263#comment-13163263 ] 

ramkrishna.s.vasudevan commented on HBASE-4880:
-----------------------------------------------

I am ok with the change.  What if we add the region to online region only after we are sure that META update is successful.  
Chenhui, feels the change will be bigger if we do like that.  If others are ok with the change then I am +1 on it.

                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164155#comment-13164155 ] 

Hadoop QA commented on HBASE-4880:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506380/hbase-4880v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -160 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
                  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
                  org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/457//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/457//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/457//console

This message is automatically generated.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161502#comment-13161502 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

How to deal with this issue?

                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165837#comment-13165837 ] 

Ted Yu commented on HBASE-4880:
-------------------------------

+1 on patch v4.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:
------------------------------

    Affects Version/s:     (was: 0.19.2)
                       0.92.0
    
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:
------------------------------

    Affects Version/s: 0.94.0
                       0.19.2
              Summary: Region is on service before openRegionHandler completes, may cause data loss  (was: Region is on service before completing openRegionHanlder, may cause data loss)
    
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.19.2, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165977#comment-13165977 ] 

Hudson commented on HBASE-4880:
-------------------------------

Integrated in HBase-TRUNK #2532 (See [https://builds.apache.org/job/HBase-TRUNK/2532/])
    HBASE-4880 Region is on service before openRegionHandler completes, may cause data loss

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestEndToEndSplitTransaction.java

                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0
>
>         Attachments: 4880-0.92.txt, 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166251#comment-13166251 ] 

Hudson commented on HBASE-4880:
-------------------------------

Integrated in HBase-0.92 #180 (See [https://builds.apache.org/job/HBase-0.92/180/])
    HBASE-4880 Region is on service before openRegionHandler completes, may cause data loss

stack : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestEndToEndSplitTransaction.java

                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0
>
>         Attachments: 4880-0.92.txt, 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164122#comment-13164122 ] 

Zhihong Yu commented on HBASE-4880:
-----------------------------------

Patch v2 is cleaner.

Minor comment:
The change in RegionServerServices.java shouldn't put explanation for parameters on a separate line - parameter name and explanation should be on the same line.

Good job Chunhui.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164149#comment-13164149 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@Ted
Amend the problem of putting explanation for parameters on a separate line in patchv3.
Please check.
Thanks
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165737#comment-13165737 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

Address Ram and Stack comments in patchv4
Thanks.
Please check
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165783#comment-13165783 ] 

ramkrishna.s.vasudevan commented on HBASE-4880:
-----------------------------------------------

+1 on patch. 
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158331#comment-13158331 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

Test results in the Linux OS:
{code}
Results :

Failed tests:   testClosing(org.apache.hadoop.hbase.client.TestHCM)

Tests run: 1174, Failures: 1, Errors: 0, Skipped: 8

[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:01:55.336s
[INFO] Finished at: Mon Nov 28 17:08:43 CST 2011
[INFO] Final Memory: 35M/351M
[INFO] ------------------------------------------------------------------------
{code}
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164133#comment-13164133 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@Ted
I use the code style by the HBASE-3678.
It will put explanation for parameters on a separate line automatically.
Is there any problem of HBASE-3678 ?
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4880:
-------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

Committed branch and trunk.  Nice work Chunhui.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0
>
>         Attachments: 4880-0.92.txt, 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163473#comment-13163473 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

I think we should separate the step2 and step3 in the HRegionserver#postOpenDeployTasks.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166080#comment-13166080 ] 

Hudson commented on HBASE-4880:
-------------------------------

Integrated in HBase-0.92-security #34 (See [https://builds.apache.org/job/HBase-0.92-security/34/])
    HBASE-4880 Region is on service before openRegionHandler completes, may cause data loss

stack : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/TestEndToEndSplitTransaction.java

                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0
>
>         Attachments: 4880-0.92.txt, 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163953#comment-13163953 ] 

stack commented on HBASE-4880:
------------------------------

@Zhihong We need to fix the above split issue if we refactor so we only put region online as last step?
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-4880:
--------------------------

    Status: Patch Available  (was: Open)
    
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158234#comment-13158234 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@ramkrishna
If we avoid the step of adding the region into online region before step 3.
We need modify the HReionserver#postOpenDeployTasks, it seems changing too much.
In my patchv1, remove it from online regions immediately at memory level after updating .META.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4880:
--------------------------------

    Attachment: hbase-4880v4.patch
    
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164777#comment-13164777 ] 

Zhihong Yu commented on HBASE-4880:
-----------------------------------

Latest patch passes all tests.

+1.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162278#comment-13162278 ] 

Ted Yu commented on HBASE-4880:
-------------------------------

I think the patch makes sense.

I ran TestHCM with patch applied. It passed.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158242#comment-13158242 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@Ted
I'm running test for trunk now.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4880:
--------------------------------

    Attachment: hbase-4880v3.patch
    
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163925#comment-13163925 ] 

stack commented on HBASE-4880:
------------------------------

Excellent detective work here lads.

I too would lean toward Rams' original suggestion, that we only online the region after zk update and edit to .meta. has gone in.


                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164169#comment-13164169 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

I think it is not related to this patch.
The same failed tests appear in 
https://builds.apache.org/job/PreCommit-HBASE-Build/456/testReport/
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164104#comment-13164104 ] 

Ted Yu commented on HBASE-4880:
-------------------------------

Did RestAdmin pass for patch v2?
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Ted Yu (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164104#comment-13164104 ] 

Ted Yu edited comment on HBASE-4880 at 12/7/11 3:12 AM:
--------------------------------------------------------

Did TestAdmin pass for patch v2?
                
      was (Author: yuzhihong@gmail.com):
    Did RestAdmin pass for patch v2?
                  
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-4880:
--------------------------------

    Attachment: hbase-4880.patch

Region isn't on service unitl completing openRegionHanlder successfully.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164154#comment-13164154 ] 

Hadoop QA commented on HBASE-4880:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506391/hbase-4880v3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -160 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.regionserver.TestColumnSeeking

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/458//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/458//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/458//console

This message is automatically generated.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:
------------------------------

    Status: Patch Available  (was: Open)

Patch testing again.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158240#comment-13158240 ] 

Ted Yu commented on HBASE-4880:
-------------------------------

@Chunhui:
HadoopQA isn't functional at the moment.

Can you let us know if test suite passes ?
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165779#comment-13165779 ] 

Hadoop QA commented on HBASE-4880:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12506692/hbase-4880v4.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated -160 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 74 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/477//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/477//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/477//console

This message is automatically generated.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:
------------------------------

    Attachment: 4880.txt
    
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163943#comment-13163943 ] 

Zhihong Yu commented on HBASE-4880:
-----------------------------------

TestAdmin#testForceSplit would fail if I comment out the following line in HRegionServer.postOpenDeployTasks():
{code}
-    addToOnlineRegions(r);
+    // addToOnlineRegions(r);
{code}
This is the error:
{code}
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: NotServingRegionException: 1 time, servers with issues: lm-sjn-xx:65442, 
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1641)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
	at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
	at org.apache.hadoop.hbase.client.TestAdmin.splitTest(TestAdmin.java:805)
	at org.apache.hadoop.hbase.client.TestAdmin.testForceSplit(TestAdmin.java:108)
{code}
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Assigned] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "ramkrishna.s.vasudevan (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reassigned HBASE-4880:
---------------------------------------------

    Assignee: chunhui shen
    
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164110#comment-13164110 ] 

chunhui shen commented on HBASE-4880:
-------------------------------------

@Ted
TestAdmin has passed for patch v2, and you could try again.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "stack (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-4880:
-------------------------

    Attachment: 4880-0.92.txt

0.92 patch is a little different; e.g. no test categories in branch.
                
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.0
>
>         Attachments: 4880-0.92.txt, 4880.txt, hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch, hbase-4880v4.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164605#comment-13164605 ] 

Zhihong Yu commented on HBASE-4880:
-----------------------------------

The three test failures would be fixed by addendum to HBASE-4927.
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-4880) Region is on service before completing openRegionHanlder, may cause data loss

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164168#comment-13164168 ] 

Ted Yu commented on HBASE-4880:
-------------------------------

Please check failed tests. 
It seems trunk is broken now. 
                
> Region is on service before completing openRegionHanlder, may cause data loss
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4880) Region is on service before openRegionHandler completes, may cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-4880:
------------------------------

    Status: Open  (was: Patch Available)
    
> Region is on service before openRegionHandler completes, may cause data loss
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-4880
>                 URL: https://issues.apache.org/jira/browse/HBASE-4880
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: hbase-4880.patch, hbase-4880v2.patch, hbase-4880v3.patch
>
>
> OpenRegionHandler in regionserver is processed as the following steps:
> {code}
> 1.openregion()(Through it, closed = false, closing = false)
> 2.addToOnlineRegions(region)
> 3.update .meta. table 
> 4.update ZK's node state to RS_ZK_REGION_OPEND
> {code}
> We can find that region is on service before Step 4.
> It means client could put data to this region after step 3.
> What will happen if step 4 is failed processing?
> It will execute OpenRegionHandler#cleanupFailedOpen which will do closing region, and master assign this region to another regionserver.
> If closing region is failed, the data which is put between step 3 and step 4 may loss, because the region has been opend on another regionserver and be put new data. Therefore, it may not be recoverd through replayRecoveredEdit() because the edit's LogSeqId is smaller than current region SeqId.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira