Posted to issues@hbase.apache.org by "chunhui shen (JIRA)" <ji...@apache.org> on 2012/05/21 10:23:41 UTC

[jira] [Created] (HBASE-6059) Replaying recovered edits would make deleted data exist again

chunhui shen created HBASE-6059:
-----------------------------------

             Summary: Replaying recovered edits would make deleted data exist again
                 Key: HBASE-6059
                 URL: https://issues.apache.org/jira/browse/HBASE-6059
             Project: HBase
          Issue Type: Bug
          Components: regionserver
            Reporter: chunhui shen
            Assignee: chunhui shen


When we replay recovered edits, we use the minSeqId of the Stores as the threshold. This can make deleted data appear again.

Here is how it happens. Suppose the region has two families (cf1, cf2):

1. Put one cell into the region (put r1,cf1:q1,v1).

2. Move the region from server A to server B.

3. Delete the data put in step 1 (delete r1).

4. Flush the region.

5. Run a major compaction on the region.

6. Move the region from server B back to server A.

7. Abort server A.

8. After the region is back online, we can read the deleted data (r1,cf1:q1,v1).
(When we replay recovered edits, we use the minSeqId of the Stores. Because cf2 has no store files, its seqId is 0, so the logged edit for the put is replayed into the region.)
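
The reasoning in the last step can be sketched in a few lines of Java. This is only an illustration of the report, not the HRegion code: the class, the method names and the sequence ids (5 for the put, 8 for the flushed cf1 file) are made up, and it assumes that an edit is replayed only when its sequence id is greater than the threshold.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReplayThresholdSketch {

    // Hypothetical helper: the smallest per-store sequence id in the region.
    // A store without store files reports 0, as described in the report.
    static long minSeqId(Map<String, Long> seqIdPerStore) {
        long min = Long.MAX_VALUE;
        for (long id : seqIdPerStore.values()) {
            min = Math.min(min, id);
        }
        return min;
    }

    // Assumption: an edit is replayed only if it is newer than the threshold.
    static boolean isReplayed(long editSeqId, long threshold) {
        return editSeqId > threshold;
    }

    public static void main(String[] args) {
        long putSeqId = 5L; // illustrative id of the put logged on server A

        Map<String, Long> stores = new LinkedHashMap<>();
        stores.put("cf1", 8L); // illustrative: cf1 was flushed after the delete
        stores.put("cf2", 0L); // cf2 has no store files

        long threshold = minSeqId(stores);
        System.out.println("threshold = " + threshold);
        System.out.println("put replayed = " + isReplayed(putSeqId, threshold));

        // If the empty store did not drag the threshold down to 0,
        // the old put would be skipped.
        System.out.println("put replayed with threshold 8 = " + isReplayed(putSeqId, 8L));
    }
}
```

With the empty family in the map the threshold falls to 0, so every edit in the recovered log, including the old put, counts as newer than what is on disk.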




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286024#comment-13286024 ] 

stack commented on HBASE-6059:
------------------------------

+1

Great work @Chunhui.  Nice find.   Nice tests too.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6059:
------------------------------

    Attachment:     (was: 6059v7.txt)
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280295#comment-13280295 ] 

Zhihong Yu commented on HBASE-6059:
-----------------------------------

If majorCompaction is false, we still need to check !kvs.isEmpty(), right?
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287262#comment-13287262 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

+1 for getting it in 0.94.  
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280188#comment-13280188 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

@Chunhui
I tried this:
{code}
      // writer is still null here if nothing was written during the compaction.
      // For a major compaction, create one anyway so that a store file
      // carrying maxId is still produced.
      if (writer == null && majorCompaction) {
        writer = store.createWriterInTmp(maxKeyCount, compactionCompression,
            true);
      }
      if (writer != null) {
        writer.appendMetadata(maxId, majorCompaction);
        writer.close();
      }
{code}
It actually works for me. But I am not sure whether all the test cases pass with it, or whether some other scenario is possible. It was just a wild try and it worked.
Also, I noticed one thing: TestStore.testEmptyStoreFile(). JFYI.
Good on you Chunhui.
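
To make the effect of the snippet above concrete, here is a small stand-alone model in Java. It is a sketch under assumptions, not Store or Compactor code: all names are hypothetical, and it assumes that a store with no file reports sequence id 0 while a store with a file reports the maxId written into that file's metadata.

```java
import java.util.OptionalLong;

public class EmptyStoreFileSketch {

    // Hypothetical model of a compaction's output: the sequence id stored in
    // the output file's metadata, or empty if no output file is written.
    static OptionalLong compactedFileSeqId(int survivingKvs, long maxId,
                                           boolean majorCompaction,
                                           boolean writeEmptyFileOnMajor) {
        // A writer normally exists only if at least one KV survived.
        boolean writerCreated = survivingKvs > 0;
        // The idea being tried: on a major compaction, create the writer anyway.
        if (!writerCreated && majorCompaction && writeEmptyFileOnMajor) {
            writerCreated = true;
        }
        return writerCreated ? OptionalLong.of(maxId) : OptionalLong.empty();
    }

    // Assumption: a store without any file reports sequence id 0.
    static long storeSeqId(OptionalLong fileSeqId) {
        return fileSeqId.orElse(0L);
    }

    public static void main(String[] args) {
        long maxId = 8L; // illustrative max sequence id of the compacted files

        // Put and delete cancel out, so no KV survives the major compaction.
        long before = storeSeqId(compactedFileSeqId(0, maxId, true, false));
        long after = storeSeqId(compactedFileSeqId(0, maxId, true, true));

        System.out.println("store seqId without empty file = " + before);
        System.out.println("store seqId with empty file    = " + after);
    }
}
```

In this model the empty file exists only to keep maxId on disk; a minor compaction that writes nothing still produces no file.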
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284369#comment-13284369 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

@Stack
Could you take a look at this solution and patch?
@Ted
Is this ok to be committed? My concern was with creating an empty store file now. 
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287307#comment-13287307 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

@Chunhui
Thanks a lot. If Lars can't commit it, I can review and commit it to 0.94.
@Lars
Is it fine?
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6059:
------------------------------

    Attachment: 6059v6.txt

Patch v6 modifies the comment in TestStore.java
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6059:
-------------------------

    Status: Patch Available  (was: Open)
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287254#comment-13287254 ] 

Lars Hofhansl commented on HBASE-6059:
--------------------------------------

Should we have this in 0.94.1?
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286328#comment-13286328 ] 

Hudson commented on HBASE-6059:
-------------------------------

Integrated in HBase-TRUNK #2961 (See [https://builds.apache.org/job/HBase-TRUNK/2961/])
    HBASE-6059 Replaying recovered edits would make deleted data exist again (Revision 1344554)

     Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java

                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286230#comment-13286230 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12530290/6059v7.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
     

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2066//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2066//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502568#comment-13502568 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

Discussed with Lars too. We can backport this to 0.94.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, HBASE-6059.patch, HBASE-6059-testcase.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287917#comment-13287917 ] 

Lars Hofhansl commented on HBASE-6059:
--------------------------------------

I don't grok the patch in all detail, but looks good, and same as trunk patch. So +1.
@Stack: Maybe you can have a safety look...?
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280823#comment-13280823 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528568/HBASE-6059v3.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 34 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.replication.TestReplication
                  org.apache.hadoop.hbase.client.TestShell
                  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
                  org.apache.hadoop.hbase.regionserver.TestStore
                  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1958//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1958//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1958//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280825#comment-13280825 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

@Chunhui
Can you run the entire test suite?  HadoopQA will not run all the test cases.  The same applies to HBASE-6065.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286263#comment-13286263 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12530313/6059v7.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2067//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2067//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280739#comment-13280739 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

@ram
Yes, I forgot to consider TTL.
So we should create an empty file if no KVs remain after a minor or major compaction.
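
A minimal sketch of why an empty output file helps, assuming a store derives its seq id from its store files and reports 0 when it has none (as the issue description says for cf2). The class and method names below are hypothetical and are not HBase classes or code from the attached patches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a store's files; not HBase code. */
final class EmptyCompactionSketch {

    /** A store file reduced to the two facts this sketch needs. */
    static final class StoreFile {
        final long maxSeqId;
        final int kvCount;

        StoreFile(long maxSeqId, int kvCount) {
            this.maxSeqId = maxSeqId;
            this.kvCount = kvCount;
        }
    }

    /** Max seq id over all files; 0 when the store has no files at all. */
    static long maxSeqId(List<StoreFile> files) {
        long max = 0L;
        for (StoreFile f : files) {
            max = Math.max(max, f.maxSeqId);
        }
        return max;
    }

    /**
     * Replaces the inputs with a single output file. When nothing survives
     * (deletes or TTL removed every KV), keepEmptyFile decides whether an
     * empty file is still written so that the seq id is not lost.
     */
    static List<StoreFile> compact(List<StoreFile> inputs, int survivingKvs,
                                   boolean keepEmptyFile) {
        if (survivingKvs == 0 && !keepEmptyFile) {
            return Collections.emptyList();
        }
        List<StoreFile> out = new ArrayList<>();
        out.add(new StoreFile(maxSeqId(inputs), survivingKvs));
        return out;
    }

    public static void main(String[] args) {
        List<StoreFile> inputs = new ArrayList<>();
        inputs.add(new StoreFile(1L, 1)); // file holding the put
        inputs.add(new StoreFile(3L, 1)); // file holding the delete marker
        System.out.println(maxSeqId(compact(inputs, 0, false)));
        System.out.println(maxSeqId(compact(inputs, 0, true)));
    }
}
```

Without the empty file, the store falls back to seq id 0 and older edits look unflushed again; with it, the highest seq id of the compacted inputs is preserved.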
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6059:
-------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Applied to trunk.  Thanks for the patch Chunhui and for all who helped get it in (Ram and Ted).
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: HBASE-6059.patch

In the solution patch, I use a Map<byte[], Long> maxSeqIdInStores to save each store's maxSeqId.
So, when replaying edit logs, we skip the edits for each store according to that store's own maxSeqId.
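
The idea can be sketched roughly as follows. This is an illustration only, not the code from the attached patch: the class name, method names, comparator, and the sequence numbers in main are assumptions made for the example. Only the Map<byte[], Long> maxSeqIdInStores field comes from the comment above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of per-store replay filtering; not HBase code. */
final class PerStoreReplaySketch {

    // byte[] uses identity equals/hashCode, so the map is ordered by content
    // instead of hashed. This comparator is an assumption of the sketch.
    private static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    private final Map<byte[], Long> maxSeqIdInStores = new TreeMap<>(BYTES);

    void recordStore(byte[] family, long maxSeqId) {
        maxSeqIdInStores.put(family, maxSeqId);
    }

    /** Per-store rule: skip an edit its own family has already persisted. */
    boolean shouldSkip(byte[] family, long editSeqId) {
        Long max = maxSeqIdInStores.get(family);
        return max != null && editSeqId <= max;
    }

    /** Region-wide rule from the issue: one threshold, the minimum over stores. */
    boolean shouldSkipUsingRegionMin(long editSeqId) {
        if (maxSeqIdInStores.isEmpty()) {
            return false;
        }
        long min = Long.MAX_VALUE;
        for (long v : maxSeqIdInStores.values()) {
            min = Math.min(min, v);
        }
        return editSeqId <= min;
    }

    public static void main(String[] args) {
        byte[] cf1 = "cf1".getBytes(StandardCharsets.UTF_8);
        byte[] cf2 = "cf2".getBytes(StandardCharsets.UTF_8);
        PerStoreReplaySketch region = new PerStoreReplaySketch();
        region.recordStore(cf1, 3L); // assumed: cf1 flushed up to seq id 3
        region.recordStore(cf2, 0L); // cf2 has no store files, so seq id 0
        long putSeqId = 1L;          // assumed seq id of the original put to cf1
        System.out.println("region-min skips put: " + region.shouldSkipUsingRegionMin(putSeqId));
        System.out.println("per-store skips put:  " + region.shouldSkip(cf1, putSeqId));
    }
}
```

With a single region-wide minimum, the empty cf2 drags the threshold down to 0 and the old put to cf1 is replayed; with one threshold per store, cf1's own seq id filters it out.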
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284416#comment-13284416 ] 

Zhihong Yu commented on HBASE-6059:
-----------------------------------

I would like to hear opinions on the current solution from people who are more familiar with store files.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281919#comment-13281919 ] 

Zhihong Yu commented on HBASE-6059:
-----------------------------------

I ran TestSplitLogManager with patch v6 and it passed.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280077#comment-13280077 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

@Chunhui
This is a damn good one.  But I still find one problem with this, similar to the one you reported. Please correct me if I am wrong.
In the same test case, at the place where you delete row 'r1', suppose I also delete row 'r2':
{code}
del = new Delete(Bytes.toBytes("r"));
htable.delete(del);
resultScanner = htable.getScanner(new Scan());
count = 0;
while (resultScanner.next() != null) {
  count++;
}
{code}
Now the seq id from the store files will be 0, because nothing is left after the major compaction. So the same problem still occurs.  I tried to simulate this with the same test case that you added.
Maybe we need some other way to know that the edit has been removed by a major compaction? As I see it, without a major compaction there is no issue at all.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: 6059v7-94.patch

Updating patch for 0.94
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280785#comment-13280785 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528557/HBASE-6059v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 35 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.regionserver.TestColumnSeeking

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1957//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1957//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1957//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6059:
-------------------------

    Attachment: 6059v7.txt
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


[jira] [Comment Edited] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280077#comment-13280077 ] 

ramkrishna.s.vasudevan edited comment on HBASE-6059 at 5/21/12 10:59 AM:
-------------------------------------------------------------------------

@Chunhui
This is a damn good one.  But I still find one problem with this, similar to the one you reported. Please correct me if I am wrong.
In the same test case, at the place where you delete row 'r1', suppose I also delete row 'r2':
{edit}
In the same test case, at the place where you delete row 'r1', suppose I also delete row 'r':
{edit}
{code}
del = new Delete(Bytes.toBytes("r"));
htable.delete(del);
resultScanner = htable.getScanner(new Scan());
count = 0;
while (resultScanner.next() != null) {
  count++;
}
{code}
Now the seq id from the store files will be 0, because nothing is left after the major compaction. So the same problem still occurs.  I tried to simulate this with the same test case that you added.
Maybe we need some other way to know that the edit has been removed by a major compaction? As I see it, without a major compaction there is no issue at all.
                
      was (Author: ram_krish):
    @Chunhui
This is a damn good one.  But I still find one problem with this, similar to the one you reported. Please correct me if I am wrong.
In the same test case, at the place where you delete row 'r1', suppose I also delete row 'r2':
{code}
del = new Delete(Bytes.toBytes("r"));
htable.delete(del);
resultScanner = htable.getScanner(new Scan());
count = 0;
while (resultScanner.next() != null) {
  count++;
}
{code}
Now the seq id from the store files will be 0, because nothing is left after the major compaction. So the same problem still occurs.  I tried to simulate this with the same test case that you added.
Maybe we need some other way to know that the edit has been removed by a major compaction? As I see it, without a major compaction there is no issue at all.
                  
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284975#comment-13284975 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

@Lars
Can you have a look at this?
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280704#comment-13280704 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

bq.If majorCompaction is false, we still need to check !kvs.isEmpty(), right?
Yes. I think this only concerns major compaction; a minor compaction retains the delete markers, so there is no problem there.
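
The difference between the two compaction types can be illustrated with a toy model. Nothing here is taken from HBase: CompactionSketch, Cell, isMasked and compact are hypothetical names, and the model only tracks row-level puts and delete markers.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; CompactionSketch and Cell are hypothetical, not HBase classes.
final class CompactionSketch {

    static final class Cell {
        final String row;
        final boolean isDelete;
        final long seqId;

        Cell(String row, boolean isDelete, long seqId) {
            this.row = row;
            this.isDelete = isDelete;
            this.seqId = seqId;
        }
    }

    // True if a newer delete marker for the same row hides this put.
    static boolean isMasked(Cell put, List<Cell> cells) {
        for (Cell c : cells) {
            if (c.isDelete && c.row.equals(put.row) && c.seqId > put.seqId) {
                return true;
            }
        }
        return false;
    }

    // Masked puts are dropped either way; delete markers survive only a minor compaction.
    static List<Cell> compact(List<Cell> cells, boolean major) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            if (c.isDelete) {
                if (!major) {
                    out.add(c);
                }
            } else if (!isMasked(c, cells)) {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("r1", false, 1L)); // put
        cells.add(new Cell("r1", true, 2L));  // delete of the same row
        System.out.println(compact(cells, false).size()); // minor compaction
        System.out.println(compact(cells, true).size());  // major compaction
    }
}
```

After the minor compaction the output still holds the delete marker, so a store file with a non-zero seq id exists; after the major compaction the output is empty, which is the case the kvs.isEmpty() check has to handle.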

                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287295#comment-13287295 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

OK, I can make it over the weekend.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287288#comment-13287288 ] 

Lars Hofhansl commented on HBASE-6059:
--------------------------------------

Wanna make a patch Ram? I am technically on vacation.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6059:
------------------------------

    Fix Version/s: 0.96.0
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280703#comment-13280703 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

@ram
I think creating an empty store file is feasible and an easy way to solve the problem. Of course, it must pass the test case.

If an empty store file is not feasible, I think we could retain one or more KVs (for example, a delete marker) in the major compaction.
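
A minimal sketch of the first option, under stated assumptions: EmptyStoreFileSketch, StoreFileModel and the three methods below are hypothetical and do not mirror the attached patches. The only idea shown is that a file with zero cells can still carry its max seq id as metadata.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Toy model only; these names are hypothetical and do not mirror the patch.
final class EmptyStoreFileSketch {

    static final class StoreFileModel {
        final List<String> cells;
        final long maxSeqId; // metadata carried by the file itself

        StoreFileModel(List<String> cells, long maxSeqId) {
            this.cells = cells;
            this.maxSeqId = maxSeqId;
        }
    }

    // Old behaviour in this model: no surviving cells means no file is written.
    static Optional<StoreFileModel> compactSkippingEmpty(List<String> survivors, long maxSeqId) {
        if (survivors.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new StoreFileModel(survivors, maxSeqId));
    }

    // Proposed behaviour: always write a file, even with zero cells.
    static Optional<StoreFileModel> compactKeepingEmpty(List<String> survivors, long maxSeqId) {
        return Optional.of(new StoreFileModel(survivors, maxSeqId));
    }

    // Sequence id a store would report on open; 0 when it has no file.
    static long storeSeqId(Optional<StoreFileModel> file) {
        return file.map(f -> f.maxSeqId).orElse(0L);
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        System.out.println(storeSeqId(compactSkippingEmpty(none, 9L)));
        System.out.println(storeSeqId(compactKeepingEmpty(none, 9L)));
    }
}
```

With the empty file kept, the store reports the seq id of the compacted inputs instead of 0, so older recovered edits would be skipped on replay. The second option (retaining a delete marker) reaches the same result by keeping the file non-empty.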


                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281917#comment-13281917 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528774/6059v6.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 34 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.replication.TestReplication
                  org.apache.hadoop.hbase.master.TestSplitLogManager
                  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
                  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1967//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1967//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1967//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280101#comment-13280101 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

@ram
Yes, I have also considered the case where all the entries in the store file are deleted and we don't write any new store file.
But could we generate an empty store file that contains only its metadata? Let me try that first.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: HBASE-6059-testcase.patch

I have written a test case to reproduce the issue.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281396#comment-13281396 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528693/HBASE-6059v4.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 34 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                  org.apache.hadoop.hbase.replication.TestReplication
                  org.apache.hadoop.hbase.client.TestShell
                  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
                  org.apache.hadoop.hbase.replication.TestMasterReplication
                  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1962//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1962//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1962//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281377#comment-13281377 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

@ram
Sorry, we don't have a test environment for trunk.
I think TestStore fails because of the empty store file. Should we modify the test case or the patch?
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: HBASE-6059v4.patch

Patch v4 modifies TestStore#testDeleteExpiredStoreFiles, since we can now create an empty store file.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch

        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6059:
------------------------------

    Attachment: 6059v7.txt
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch

        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286273#comment-13286273 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

I ran the failed test org.apache.hadoop.hbase.master.TestSplitLogManager.testOrphanTaskAcquisition on my local PC, and it passed.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286508#comment-13286508 ] 

Hudson commented on HBASE-6059:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #34 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/34/])
    HBASE-6059 Replaying recovered edits would make deleted data exist again (Revision 1344554)

     Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Compactor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java

                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281379#comment-13281379 ] 

Zhihong Yu commented on HBASE-6059:
-----------------------------------

I think the test case should be modified.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: HBASE-6059v5.patch

Found the bug that caused TestShell to fail.

In Store#rowAtOrBeforeFromStoreFile we now have to account for empty store files; otherwise it throws an NPE.
{code}
Store#rowAtOrBeforeFromStoreFile
private void rowAtOrBeforeFromStoreFile(final StoreFile f,
                                          final GetClosestRowBeforeTracker state)
      throws IOException {
    StoreFile.Reader r = f.getReader();
    if (r == null) {
      LOG.warn("StoreFile " + f + " has a null Reader");
      return;
    }
+    if (r.getEntries() == 0) {
+      LOG.warn("StoreFile " + f + " is a empty store file");
+      return;
+    }
    // TODO: Cache these keys rather than make each time?
    byte [] fk = r.getFirstKey();
{code}

Modified it in patch v5, and TestShell passes now.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


       

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: HBASE-6059v2.patch

Patch v2 fixes the issue for the case where no KVs remain after compaction (by creating an empty store file, as Ram suggested).
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6059:
-------------------------

    Status: Open  (was: Patch Available)
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286154#comment-13286154 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12530272/6059v7.txt
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.master.TestAssignmentManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2063//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2063//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280718#comment-13280718 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

I think only a major compaction could lead us to this problem, since that is what probably deletes the data.
In the case of TTL expiry of all the entries in a store file, could an empty StoreFile also end up being created on a minor or major compaction? I think creating an empty store file should be fine. Let's get others' input on this as well.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6059:
------------------------------

    Attachment: 6059v7.txt

Patch v7 is rebased on trunk.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 6059v6.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281502#comment-13281502 ] 

Hadoop QA commented on HBASE-6059:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528713/HBASE-6059v5.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 9 new or modified tests.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 34 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.replication.TestReplication
                  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
                  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1963//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1963//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1963//console

This message is automatically generated.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280781#comment-13280781 ] 

Zhihong Yu commented on HBASE-6059:
-----------------------------------

Nice work.
Minor comments:
{code}
-    // Get minimum of the maxSeqId across all the store.
+    // Get the maxSeqId for each store.
{code}
The second line above seems redundant; the same sentence appears later.
{code}
-    if (files == null || files.isEmpty()) return seqid;
+    if (files == null || files.isEmpty())
+      return seqid;
{code}
The above change is not necessary.
{code}
+    for (Map.Entry<byte[], Long> maxSeqIdInStore : maxSeqIdInStores.entrySet()) {
+      msg = msg + "; store=" + Bytes.toString(maxSeqIdInStore.getKey())
+          + ",minSequenceid=" + maxSeqIdInStore.getValue();
{code}
Do we really need the above loop? There could be many stores, which would make the log message very long.
{code}
+      if (serverInfo != null && serverInfo.equals(destServer.getServerName()))
+        break;
{code}
The break can be moved onto the same line as the if.
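
For readers following the first and third snippets, here is a minimal sketch of the difference between a single region-wide threshold (the minimum of the per-store max seq ids) and a per-store check. It is a toy model, not HBase code: the class ReplayFilterSketch, the Edit type, and both method names are hypothetical, families are plain strings rather than byte[], and cf1's max seq id of 2 is an assumed value for the scenario in the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of recovered-edit replay. Hypothetical names; this is not HBase code. */
class ReplayFilterSketch {

  /** A hypothetical WAL edit: the column family it touches and its sequence id. */
  static final class Edit {
    final String family;
    final long seqId;

    Edit(String family, long seqId) {
      this.family = family;
      this.seqId = seqId;
    }
  }

  /** Region-wide rule: one threshold, the minimum of the per-store max seq ids.
   *  Assumes the map holds at least one store. */
  static List<Edit> replayWithRegionMinimum(List<Edit> edits, Map<String, Long> maxSeqIdInStores) {
    long threshold = Long.MAX_VALUE;
    for (long storeMax : maxSeqIdInStores.values()) {
      threshold = Math.min(threshold, storeMax);
    }
    List<Edit> replayed = new ArrayList<>();
    for (Edit e : edits) {
      if (e.seqId > threshold) {
        replayed.add(e);
      }
    }
    return replayed;
  }

  /** Per-store rule: an edit is replayed only if it is newer than its own store's max seq id. */
  static List<Edit> replayPerStore(List<Edit> edits, Map<String, Long> maxSeqIdInStores) {
    List<Edit> replayed = new ArrayList<>();
    for (Edit e : edits) {
      Long storeMax = maxSeqIdInStores.get(e.family);
      if (storeMax == null || e.seqId > storeMax) {
        replayed.add(e);
      }
    }
    return replayed;
  }

  public static void main(String[] args) {
    // Assumed state: cf1 has persisted edits up to seq id 2, cf2 has no store files (seq id 0).
    Map<String, Long> maxSeqIdInStores = new HashMap<>();
    maxSeqIdInStores.put("cf1", 2L);
    maxSeqIdInStores.put("cf2", 0L);
    // The old put (r1, cf1:q1, v1) from server A's log, with seq id 1.
    List<Edit> edits = Arrays.asList(new Edit("cf1", 1L));

    System.out.println("region minimum replays: " + replayWithRegionMinimum(edits, maxSeqIdInStores).size());
    System.out.println("per store replays:      " + replayPerStore(edits, maxSeqIdInStores).size());
  }
}
```

With these assumed values the region-wide rule replays the old put, because cf2 drags the threshold down to 0, while the per-store rule skips it.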
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6059:
------------------------------------------

    Status: Patch Available  (was: Open)
    
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280088#comment-13280088 ] 

ramkrishna.s.vasudevan commented on HBASE-6059:
-----------------------------------------------

On major compaction, if all the entries in the store file are deleted, we don't write any new store file, and hence this problem happens. To address this, can we still keep one empty store file that carries only its metadata, at least so that we can tell which seq id was compacted into that store file?
Would this help us overcome the problem mentioned above? Please provide your suggestions.
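
To make the proposal concrete, here is a minimal sketch of why an empty store file with metadata would preserve the seq id. It is a toy model, not HBase code: EmptyStoreFileSketch, FileInfo, compact, the keepEmptyFile flag, and the -1 sentinel for "no store files" are all hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a store's files. Hypothetical names; this is not HBase code. */
class EmptyStoreFileSketch {

  /** Sentinel used by this sketch when a store has no files at all (an assumption). */
  static final long NO_FILES = -1L;

  /** Hypothetical stand-in for a store file: its entry count plus max seq id metadata. */
  static final class FileInfo {
    final int entries;
    final long maxSeqId;

    FileInfo(int entries, long maxSeqId) {
      this.entries = entries;
      this.maxSeqId = maxSeqId;
    }
  }

  /** Max seq id recorded across a store's files, or NO_FILES if there are none. */
  static long maxSeqId(List<FileInfo> files) {
    long max = NO_FILES;
    for (FileInfo f : files) {
      max = Math.max(max, f.maxSeqId);
    }
    return max;
  }

  /**
   * Compacts the inputs into at most one output file.
   * If nothing survives and keepEmptyFile is false, no file is written and the seq id is lost.
   * If keepEmptyFile is true, an empty file still carries the inputs' max seq id.
   */
  static List<FileInfo> compact(List<FileInfo> inputs, int survivingEntries, boolean keepEmptyFile) {
    List<FileInfo> out = new ArrayList<>();
    if (survivingEntries > 0 || keepEmptyFile) {
      out.add(new FileInfo(survivingEntries, maxSeqId(inputs)));
    }
    return out;
  }

  public static void main(String[] args) {
    // One flushed file holding a put (seq id 1) and its delete (seq id 2).
    List<FileInfo> before = Arrays.asList(new FileInfo(2, 2L));

    // Major compaction removes both entries, so zero entries survive.
    System.out.println("without empty file: " + maxSeqId(compact(before, 0, false)));
    System.out.println("with empty file:    " + maxSeqId(compact(before, 0, true)));
  }
}
```

In this model the store forgets that it ever persisted edits up to seq id 2 unless the empty file is kept, which is the information replay needs in order to skip the old put.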
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Updated] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-6059:
--------------------------------

    Attachment: HBASE-6059v3.patch

Patch v3 addresses Ted's comments.

I also ran TestColumnSeeking, and it passed.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)


        

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

Posted by "chunhui shen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287304#comment-13287304 ] 

chunhui shen commented on HBASE-6059:
-------------------------------------

@Lars @ram
I have done it for 0.94.
                
> Replaying recovered edits would make deleted data exist again
> -------------------------------------------------------------
>
>                 Key: HBASE-6059
>                 URL: https://issues.apache.org/jira/browse/HBASE-6059
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch
>
>
> When we replay recovered edits, we used the minSeqId of Store, It may cause deleted data appeared again.
> Let's see how it happens. Suppose the region with two families(cf1,cf2)
> 1.put one data to the region (put r1,cf1:q1,v1)
> 2.move the region from server A to server B.
> 3.delete the data put by step 1(delete r1)
> 4.flush this region.
> 5.make major compaction for this region
> 6.move the region from server B to server A.
> 7.Abort server A
> 8.After the region is online, we could get the deleted data(r1,cf1:q1,v1)
> (When we replay recovered edits, we used the minSeqId of Store, because cf2 has no store files, so its seqId is 0, so the edit log of put data will be replayed to the region)
