Posted to issues@hbase.apache.org by "chunhui shen (Created) (JIRA)" <ji...@apache.org> on 2011/11/24 07:46:40 UTC
[jira] [Created] (HBASE-4862) Split hlog and open region currently
happend may cause data loss
Split hlog and open region currently happend may cause data loss
----------------------------------------------------------------
Key: HBASE-4862
URL: https://issues.apache.org/jira/browse/HBASE-4862
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.2
Reporter: chunhui shen
Case Description:
1. The split-hlog thread creates a writer for the file region A/recovered.edits/123456 and is appending log entries to it.
2. The regionserver is opening region A at the same time, and in replayRecoveredEditsIfAny() it deletes the file region A/recovered.edits/123456.
3. The split-hlog thread catches the IOException and stops parsing this log file.
If skipError = true, the log file is added to the corrupt logs. However, data for other regions in this log file will be lost.
4. If skipError = false, it checks the filesystem. The filesystem is of course fine, so it only prints an error log and continues assigning regions. Therefore, data in other log files will also be lost!
The case may happen as follows:
1. Move a region from server A to server B
2. Kill server A and server B
3. Restart server A and server B
We could prevent this exception by forbidding deletion of a recovered.edits file
that is still being appended to by the split-hlog thread.
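The interleaving in steps 1 to 3 can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyFs is a hypothetical in-memory stand-in for the filesystem (it is not an HDFS or HBase API), and the only behaviour it models is that appending to a path that has been deleted fails with an IOException.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class SplitReplayRaceSketch {
    /** Hypothetical toy filesystem: appending to a deleted path fails. */
    static class ToyFs {
        private final Set<String> files = new HashSet<>();

        synchronized void create(String path) {
            files.add(path);
        }

        synchronized boolean delete(String path) {
            return files.remove(path);
        }

        synchronized void append(String path, String entry) throws IOException {
            if (!files.contains(path)) {
                throw new IOException("File does not exist: " + path);
            }
            // A real writer would persist the entry here; the toy model only checks existence.
        }
    }

    /** Replays the interleaving and reports whether the splitter's append failed. */
    static boolean appendFailsAfterConcurrentDelete() {
        ToyFs fs = new ToyFs();
        String edits = "regionA/recovered.edits/123456";
        fs.create(edits);              // step 1: the splitter creates its writer
        fs.delete(edits);              // step 2: the region open replays and deletes the file
        try {
            fs.append(edits, "entry"); // step 3: the splitter's next append
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("append failed: " + appendFailsAfterConcurrentDelete());
    }
}
```

In the real system the two actors are separate processes, so the ordering is not deterministic as it is here; the sketch only fixes the unlucky ordering to show why the writer sees an exception.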
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ramkrishna.s.vasudevan updated HBASE-4862:
------------------------------------------
Fix Version/s: 0.92.0
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156965#comment-13156965 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
@Ted Yu @Todd Lipcon
It can happen concurrently in the following case:
1. Move a region from server A to server B (for example, during balancing)
2. Kill server A and server B
3. Restart server A and server B immediately
Before we restart server A and server B, log data for this region appears in both servers' log files.
4. After we restart server B, ServerShutdownHandler processes this dead server and assigns this region.
5. At the same time, ServerShutdownHandler processes dead server B and splits server B's hlog.
Because 4 and 5 are concurrent, replayRecoveredEditsIfAny in 4 and the appending of log entries to this region's
recovered.edits file are concurrent. So when the recovered.edits file is deleted by replayRecoveredEdits, an exception is thrown.
The master and region server logs in this case are as follows:
master log:
2011-11-16 11:50:13,037 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while writing log entry to log
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680 File does not exist. [Lease. Holder: DFSClient_hb_m_dw75.kgb.sqa.cm4:60000_1321413286871, pendingcreates: 54]
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1542)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1533)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1449)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:649)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1411)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1409)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
at org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:49)
at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:962)
at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:926)
at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:898)
regionserver log:
2011-11-16 11:49:49,727 ERROR org.apache.hadoop.hbase.regionserver.HRegion: Failed delete of hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680
2011-11-16 11:49:49,732 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Deleted recovered.edits file=hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156800103
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: 4862-v6-90.txt
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.txt, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157385#comment-13157385 ]
Ted Yu commented on HBASE-4862:
-------------------------------
@Todd:
Do you need more details from Chunhui ?
Thanks
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Split hlog and open region
concurrently happend may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156765#comment-13156765 ]
Ted Yu commented on HBASE-4862:
-------------------------------
Nice work.
The patch doesn't apply to 0.90 branch:
{code}
Hunk #4 succeeded at 783 (offset -332 lines).
1 out of 4 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java.rej
...
patch unexpectedly ends in middle of line
2 out of 2 hunks ignored -- saving rejects to file src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java.rej
{code}
Please rebase your patch for 0.90
A separate patch for TRUNK would be helpful for HadoopQA to run test suite.
Comments about the changes:
getTmpRecoveredEditsFileName() is only used once and there is no javadoc for it. Maybe we don't need to create the method, just append ".tmp" directly to the filename.
{code}
+ // Convert file name ends with .tmp, so ensure region's replayRecoveredEdits
{code}
The beginning of the above should read 'Append filename with '.tmp' to ensure'
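As a rough illustration of the ".tmp" idea discussed above, the sketch below writes a file under a temporary name and has the replay side ignore names ending in ".tmp". It is not the attached patch: it uses java.nio.file on a local filesystem rather than the HDFS and HBase APIs, the class and method names (TmpEditsSketch, writeEdits, listReplayable) are hypothetical, and the rename once writing finishes is an assumption that the thread does not spell out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class TmpEditsSketch {
    static final String TMP_SUFFIX = ".tmp";

    /** Splitter side: write under a ".tmp" name, then rename when writing is complete (assumed step). */
    static Path writeEdits(Path editsDir, String name, List<String> entries) throws IOException {
        Files.createDirectories(editsDir);
        Path tmp = editsDir.resolve(name + TMP_SUFFIX);
        Files.write(tmp, entries, StandardCharsets.UTF_8);
        Path done = editsDir.resolve(name);
        return Files.move(tmp, done, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Replay side: list completed files sorted by name, skipping in-progress ".tmp" files. */
    static List<String> listReplayable(Path editsDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(editsDir)) {
            entries.map(p -> p.getFileName().toString())
                   .filter(n -> !n.endsWith(TMP_SUFFIX))
                   .forEach(names::add);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovered.edits");
        // A file that some splitter is still writing: replay must not touch it.
        Files.write(dir.resolve("0000000000000000002" + TMP_SUFFIX), new byte[0]);
        writeEdits(dir, "0000000000000000001", Arrays.asList("edit-1", "edit-2"));
        System.out.println(listReplayable(dir));
    }
}
```

With this arrangement the replay side never sees, and so never deletes, a file that is still being appended to, which is the property the issue description asks for.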
> Split hlog and open region concurrently happend may cause data loss
> -------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v7fortrunk.patch
hbase-4862v7for0.90.patch
Based on patch v6; updated the javadoc of HLog#getSplitEditFilesSorted.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157339#comment-13157339 ]
Ted Yu commented on HBASE-4862:
-------------------------------
I could run the test suite by executing 'mvn test' on a MacBook.
PreCommit builds 371 and 373 didn't run any tests.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v3fortrunk.diff
hbase-4862v3for0.90.diff
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157680#comment-13157680 ]
Ted Yu commented on HBASE-4862:
-------------------------------
{code}
// Skip the test which creates a splitter that reads and writes the
// data without touching disk. testThreading#TestHLogSplit .etc
if (fs.exists(wap.p)) {
{code}
The javadoc should read:
{code}
// Skip the unit tests which create a splitter that reads and writes the
// data without touching disk. TestHLogSplit#testThreading is an example.
{code}
A specific test is written as classname#testname.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Patch Available (was: Open)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158146#comment-13158146 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505285/hbase-4862v7fortrunk.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/388//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/388//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/388//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v1 for 0.90.diff
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159790#comment-13159790 ]
Hudson commented on HBASE-4862:
-------------------------------
Integrated in HBase-0.92 #163 (See [https://builds.apache.org/job/HBase-0.92/163/])
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen) move JIRA to 0.90 section in CHANGES.txt
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen)
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157652#comment-13157652 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
@Ted
I added tests to this patch in patchV5.
On Red Hat Enterprise Linux Server release 5.4 (Tikanga),
the test results are as follows:
For trunk with patchV5:
_
Results :
Failed tests:
  testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer): ReplicationPeer ZooKeeper session was not properly expired.
  testClosing(org.apache.hadoop.hbase.client.TestHCM)
Tests run: 1174, Failures: 2, Errors: 0, Skipped: 8
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:00:49.122s
[INFO] Finished at: Sun Nov 27 02:41:40 CST 2011
[INFO] Final Memory: 35M/361M
[INFO] ------------------------------------------------------------------------
_
For 0.90 with patchV5:
_
Results :
Tests run: 702, Failures: 0, Errors: 0, Skipped: 9
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:15:37.342s
[INFO] Finished at: Sun Nov 27 11:00:07 CST 2011
[INFO] Final Memory: 26M/525M
[INFO] ------------------------------------------------------------------------
_
The two failed tests in trunk are the same as in the last run: one of them (TestReplicationPeer#testResetZooKeeperSession) passes when run separately,
and the other is related to HBASE-4874.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Split hlog and open region
concurrently happend may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Summary: Split hlog and open region concurrently happend may cause data loss (was: Split hlog and open region currently happend may cause data loss)
> Split hlog and open region concurrently happend may cause data loss
> -------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: 4862-0.92.txt
Patch for 0.92 branch.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: (was: 4862-v6-trunk.txt)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158178#comment-13158178 ]
Hudson commented on HBASE-4862:
-------------------------------
Integrated in HBase-TRUNK #2490 (See [https://builds.apache.org/job/HBase-TRUNK/2490/])
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen)
tedyu :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157412#comment-13157412 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505180/hbase-4862v3fortrunk.diff
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/377//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/377//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/377//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Comment: was deleted
(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505283/4862-v6-trunk.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/387//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/387//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/387//console
This message is automatically generated.)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158347#comment-13158347 ]
Hudson commented on HBASE-4862:
-------------------------------
Integrated in HBase-TRUNK #2491 (See [https://builds.apache.org/job/HBase-TRUNK/2491/])
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen) Move JIRA to 0.90 section
tedyu :
Files :
* /hbase/trunk/CHANGES.txt
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ramkrishna.s.vasudevan updated HBASE-4862:
------------------------------------------
Fix Version/s: (was: 0.92.0)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157410#comment-13157410 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
@Ted
I have amended the patch again.
Please check.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157367#comment-13157367 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505172/hbase-4862v1+for+trunk.diff
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/374//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/374//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/374//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158835#comment-13158835 ]
stack commented on HBASE-4862:
------------------------------
This is integrated. Can we close it?
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156967#comment-13156967 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
After a region is successfully moved from server A to server B,
that region's entries in server A's log file are safe because they have already been flushed.
However, if this exception is hit while splitting the hlog, it affects the log data of the other regions in server A's log file.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157407#comment-13157407 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505178/hbase-4862v2fortrunk.diff
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/376//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/376//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/376//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157636#comment-13157636 ]
Jonathan Hsieh commented on HBASE-4862:
---------------------------------------
How feasible is it to add testing to this patch? Maybe simulate the failure situation by aborting RS's and then starting them like in the TestSplitTransactionOnCluster tests?
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157379#comment-13157379 ]
Ted Yu commented on HBASE-4862:
-------------------------------
Chunhui ran the patch through test suite.
The OS is:
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
{code}
Results :
Failed tests: testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer): ReplicationPeer ZooKeeper session
was not properly expired.
testClosing(org.apache.hadoop.hbase.client.TestHCM)
Tests run: 1173, Failures: 2, Errors: 0, Skipped: 8
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:02:44.930s
{code}
testClosing failure is captured in HBASE-4874.
TestReplicationPeer passed when run manually.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "shenchunhui (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157525#comment-13157525 ]
shenchunhui commented on HBASE-4862:
------------------------------------
Ted,
I find that patch v3 causes some tests to fail after changing
fs.rename(wap.p, dst)
to
if (!fs.rename(wap.p, dst)) {
  throw new IOException("Failed renaming " + wap.p + " to " + dst);
}
I will amend it and give you the test results later.
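For readers following along, the rename check discussed above can be sketched in plain Java. This is only an illustration under stated assumptions: it uses java.io.File.renameTo, which also reports failure through a boolean return value, as a stand-in for the fs.rename call in the patch. The class and method names (CheckedRename, renameOrThrow) are hypothetical and are not taken from the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not the code from the HBASE-4862 patch.
final class CheckedRename {

    /**
     * Renames src to dst and turns a silent "false" result into an exception,
     * so the caller cannot mistake a failed rename for a successful one.
     */
    static void renameOrThrow(Path src, Path dst) throws IOException {
        // File.renameTo signals failure only via its return value.
        if (!src.toFile().renameTo(dst.toFile())) {
            throw new IOException("Failed renaming " + src + " to " + dst);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovered-edits-demo");
        Path tmp = Files.createFile(dir.resolve("123456.temp"));
        Path dst = dir.resolve("123456");

        renameOrThrow(tmp, dst);
        System.out.println("renamed: " + Files.exists(dst));

        try {
            // The source no longer exists, so this rename fails and throws.
            renameOrThrow(tmp, dst);
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The point of the check is that an unchecked boolean result hides the failure; once it is checked, callers that previously relied on the silent behaviour start failing, which would be consistent with the test failures reported above.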
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158141#comment-13158141 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505283/4862-v6-trunk.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/387//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/387//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/387//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158177#comment-13158177 ]
Ted Yu commented on HBASE-4862:
-------------------------------
Integrated to 0.90, 0.92 branches and TRUNK.
Thanks for the patch Chunhui.
Thanks for the review Jonathan.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
[jira] [Issue Comment Edited] (HBASE-4862) Splitting hlog and
opening region concurrently may cause data loss
Posted by "Ted Yu (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157678#comment-13157678 ]
Ted Yu edited comment on HBASE-4862 at 11/27/11 6:50 AM:
---------------------------------------------------------
@Jonathan
bq. What happens if the .temp gets left behind without being renamed?
If the .temp file gets left behind, it means the log splitting failed, and the .temp file will be deleted during the next log splitting.
Splitting the same hlog file always creates a file with the same name in the region's recovered.edits directory.
Thanks for your suggestion.
was (Author: zjushch):
@Jonathan
What happens if the .temp gets left behind without being renamed?
If the the .temp gets left ,it means the spliting log is failed, and the .temp file would be deleted in the next spliting log.
You could find that, for the same splitted hlog file, it creates the same name file in the region's recoverd.edits directory
Thanks for your suggestion.
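The .temp handling described in this comment can be sketched as follows. This is a minimal illustration, assuming plain java.nio.file on a local filesystem rather than the Hadoop FileSystem API that HBase uses; the names TempThenRename and writeEdits are hypothetical, and only the ".temp" suffix comes from the discussion.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

// Hypothetical illustration of write-to-temp-then-rename; not the actual patch.
final class TempThenRename {

    static final String TEMP_SUFFIX = ".temp"; // suffix mentioned in the thread

    /**
     * Writes edits to "<name>.temp" and publishes them as "<name>" only after
     * every edit has been written, so a reader never sees a half-written file.
     */
    static Path writeEdits(Path editsDir, String name, List<String> edits) throws IOException {
        Files.createDirectories(editsDir);
        Path tmp = editsDir.resolve(name + TEMP_SUFFIX);
        Path dst = editsDir.resolve(name);

        // A leftover temp file means an earlier split of the same log failed.
        // The name is deterministic, so the stale file is found and discarded here.
        Files.deleteIfExists(tmp);

        Files.write(tmp, edits, StandardCharsets.UTF_8);

        // Make the file visible under its final name in one step.
        Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
        return dst;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("region-A").resolve("recovered.edits");
        Path out = writeEdits(dir, "123456", List.of("edit-1", "edit-2"));
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8));
    }
}
```

Because the temporary name is derived from the final name, a retry of the same split needs no extra bookkeeping to clean up after a failed attempt.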
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156858#comment-13156858 ]
Todd Lipcon commented on HBASE-4862:
------------------------------------
wait, wait -- _why_ is this happening concurrently? A region should never be opened until the split process is done for that region. If this is happening we have a much larger issue, which we shouldn't be working around with tmp file names, etc.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157411#comment-13157411 ]
Ted Yu commented on HBASE-4862:
-------------------------------
+1 on patch v3.
Please run the patch for 0.90 through the test suite and let us know the results.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Summary: Splitting hlog and opening region concurrently may cause data loss (was: Split hlog and open region concurrently happend may cause data loss)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: 4862-v6-trunk.patch
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: 4862.txt
I ran a few tests with the patch for TRUNK and didn't see any failures.
Reattaching the patch for TRUNK.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v2fortrunk.diff
hbase-4862v2for0.90.diff
@Ted
I have amended the patch.
Please check.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: 4862.txt
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Updated] (HBASE-4862) Split hlog and open region currently
happend may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: 4862.patch
Split hlog: add the suffix ".tmp" to a file in the recovered.edits directory when creating it,
and rename it to drop the suffix after it is closed.
replayRecoveredEditsIfAny: skip any file whose name ends with ".tmp".
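A minimal sketch of the scheme described above, for illustration only. It is not the attached patch: it uses java.nio.file on a local filesystem instead of the Hadoop FileSystem API, and the class name RecoveredEditsSketch and the helpers writeEdits and listReplayable are hypothetical. The idea is that a file only appears under its final name once the writer has closed it, so a reader that ignores ".tmp" names never sees (or deletes) a file that is still being appended to.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RecoveredEditsSketch {
    // Assumed suffix, taken from the description above.
    static final String TMP_SUFFIX = ".tmp";

    /** Splitter side: write under a ".tmp" name, then rename once the file is closed. */
    static Path writeEdits(Path editsDir, String name, List<String> edits) throws IOException {
        Files.createDirectories(editsDir);
        Path tmp = editsDir.resolve(name + TMP_SUFFIX);
        // Files.write opens, writes and closes the file before returning.
        Files.write(tmp, edits, StandardCharsets.UTF_8);
        // Publish the finished file under its final name.
        return Files.move(tmp, editsDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    /** Region-open side: return only finished files, skipping anything ending in ".tmp". */
    static List<Path> listReplayable(Path editsDir) throws IOException {
        List<Path> finished = new ArrayList<>();
        if (!Files.isDirectory(editsDir)) {
            return finished;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(editsDir)) {
            for (Path p : stream) {
                if (!p.getFileName().toString().endsWith(TMP_SUFFIX)) {
                    finished.add(p);
                }
            }
        }
        Collections.sort(finished);
        return finished;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("region-A").resolve("recovered.edits");
        Files.createDirectories(dir);
        // Simulate a split that is still in progress for another log file.
        Files.write(dir.resolve("0000000000000000002" + TMP_SUFFIX),
                Collections.singletonList("in progress"), StandardCharsets.UTF_8);
        // A completed split becomes visible under its final name.
        writeEdits(dir, "0000000000000000001", Arrays.asList("edit-1", "edit-2"));
        System.out.println(listReplayable(dir));
    }
}
```

Note that this only hides in-progress files from the replay step. It does not by itself explain why a region was assigned while its logs were still being split.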
> Split hlog and open region currently happend may cause data loss
> ----------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Patch Available (was: Open)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Priority: Critical (was: Major)
Lifting priority as Ramkrishna suggested.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Open (was: Patch Available)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Split hlog and open region
concurrently happend may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Fix Version/s: 0.90.5
0.94.0
0.92.0
> Split hlog and open region concurrently happend may cause data loss
> -------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v1 for trunk.diff
hbase-4862v1 for 0.90.diff
Grant license to ASF for the attached patch
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157250#comment-13157250 ]
Ted Yu commented on HBASE-4862:
-------------------------------
Log snippets from Chunhui.
Region C was 3591e9867a4c125493dc82168854ea0c
File F was 0000000013156791680
Master log:
{code}
2011-11-16 11:47:23,134 INFO org.apache.hadoop.hbase.master.ServerManager:
Triggering server recovery; existingServer serverB,60020,1321415172631 looks stale
2011-11-16 11:47:23,134 DEBUG org.apache.hadoop.hbase.master.ServerManager:
Added=serverB,60020,1321415172631 to dead servers, submitted shutdown handler to be executed, root=false, meta=true
2011-11-16 11:47:29,305 INFO org.apache.hadoop.hbase.master.ServerManager:
Triggering server recovery; existingServer serverA,60020,1321415179549 looks stale
2011-11-16 11:47:29,305 DEBUG org.apache.hadoop.hbase.master.ServerManager:
Added=serverA,60020,1321415179549 to dead servers, submitted shutdown handler to be executed, root=false, meta=false
2011-11-16 11:48:28,700 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
Splitting 28 hlog(s) in hdfs://serverX:9000/hbase-common/.logs/serverB,60020,1321414043798
2011-11-16 11:48:30,657 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
Creating writer path=hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156800103 region=3591e9867a4c125493dc82168854ea0c
2011-11-16 11:49:17,855 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
Closed path hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156800103 (wrote 75875 edits in 3228ms)
2011-11-16 11:49:19,629 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
Splitting 28 hlog(s) in hdfs://serverX:9000/hbase-common/.logs/serverA,60020,1321414056134
2011-11-16 11:49:20,650 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter:
Creating writer path=hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680 region=3591e9867a4c125493dc82168854ea0c
2011-11-16 11:49:36,731 DEBUG org.apache.hadoop.hbase.master.AssignmentManager:
Assigning region writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c. to serverD,60020,1321415224381
2011-11-16 11:49:49,755 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c. on serverD,60020,1321415224381
2011-11-16 11:50:13,030 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680 File does not exist.
2011-11-16 11:50:13,037 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while writing log entry to log
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680 File does not exist.
2011-11-16 11:50:13,051 ERROR org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting hdfs://serverX:9000/hbase-common/.logs/serverA,60020,1321414056134
{code}
Log from region server D:
{code}
2011-11-16 11:49:36,730 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c.
2011-11-16 11:49:49,727 ERROR org.apache.hadoop.hbase.regionserver.HRegion:
Failed delete of hdfs://serverX:9000/hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680
2011-11-16 11:49:49,733 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Onlined writetest1,19ILNKUHRKQ3BT0FLC9CMVWBP2ZPRV4W7XYA491BE6ZS2JE9132BO5GABIHNJHDU79TXBA4OOAP8OEIVTQ0PDHZB26QI5XHY17BK,1321267032810.3591e9867a4c125493dc82168854ea0c.; next sequenceid=13160672878
{code}
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
> Case Description:
> 1. The split-hlog thread creates a writer for the file region A/recovered.edits/123456 and is appending log entries.
> 2. A regionserver is opening region A at the same time, and in replayRecoveredEditsIfAny() it deletes the file region A/recovered.edits/123456.
> 3. The split-hlog thread catches the IOException and stops parsing this log file.
> If skipError = true, the file is added to the corrupt logs. However, data for other regions in this log file is lost.
> 4. If skipError = false, it checks the filesystem. The filesystem is fine, so it only prints an error log and continues assigning regions. Therefore, data in other log files is also lost!
> The case may happen as follows:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits file
> that the split-hlog thread is still appending to.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Jonathan Hsieh (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157671#comment-13157671 ]
Jonathan Hsieh commented on HBASE-4862:
---------------------------------------
@chunhui
I have a question and a few nits.
What happens if the .temp file gets left behind without being renamed?
You might want to mention that hlog files in progress (those with the .temp suffix) are excluded here.
{code}
+ // After creating writer, simulate partial region's
+ // replayRecoveredEditsIfAny() which gets SplitEditFiles of this
+ // region,and delete them.
{code}
Also, probably want to update javadoc of getSplitEditFilesSorted.
Comment should probably be "most likely" instead of "mostly"
{code}
+ try{
+ logSplitter.splitLog();
+ } catch (IOException e) {
+ LOG.info(e);
+ Assert.fail("Throws IOException when spliting "
+ + "log, it is mostly because writing file does not "
+ + "exist which is caused by concurrent replayRecoveredEditsIfAny()");
+ }
+ if (fs.exists(corruptDir)) {
+ if (fs.listStatus(corruptDir).length > 0) {
+ Assert.fail("There are some corrupt logs, "
+ + "it is mostly caused by concurrent replayRecoveredEditsIfAny()");
+ }
+ }
+ }
{code}
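The exclusion of in-progress files mentioned above can be pictured with a minimal sketch. It uses plain java.nio rather than the HBase or HDFS APIs, and the class name, the method name listCompletedEdits, and the literal ".temp" suffix are assumptions made for illustration only; the patch under review refers to the suffix through HLog.RECOVERED_LOG_TMPFILE_SUFFIX.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RecoveredEditsListing {
    // Assumed suffix for files the splitter is still writing.
    static final String TMP_SUFFIX = ".temp";

    // Returns the completed edits files in path order, skipping in-progress ones.
    static List<Path> listCompletedEdits(Path editsDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(editsDir)) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                if (!name.endsWith(TMP_SUFFIX)) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical directory contents: two finished files, one in progress.
        Path dir = Files.createTempDirectory("recovered.edits");
        Files.createFile(dir.resolve("0000000000000000002"));
        Files.createFile(dir.resolve("0000000000000000001"));
        Files.createFile(dir.resolve("0000000000000000003.temp"));
        for (Path p : listCompletedEdits(dir)) {
            System.out.println(p.getFileName());
        }
    }
}
```

In this sketch a leftover .temp file is simply never returned, so nothing replays or deletes it; whether and when such a file should be cleaned up is the open question raised above.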
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157270#comment-13157270 ]
ramkrishna.s.vasudevan commented on HBASE-4862:
-----------------------------------------------
If the scenario is valid, do we need to raise the priority of this defect? It may not be common, though.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: (was: 4862.txt)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157388#comment-13157388 ]
Ted Yu commented on HBASE-4862:
-------------------------------
{code}
+ if (fileName.endsWith(HLog.RECOVERED_LOG_TMPFILE_SUFFIX))
+ fileName = fileName.split(HLog.RECOVERED_LOG_TMPFILE_SUFFIX)[0];
{code}
Please enclose the second line above in curly braces.
W.r.t. the fs.rename() call, here is the javadoc from ClientProtocol.rename (which is called by fs.rename):
{code}
* @return true if successful, or false if the old name does not exist
* or if the new name already belongs to the namespace.
{code}
We should check the return value in addition to catching the exception.
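A minimal sketch of both points, under stated assumptions: java.io.File.renameTo stands in here for the Hadoop fs.rename() call because it also reports failure through a boolean return value instead of an exception. The class and method names are hypothetical and are not taken from the patch.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class CheckedRename {
    // Renames src to dst and fails loudly if the rename reports false.
    static void renameOrThrow(File src, File dst) throws IOException {
        if (!src.renameTo(dst)) {
            // A false return would otherwise pass silently.
            throw new IOException("Failed to rename " + src + " to " + dst);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("rename-demo").toFile();
        File tmp = new File(dir, "0000000000000000001.temp");
        File done = new File(dir, "0000000000000000001");
        if (!tmp.createNewFile()) {
            throw new IOException("Could not create " + tmp);
        }
        renameOrThrow(tmp, done);
        System.out.println("renamed: " + done.exists());
    }
}
```

Turning the false return into an IOException lets the caller handle both failure modes in one place. Note that the single-statement if bodies are enclosed in curly braces, as requested for the patch.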
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156991#comment-13156991 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
@Ted @Todd
I'm sorry my explanation was not clear.
Let me describe the detailed case first.
Throughout the following process, a client is putting data to region C.
1. Region C is successfully moved from server A to server B.
At this moment, there are log entries for region C in both server A's log file and server B's log file.
2. Kill server A and server B.
3. Restart server B.
Now the master starts the ServerShutdownHandler for server B and assigns region C to server D.
4. Before region C is opened on server D, restart server A.
Now the master starts the ServerShutdownHandler for server A and splits server A's log file.
Because there are log entries for region C in server A's log file (see step 1), the split-hlog thread creates a file F in region C's recovered.edits directory.
5. While region C is opening, it executes replayRecoveredEdits() and then deletes file F.
6. Therefore, the split in step 4 throws an IOException because file F no longer exists, and parsing of server A's current hlog file stops. The other data in that hlog file is lost.
The posted region server log is server B's log, taken while it was running replayRecoveredEditsIfAny(). Although it prints "Failed delete" for the file recovered.edits/0000000013156791680, the file had in fact been deleted, and the master throws a file-does-not-exist exception:
2011-11-16 11:50:13,037 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-1 Got while writing log entry to log org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase-common/writetest1/3591e9867a4c125493dc82168854ea0c/recovered.edits/0000000013156791680 File does not exist.
I'm not sure whether this is clear now; I'm waiting for your questions.
Thanks!
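Steps 4 to 6 above can be replayed as a single-threaded timeline to show how writing under a temporary name sidesteps the race. This is only a sketch on the local filesystem with java.nio, not HDFS and not the actual patch: the ".temp" suffix, the file name F, and the replayAndDelete helper are assumptions for illustration. On a local filesystem the original failure (a lease error when the open file is deleted) does not reproduce, so the sketch shows only the protected ordering.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SplitReplayTimeline {
    static final String TMP_SUFFIX = ".temp"; // assumed in-progress marker

    // Stand-in for the replay step: deletes every completed edits file
    // and returns how many files it removed.
    static int replayAndDelete(Path editsDir) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(editsDir)) {
            for (Path p : stream) {
                if (!p.getFileName().toString().endsWith(TMP_SUFFIX)) {
                    Files.delete(p);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path editsDir = Files.createTempDirectory("regionC-recovered.edits");
        Path inProgress = editsDir.resolve("F" + TMP_SUFFIX);
        Path finished = editsDir.resolve("F");

        // Step 4: the splitter starts writing file F under its temporary name.
        Files.write(inProgress, "edit-1\n".getBytes(StandardCharsets.UTF_8));

        // Step 5: the region opens and replays; the in-progress file is skipped.
        int deleted = replayAndDelete(editsDir);

        // Step 6: the splitter keeps appending, then publishes F by renaming it.
        Files.write(inProgress, "edit-2\n".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.APPEND);
        Files.move(inProgress, finished);

        System.out.println("deleted during replay: " + deleted);
        System.out.println("lines in F: "
                + Files.readAllLines(finished, StandardCharsets.UTF_8).size());
    }
}
```

The point of the ordering is that the replay step never sees a file the splitter still holds open; the file becomes visible under its final name only after the last append.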
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156866#comment-13156866 ]
Ted Yu commented on HBASE-4862:
-------------------------------
@Chunhui:
Can you attach master and region server log snippets that show us what happened?
Thanks
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v1 for trunk.diff
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Assigned] (HBASE-4862) Split hlog and open region
concurrently happend may cause data loss
Posted by "Ted Yu (Assigned) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu reassigned HBASE-4862:
-----------------------------
Assignee: chunhui shen
> Split hlog and open region concurrently happend may cause data loss
> -------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157289#comment-13157289 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505060/hbase-4862v1+for+trunk.diff
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestRollingRestart
org.apache.hadoop.hbase.master.TestRestartCluster
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
org.apache.hadoop.hbase.regionserver.wal.TestHLogBench
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
org.apache.hadoop.hbase.regionserver.TestAtomicOperation
org.apache.hadoop.hbase.TestInfoServers
org.apache.hadoop.hbase.regionserver.TestParallelPut
org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
org.apache.hadoop.hbase.TestRegionRebalancing
org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed
org.apache.hadoop.hbase.ipc.TestDelayedRpc
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
org.apache.hadoop.hbase.regionserver.wal.TestWALReplay
org.apache.hadoop.hbase.master.TestHMasterRPCException
org.apache.hadoop.hbase.regionserver.TestHRegion
org.apache.hadoop.hbase.client.TestMultipleTimestamps
org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
org.apache.hadoop.hbase.client.TestMetaScanner
org.apache.hadoop.hbase.master.TestMaster
org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
org.apache.hadoop.hbase.TestDrainingServer
org.apache.hadoop.hbase.regionserver.TestSplitLogWorker
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion
org.apache.hadoop.hbase.avro.TestAvroServer
org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit
org.apache.hadoop.hbase.thrift.TestThriftServer
org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics
org.apache.hadoop.hbase.master.TestMasterFailover
org.apache.hadoop.hbase.regionserver.wal.TestHLog
org.apache.hadoop.hbase.TestMultiVersions
org.apache.hadoop.hbase.master.TestMasterTransitions
org.apache.hadoop.hbase.master.TestSplitLogManager
org.apache.hadoop.hbase.master.TestOpenedRegionHandler
org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/369//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/369//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/369//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
>
>
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157918#comment-13157918 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505251/4862-v6-trunk.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/379//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/379//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/379//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.txt, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
>
>
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ramkrishna.s.vasudevan updated HBASE-4862:
------------------------------------------
Fix Version/s: 0.92.0
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and during replayRecoveredEditsIfAny() it deletes the file region A/recovered.edits/123456.
> 3. The split hlog thread catches the IO exception and stops parsing this log file.
> If skipError = true, it adds the file to the corrupt logs. However, data for other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the filesystem is OK, so it only prints an error log and continues assigning regions. Therefore, data in other log files will also be lost!
> The case may happen as follows:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits file
> that the split hlog thread is appending to
[jira] [Issue Comment Edited] (HBASE-4862) Splitting hlog and
opening region concurrently may cause data loss
Posted by "Ted Yu (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157652#comment-13157652 ]
Ted Yu edited comment on HBASE-4862 at 11/27/11 5:50 AM:
---------------------------------------------------------
@Ted
I added tests to this patch in patchV5.
OS: Red Hat Enterprise Linux Server release 5.4 (Tikanga)
The test results are as follows:
For trunk with patchV5:
_
Results :
Failed tests: testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer): ReplicationPeer ZooKeeper session
was not properly expired.
testClosing(org.apache.hadoop.hbase.client.TestHCM)
Tests run: 1174, Failures: 2, Errors: 0, Skipped: 8
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:00:49.122s
[INFO] Finished at: Sun Nov 27 02:41:40 CST 2011
[INFO] Final Memory: 35M/361M
[INFO] ------------------------------------------------------------------------
_
For 0.90 with patchV5:
_
Results :
Tests run: 702, Failures: 0, Errors: 0, Skipped: 9
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:15:37.342s
[INFO] Finished at: Sun Nov 27 11:00:07 CST 2011
[INFO] Final Memory: 26M/525M
[INFO] ------------------------------------------------------------------------
_
The two failed tests in trunk are the same as in the last run; one of them (TestReplicationPeer#testResetZooKeeperSession) passes when run separately, and the other is related to HBASE-4874.
was (Author: zjushch):
@Ted
I added tests to this patch in patchV5.
OS: Red Hat Enterprise Linux Server release 5.4 (Tikanga)
The test results are as follows:
For trunk with patchV5:
_
Results :
Failed tests: testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer): ReplicationPeer ZooKeeper session
was not properly expired.
testClosing(org.apache.hadoop.hbase.client.TestHCM)
Tests run: 1174, Failures: 2, Errors: 0, Skipped: 8
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:00:49.122s
[INFO] Finished at: Sun Nov 27 02:41:40 CST 2011
[INFO] Final Memory: 35M/361M
[INFO] ------------------------------------------------------------------------
_
For 0.90 with patchV5:
_
Results :
Tests run: 702, Failures: 0, Errors: 0, Skipped: 9
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:15:37.342s
[INFO] Finished at: Sun Nov 27 11:00:07 CST 2011
[INFO] Final Memory: 26M/525M
[INFO] ------------------------------------------------------------------------
_
The two failed tests in trunk are the same as in the last run; one of them (testResetZooKeeperSession#TestReplicationPeer) passes when run separately,
and the other is related to HBASE-4874.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Patch Available (was: Open)
TestHLogSplit passed on MacBook.
Rerun test suite on Jenkins.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158266#comment-13158266 ]
Hudson commented on HBASE-4862:
-------------------------------
Integrated in HBase-0.92-security #20 (See [https://builds.apache.org/job/HBase-0.92-security/20/])
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen) move JIRA to 0.90 section in CHANGES.txt
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen)
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
tedyu :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157678#comment-13157678 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
@Jonathan
"What happens if the .temp gets left behind without being renamed?"
If the .temp file gets left behind, it means the log split failed, and the .temp file will be deleted during the next log split.
You can see that, for the same split hlog file, a file with the same name is created in the region's recovered.edits directory.
Thanks for your suggestion.
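The .temp approach discussed in this comment can be sketched roughly as follows. This is only an illustration on the local filesystem using java.nio, not the code from the attached patches: the class name TempRenameSketch, the methods writeEdits and listReplayable, and the exact ".temp" suffix handling are assumptions made for the example. The splitter writes to a .temp file and renames it only when complete, and the region-open side ignores .temp files, so it never replays or deletes a file that is still being written.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TempRenameSketch {
    // Hypothetical suffix for in-progress files; the real patch may differ.
    static final String TEMP_SUFFIX = ".temp";

    /** Splitter side: write edits to name.temp, then rename to name once complete. */
    static Path writeEdits(Path editsDir, String name, List<String> edits) throws IOException {
        Files.createDirectories(editsDir);
        Path temp = editsDir.resolve(name + TEMP_SUFFIX);
        // A leftover .temp from an earlier failed split is discarded and rewritten.
        Files.deleteIfExists(temp);
        Files.write(temp, edits, StandardCharsets.UTF_8);
        // The file only becomes visible under its final name after all edits are written.
        return Files.move(temp, editsDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    /** Region-open side: list only completed files, skipping in-progress .temp files. */
    static List<Path> listReplayable(Path editsDir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(editsDir)) {
            for (Path p : stream) {
                if (!p.getFileName().toString().endsWith(TEMP_SUFFIX)) {
                    result.add(p);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovered.edits");
        // Simulate a split that is still in progress (or that failed midway).
        Files.write(dir.resolve("123456" + TEMP_SUFFIX),
                Arrays.asList("partial edit"), StandardCharsets.UTF_8);
        System.out.println("replayable while splitting: " + listReplayable(dir).size());
        // A later split rewrites the file and renames it on completion.
        writeEdits(dir, "123456", Arrays.asList("edit-1", "edit-2"));
        System.out.println("replayable after rename: " + listReplayable(dir).size());
    }
}
```

Because a retry of the same hlog split produces the same file name, discarding a stale .temp file before rewriting it is safe in this sketch.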
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Open (was: Patch Available)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Patch Available (was: Open)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158151#comment-13158151 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505287/4862-0.92.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/389//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158321#comment-13158321 ]
Hudson commented on HBASE-4862:
-------------------------------
Integrated in HBase-TRUNK-security #12 (See [https://builds.apache.org/job/HBase-TRUNK-security/12/])
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen) Move JIRA to 0.90 section
HBASE-4862 Splitting hlog and opening region concurrently may cause data loss
(Chunhui Shen)
tedyu :
Files :
* /hbase/trunk/CHANGES.txt
tedyu :
Files :
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Status: Open (was: Patch Available)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157919#comment-13157919 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505252/4862-v6-90.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/380//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-90.txt, 4862-v6-trunk.txt, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157341#comment-13157341 ]
Ted Yu commented on HBASE-4862:
-------------------------------
When attaching patch, please grant license to ASF.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chunhui shen updated HBASE-4862:
--------------------------------
Attachment: hbase-4862v5fortrunk.diff
hbase-4862v5for0.90.diff
Added a test case in patchV5.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Comment: was deleted
(was: -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505162/4862.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/371//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/371//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/371//console
This message is automatically generated.)
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157406#comment-13157406 ]
Ted Yu commented on HBASE-4862:
-------------------------------
Thanks for the quick turnaround.
{code}
+ throw new IOException("Failed rename " + wap.p + " to " + dst);
{code}
The above should read 'Failed renaming '.
For HLog.java:
{code}
+ if (p.getName().endsWith(RECOVERED_LOG_TMPFILE_SUFFIX))
+ result = false;
{code}
Please add curly braces for the above as well.
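For illustration only, the two snippets above, with both review comments applied, might look like the following self-contained Java sketch. It uses java.nio.file rather than the Hadoop FileSystem API that the patch itself works with. The class name, the helpers isReplayable and publish, and the ".temp" value of RECOVERED_LOG_TMPFILE_SUFFIX are assumptions: the thread shows only the constant's name and the two diff fragments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RecoveredEditsSketch {
    // Assumed value; the thread shows only the constant's name.
    static final String RECOVERED_LOG_TMPFILE_SUFFIX = ".temp";

    // Hypothetical helper: returns false for a file that still carries the
    // temp suffix, i.e. one the split thread may still be appending to.
    static boolean isReplayable(Path p) {
        boolean result = true;
        if (p.getFileName().toString().endsWith(RECOVERED_LOG_TMPFILE_SUFFIX)) {
            result = false;
        }
        return result;
    }

    // Hypothetical helper: renames a finished temp file to its final name.
    static Path publish(Path tmp) throws IOException {
        String name = tmp.getFileName().toString();
        if (!name.endsWith(RECOVERED_LOG_TMPFILE_SUFFIX)) {
            throw new IllegalArgumentException("Not a temp file: " + tmp);
        }
        Path dst = tmp.resolveSibling(
            name.substring(0, name.length() - RECOVERED_LOG_TMPFILE_SUFFIX.length()));
        try {
            Files.move(tmp, dst);
        } catch (IOException e) {
            // Message wording as requested in the review.
            throw new IOException("Failed renaming " + tmp + " to " + dst, e);
        }
        return dst;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("recovered.edits");
        Path tmp = Files.createFile(dir.resolve("123456" + RECOVERED_LOG_TMPFILE_SUFFIX));
        System.out.println(isReplayable(tmp));          // file still has the temp suffix
        System.out.println(isReplayable(publish(tmp))); // file renamed to its final name
    }
}
```

The point of the rename is that a region being opened only ever sees complete files under their final names, so it has no reason to delete a file that is still being written.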
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157327#comment-13157327 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505162/4862.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/371//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/371//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/371//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157334#comment-13157334 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505167/4862.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/373//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/373//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/373//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157638#comment-13157638 ]
chunhui shen commented on HBASE-4862:
-------------------------------------
@Jonathan
I think we could add a test to this patch by replaying the region's recovered edits (replayRecoveredEditsIfAny()) after the writer has been created during log splitting.
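A rough sketch of that test idea follows. It is not the test from the patch: it stands in plain java.nio.file calls for the split writer and for the region's replay, and it assumes the split thread writes to a temporary file that replay skips, as the RECOVERED_LOG_TMPFILE_SUFFIX check quoted in the review suggests. The class name, the replay helper, and the ".temp" suffix value are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitReplayRaceSketch {
    // Assumed suffix value for files the split thread is still writing.
    static final String TMP_SUFFIX = ".temp";

    // Stand-in for the region-open replay: reads every finished file in the
    // directory, deletes it, and returns the edits it read. Files that still
    // carry the temp suffix are left alone.
    static List<String> replay(Path editsDir) throws IOException {
        List<Path> finished = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(editsDir)) {
            for (Path p : files) {
                if (!p.getFileName().toString().endsWith(TMP_SUFFIX)) {
                    finished.add(p);
                }
            }
        }
        List<String> applied = new ArrayList<>();
        for (Path p : finished) {
            applied.addAll(Files.readAllLines(p));
            Files.delete(p);
        }
        return applied;
    }

    public static void main(String[] args) throws IOException {
        Path editsDir = Files.createTempDirectory("recovered.edits");
        Path tmp = editsDir.resolve("123456" + TMP_SUFFIX);

        // The split thread has created its writer and appended one entry.
        Files.write(tmp, Arrays.asList("edit-1"));

        // The region is opened while the split is still in progress.
        List<String> early = replay(editsDir);
        System.out.println("replayed early: " + early
            + ", temp file kept: " + Files.exists(tmp));

        // The split thread appends another entry, then publishes the file.
        Files.write(tmp, Arrays.asList("edit-2"), StandardOpenOption.APPEND);
        Files.move(tmp, editsDir.resolve("123456"));
        System.out.println("replayed after split: " + replay(editsDir));
    }
}
```

A real test would drive the actual split and region-open code paths; the sketch only shows the ordering to exercise: create the writer, replay, finish the split, replay again.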
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff
[jira] [Commented] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13157662#comment-13157662 ]
Hadoop QA commented on HBASE-4862:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12505225/hbase-4862v5fortrunk.diff
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -162 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 67 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.master.TestRollingRestart
org.apache.hadoop.hbase.util.TestRegionSplitter
org.apache.hadoop.hbase.client.TestMultiParallel
org.apache.hadoop.hbase.master.TestRestartCluster
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
org.apache.hadoop.hbase.client.TestInstantSchemaChange
org.apache.hadoop.hbase.regionserver.wal.TestHLogBench
org.apache.hadoop.hbase.rest.TestGzipFilter
org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD
org.apache.hadoop.hbase.regionserver.TestAtomicOperation
org.apache.hadoop.hbase.rest.TestScannersWithFilters
org.apache.hadoop.hbase.TestInfoServers
org.apache.hadoop.hbase.regionserver.TestParallelPut
org.apache.hadoop.hbase.coprocessor.TestClassLoading
org.apache.hadoop.hbase.client.TestAdmin
org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
org.apache.hadoop.hbase.filter.TestColumnRangeFilter
org.apache.hadoop.hbase.mapred.TestTableInputFormat
org.apache.hadoop.hbase.client.TestHCM
org.apache.hadoop.hbase.regionserver.TestStoreFileBlockCacheSummary
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
org.apache.hadoop.hbase.coprocessor.TestMasterObserver
org.apache.hadoop.hbase.rest.TestStatusResource
org.apache.hadoop.hbase.TestRegionRebalancing
org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
org.apache.hadoop.hbase.rest.TestVersionResource
org.apache.hadoop.hbase.client.TestScannerTimeout
org.apache.hadoop.hbase.client.TestFromClientSide
org.apache.hadoop.hbase.regionserver.TestFSErrorsExposed
org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol
org.apache.hadoop.hbase.rest.TestRowResource
org.apache.hadoop.hbase.rest.TestScannerResource
org.apache.hadoop.hbase.ipc.TestDelayedRpc
org.apache.hadoop.hbase.rest.client.TestRemoteAdmin
org.apache.hadoop.hbase.util.TestFSUtils
org.apache.hadoop.hbase.master.TestDistributedLogSplitting
org.apache.hadoop.hbase.rest.TestTableResource
org.apache.hadoop.hbase.regionserver.wal.TestWALReplay
org.apache.hadoop.hbase.util.TestIdLock
org.apache.hadoop.hbase.catalog.TestCatalogTrackerOnCluster
org.apache.hadoop.hbase.rest.TestTransform
org.apache.hadoop.hbase.coprocessor.TestCoprocessorEndpoint
org.apache.hadoop.hbase.client.TestInstantSchemaChangeSplit
org.apache.hadoop.hbase.regionserver.TestHRegion
org.apache.hadoop.hbase.client.TestMultipleTimestamps
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
org.apache.hadoop.hbase.client.TestMetaScanner
org.apache.hadoop.hbase.io.hfile.TestHFileBlock
org.apache.hadoop.hbase.client.TestTimestampsFilter
org.apache.hadoop.hbase.client.TestInstantSchemaChangeFailover
org.apache.hadoop.hbase.client.TestShell
org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass
org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
org.apache.hadoop.hbase.rest.TestSchemaResource
org.apache.hadoop.hbase.TestAcidGuarantees
org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
org.apache.hadoop.hbase.avro.TestAvroServer
org.apache.hadoop.hbase.rest.client.TestRemoteTable
org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol
org.apache.hadoop.hbase.util.TestHBaseFsck
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove
org.apache.hadoop.hbase.client.TestHTableUtil
org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit
org.apache.hadoop.hbase.thrift.TestThriftServer
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface
org.apache.hadoop.hbase.util.TestMergeTool
org.apache.hadoop.hbase.regionserver.TestRegionServerMetrics
org.apache.hadoop.hbase.util.TestMergeTable
org.apache.hadoop.hbase.master.TestMasterFailover
org.apache.hadoop.hbase.regionserver.wal.TestHLog
org.apache.hadoop.hbase.rest.TestMultiRowResource
org.apache.hadoop.hbase.TestMultiVersions
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.master.TestMasterTransitions
org.apache.hadoop.hbase.master.TestSplitLogManager
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithRemove
org.apache.hadoop.hbase.coprocessor.TestWALObserver
org.apache.hadoop.hbase.TestZooKeeper
org.apache.hadoop.hbase.master.TestOpenedRegionHandler
org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithAbort
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/378//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/378//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/378//console
This message is automatically generated.
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff
[jira] [Updated] (HBASE-4862) Splitting hlog and opening region
concurrently may cause data loss
Posted by "Ted Yu (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4862:
--------------------------
Attachment: 4862-v6-trunk.txt
Patch v6 with javadoc updated according to reviews
> Splitting hlog and opening region concurrently may cause data loss
> ------------------------------------------------------------------
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862-v6-trunk.txt, 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff