Posted to issues@hbase.apache.org by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org> on 2012/05/18 17:43:06 UTC

[jira] [Created] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

ramkrishna.s.vasudevan created HBASE-6050:
---------------------------------------------

             Summary: HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
                 Key: HBASE-6050
                 URL: https://issues.apache.org/jira/browse/HBASE-6050
             Project: HBase
          Issue Type: Bug
            Reporter: ramkrishna.s.vasudevan


The scenario is as follows:
-> A region is being split.
-> The master has not yet processed the split.
-> The region server goes down.
-> The split log manager starts splitting the logs and creates the recovered.edits in the splitlog path.
-> The CJ (CatalogJanitor) starts, deletes the entry from META, and has just finished deleting the region dir.
-> In HLogSplitter, as the final step, we rename the recovered.edits so that it comes under the regiondir.
If the regiondir does not exist at that point, we create it and then add the recovered.edits.

Because of this, HBCK treats it as an orphan region: the regiondir exists but has no regioninfo.
The cluster is actually fine, but the HBCK report is misleading.
{code}
        } else {
          Path dstdir = dst.getParent();
          if (!fs.exists(dstdir)) {
            if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
          }
        }
        fs.rename(src, dst);
        LOG.debug(" moved " + src + " => " + dst);
      } else {
        LOG.debug("Could not move recovered edits from " + src +
            " as it doesn't exist");
      }
    }
    archiveLogs(null, corruptedLogs, processedLogs,
        oldLogDir, fs, conf);
{code}
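
One possible way to avoid recreating the region dir is to check for it before the move and skip the rename when it is gone. The following is only a minimal sketch of that idea, not the contents of HBASE-6050.patch: it uses java.nio.file in place of Hadoop's FileSystem so that it is self-contained, and the class name RecoveredEditsMover and the method moveIfRegionDirExists are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Sketch only: java.nio.file stands in for Hadoop's FileSystem, and the
 * class and method names are hypothetical (not taken from the patch).
 */
final class RecoveredEditsMover {

  /**
   * Moves src to dst only if the region directory still exists.
   * Returns true if the file was moved, false if the move was skipped.
   */
  static boolean moveIfRegionDirExists(Path src, Path dst, Path regionDir)
      throws IOException {
    if (!Files.exists(src)) {
      // Nothing to move: the recovered edits file was never written.
      return false;
    }
    if (!Files.isDirectory(regionDir)) {
      // The region dir was already removed (for example by the CJ).
      // Recreating it here would leave a dir with no regioninfo behind.
      return false;
    }
    // The region dir exists; only the recovered.edits subdirectory
    // underneath it may still need to be created.
    Files.createDirectories(dst.getParent());
    Files.move(src, dst);
    return true;
  }
}
```

Note that this sketch still has a window between the existence check and the move, so it narrows the race rather than removing it.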

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281686#comment-13281686 ] 

Zhihong Yu commented on HBASE-6050:
-----------------------------------

Patch looks good.
Minor:
Please insert spaces around regionDir:
{code}
+            " to destination " +regionDir+ " as it doesn't exist.");
{code}
                

[jira] [Closed] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl closed HBASE-6050.
--------------------------------

    
> HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.2, 0.94.1, 0.96.0
>
>         Attachments: HBASE-6050.patch

[jira] [Updated] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6050:
------------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
    

[jira] [Updated] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6050:
------------------------------------------

    Status: Patch Available  (was: Open)
    

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284091#comment-13284091 ] 

Hudson commented on HBASE-6050:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #18 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/18/])
    HBASE-6050 HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. (Ram) (Revision 1342937)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

                

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279590#comment-13279590 ] 

Hadoop QA commented on HBASE-6050:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12528231/HBASE-6050.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 hadoop23.  The patch compiles against the hadoop 0.23.x profile.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 33 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.replication.TestReplication
                       org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
                       org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1942//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1942//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1942//console

This message is automatically generated.
                

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284044#comment-13284044 ] 

Hudson commented on HBASE-6050:
-------------------------------

Integrated in HBase-0.94 #221 (See [https://builds.apache.org/job/HBase-0.94/221/])
    HBASE-6050 HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. (Ram) (Revision 1342934)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

                

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287181#comment-13287181 ] 

Hudson commented on HBASE-6050:
-------------------------------

Integrated in HBase-0.94-security #33 (See [https://builds.apache.org/job/HBase-0.94-security/33/])
    HBASE-6050 HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. (Ram) (Revision 1342934)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

                

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284056#comment-13284056 ] 

Hudson commented on HBASE-6050:
-------------------------------

Integrated in HBase-0.92 #424 (See [https://builds.apache.org/job/HBase-0.92/424/])
    HBASE-6050 HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. and a small addendum for HBASE-6002 (Ram) (Revision 1342935)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

                

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Jonathan Hsieh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283412#comment-13283412 ] 

Jonathan Hsieh commented on HBASE-6050:
---------------------------------------

Just for clarification - these edits are actually replayed to the daughter regions and these recovered.edits files are kept around for something (the CJ?) to eventually clean up?
                

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284027#comment-13284027 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

Committed to trunk, 0.94 and 0.92.
Thanks for the review, Ted and Jon.
Thanks, Stack, for your idea.
P.S. committed a small addendum for HBASE-6002 for 0.92 only as both were part of HLogSplitter.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>
> The scenario is like this:
> -> A region is being split.
> -> The master has not yet processed the split.
> -> The region server goes down.
> -> The split log manager starts splitting the logs and creates the recovered.edits in the splitlog path.
> -> CJ starts, deletes the entry from META, and has just completed the deletion of the region dir.
> -> In HLogSplitter, as the final step, we rename the recovered.edits to place it under the regiondir.
> If the regiondir does not exist there, we create it and then add the recovered.edits.
> Because of this, HBCK treats it as an orphan region, since the regiondir exists but has no regioninfo.
> The cluster is actually fine, but the report is misleading.
> {code}
>         } else {
>           Path dstdir = dst.getParent();
>           if (!fs.exists(dstdir)) {
>             if (!fs.mkdirs(dstdir)) LOG.warn("mkdir failed on " + dstdir);
>           }
>         }
>         fs.rename(src, dst);
>         LOG.debug(" moved " + src + " => " + dst);
>       } else {
>         LOG.debug("Could not move recovered edits from " + src +
>             " as it doesn't exist");
>       }
>     }
>     archiveLogs(null, corruptedLogs, processedLogs,
>         oldLogDir, fs, conf);
> {code}
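
The sequence above can be replayed on a local filesystem. The following is only a minimal sketch: java.nio.file stands in for Hadoop's FileSystem, and the directory layout, file names and the looksOrphaned check are assumptions made for illustration, not HBase's or HBCK's actual code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: java.nio.file stands in for Hadoop's FileSystem, and the
// directory layout and names below are assumptions, not HBase's actual code.
class OrphanRegionDirSketch {

  /** A region dir without .regioninfo is what an HBCK-style check would flag. */
  static boolean looksOrphaned(Path regionDir) {
    return Files.isDirectory(regionDir)
        && !Files.exists(regionDir.resolve(".regioninfo"));
  }

  /** Replays the reported sequence; returns true if an orphan dir is left behind. */
  static boolean reproduce() {
    try {
      Path root = Files.createTempDirectory("hbase-6050");
      Path regionDir = root.resolve("table").resolve("parentRegion");
      Files.createDirectories(regionDir);
      Files.createFile(regionDir.resolve(".regioninfo"));

      // Log splitting writes the edits under a temporary split-log location.
      Path src = Files.createDirectories(root.resolve("splitlog")).resolve("0000001");
      Files.write(src, new byte[] {1});

      // CJ removes the parent region dir once the split is done.
      Files.delete(regionDir.resolve(".regioninfo"));
      Files.delete(regionDir);

      // The splitter's final step: recreate missing parents, then rename.
      Path dst = regionDir.resolve("recovered.edits").resolve("0000001");
      if (!Files.exists(dst.getParent())) {
        Files.createDirectories(dst.getParent()); // also recreates regionDir
      }
      Files.move(src, dst);

      return looksOrphaned(regionDir);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("orphan region dir left behind: " + reproduce());
  }
}
```

The recursive directory creation is the step that brings the deleted region dir back, now holding recovered.edits but no .regioninfo.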


        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283452#comment-13283452 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

@Jon
In our case the split completed and then the RS went down due to a ZK issue, which is why the master was not able to respond to the split completion.  Because the RS went down, the recovered.edits creation came into play.
CJ cleans up the entire region directory because the parent is in the split state and offline.  Also, since the split has completed in this case, we are sure that the data has been flushed to store files. Each daughter region will have its own region directory.
Did I answer your question? ;)
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13281680#comment-13281680 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

Please share your comments on this patch. If it is OK, I can prepare patches for the other versions as well.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Updated] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6050:
------------------------------------------

    Affects Version/s: 0.92.1
                       0.94.0
        Fix Version/s: 0.94.1
                       0.96.0
                       0.92.2
    
> HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279029#comment-13279029 ] 

stack commented on HBASE-6050:
------------------------------

Good one Ram.

So, we are talking about the parent region?

It does seem wrong that we would recreate a parent region dir in the distributed log splitter.

How about we remove that dir creation code?  I can see our making the recovered.edits dir because it won't always be there but creating all of its parent dirs is not right.  My guess is that the mkdirs was done because it was just easier than verifying parent dir present.

If parent dir not present, log the fact that there is no target region into which to put the edits and move on I'd say.
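
A rough sketch of that suggestion, for discussion only. It uses java.nio.file in place of Hadoop's FileSystem, and moveRecoveredEdits is a hypothetical helper name, so this is not the committed patch: the recovered.edits dir is created when missing, but a missing region dir is only logged and skipped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: java.nio.file stands in for Hadoop's FileSystem, and
// moveRecoveredEdits is a hypothetical helper, not the committed patch.
class SkipMissingRegionDirSketch {

  /**
   * Moves src to dst, where dst is <regionDir>/recovered.edits/<file>.
   * Creates recovered.edits if needed, but never the region dir itself.
   * Returns false (after logging) when the region dir is gone.
   */
  static boolean moveRecoveredEdits(Path src, Path dst) {
    Path editsDir = dst.getParent();
    Path regionDir = editsDir.getParent();
    try {
      if (!Files.isDirectory(regionDir)) {
        System.err.println("No target region dir " + regionDir + "; skipping " + src);
        return false;
      }
      if (!Files.exists(editsDir)) {
        Files.createDirectory(editsDir); // single level only, no parent creation
      }
      Files.move(src, dst);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Region dir already deleted: the edits are skipped and the dir is not recreated. */
  static boolean deletedRegionStaysDeleted() {
    try {
      Path root = Files.createTempDirectory("hbase-6050-fix");
      Path src = Files.write(root.resolve("edits-tmp"), new byte[] {1});
      Path regionDir = root.resolve("parentRegion"); // never created: simulates CJ's delete
      Path dst = regionDir.resolve("recovered.edits").resolve("0000001");
      boolean moved = moveRecoveredEdits(src, dst);
      return !moved && !Files.exists(regionDir) && Files.exists(src);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Region dir present: recovered.edits is created and the file is moved. */
  static boolean liveRegionReceivesEdits() {
    try {
      Path root = Files.createTempDirectory("hbase-6050-live");
      Path src = Files.write(root.resolve("edits-tmp"), new byte[] {1});
      Path regionDir = Files.createDirectory(root.resolve("liveRegion"));
      Path dst = regionDir.resolve("recovered.edits").resolve("0000001");
      return moveRecoveredEdits(src, dst) && Files.exists(dst);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("deleted region stays deleted: " + deletedRegionStaysDeleted());
    System.out.println("live region receives edits: " + liveRegionReceivesEdits());
  }
}
```

In this sketch the existence check and the directory creation are still two separate steps, so a delete can land between them. Because only a single directory level is created, that case fails with an exception instead of silently recreating the region dir.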
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279592#comment-13279592 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

Replication-related test cases have been failing in the previous few QA builds,
so this patch did not introduce those failures.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Updated] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-6050:
------------------------------------------

    Attachment: HBASE-6050.patch

Trunk patch.  Please provide your comments.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287240#comment-13287240 ] 

Hudson commented on HBASE-6050:
-------------------------------

Integrated in HBase-0.92-security #109 (See [https://builds.apache.org/job/HBase-0.92-security/109/])
    HBASE-6050 HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. and a small addendum for HBASE-6002 (Ram) (Revision 1342935)

     Result = SUCCESS
ramkrishna : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.2, 0.96.0, 0.94.1
>
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13282174#comment-13282174 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

Thanks Ted.  Will prepare patches for 0.92 and 0.94 and commit them later today in the evening if there is no objection.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278881#comment-13278881 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

Why are we trying to create the dstdir? What is the reason for it?
Should the fix be applied here, or on the HBCK side so that it does not report an inconsistency?
But if we make this change in HBCK, we are not sure how to delete the recovered.edits file that was created, because the master will never try to open this region.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283663#comment-13283663 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

I will commit this tomorrow morning.
                
> HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Assigned] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan reassigned HBASE-6050:
---------------------------------------------

    Assignee: ramkrishna.s.vasudevan
    
> HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6050
>                 URL: https://issues.apache.org/jira/browse/HBASE-6050
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-6050.patch
>
>

        

[jira] [Updated] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.

Posted by "Zhihong Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-6050:
------------------------------

    Hadoop Flags: Reviewed
         Summary: HLogSplitter renaming recovered.edits and CJ removing the parent directory race, making the HBCK think cluster is inconsistent.  (was: HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.)
    

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284035#comment-13284035 ] 

Hudson commented on HBASE-6050:
-------------------------------

Integrated in HBase-TRUNK #2925 (See [https://builds.apache.org/job/HBase-TRUNK/2925/])
    HBASE-6050 HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent. (Ram) (Revision 1342937)

     Result = FAILURE
ramkrishna : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java

                

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280192#comment-13280192 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

Please share your comments on this patch.
                

        

[jira] [Commented] (HBASE-6050) HLogSplitter renaming recovered.edits and CJ removing the parent directory races, making the HBCK to think cluster is inconsistent.

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279033#comment-13279033 ] 

ramkrishna.s.vasudevan commented on HBASE-6050:
-----------------------------------------------

bq. So, we are talking about the parent region?
Yes, it is the parent region.
bq. If parent dir not present, log the fact that there is no target region into which to put the edits and move on I'd say
Yes, if the destination does not exist we can move on, and so we will consider the log splitting process successful.
But I think the file created in the splitlog folder by distributed log splitting will then never be cleared. I need to check the code on that. I will come up with a patch for this tomorrow.
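
The "log and move on" idea discussed above could look roughly like the sketch below. It is not the attached HBASE-6050.patch: it uses java.nio.file as a stand-in for the Hadoop FileSystem calls so that it runs on its own, and the class name, method name, and the explicit regionDir parameter are hypothetical. Deleting the leftover source file when the region dir is gone is an assumption made here to address the cleanup concern, not something confirmed in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only; not HLogSplitter code.
final class RecoveredEditsMove {

    /**
     * Moves a recovered-edits file from the split location (src) to its final
     * place (dst) under regionDir, but only if regionDir still exists.
     * Returns true if the file was moved, false if it was skipped.
     */
    static boolean moveIfRegionExists(Path src, Path regionDir, Path dst)
            throws IOException {
        if (!Files.exists(src)) {
            // Same case as "Could not move recovered edits ... as it doesn't exist".
            return false;
        }
        if (!Files.isDirectory(regionDir)) {
            // The region dir was removed concurrently (for example by CJ).
            // Do not recreate it: that is what leaves a regiondir without
            // regioninfo. Assumption: discard the temp file so it does not
            // linger in the split location.
            Files.delete(src);
            return false;
        }
        // The region dir exists, so creating the recovered.edits subdirectory
        // under it is safe.
        Files.createDirectories(dst.getParent());
        Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }
}
```

A check followed by a move is still not atomic, so this only narrows the window; it does not by itself rule out the region dir disappearing between the two calls.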
                