Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2010/06/15 02:38:14 UTC

[jira] Created: (HBASE-2728) Support for HADOOP-4829

Support for HADOOP-4829
-----------------------

                 Key: HBASE-2728
                 URL: https://issues.apache.org/jira/browse/HBASE-2728
             Project: HBase
          Issue Type: Bug
            Reporter: Jean-Daniel Cryans
            Assignee: Jean-Daniel Cryans
             Fix For: 0.20.6


Users running a Hadoop patched with HADOOP-4829 will find that closing a RS cleanly results in data loss, because the FileSystem is closed before the regions are. Cloudera's Hadoop is one example. We need to support those users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879165#action_12879165 ] 

stack commented on HBASE-2728:
------------------------------

One issue I have: the RS needs to let tests override the shutdown of the FS. Otherwise, when we kill or stop RSs during tests, they'll pull the FS out from under everything else that is currently running in a minihbasecluster.


[jira] Resolved: (HBASE-2728) Support for HADOOP-4829

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans resolved HBASE-2728.
---------------------------------------

     Hadoop Flags: [Reviewed]
    Fix Version/s: 0.20.5
                       (was: 0.20.6)
       Resolution: Fixed

Committed to branch, meaning it's in 0.20.5 since we're doing a RC4.


[jira] Updated: (HBASE-2728) Support for HADOOP-4829

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2728:
--------------------------------------

    Attachment: HBASE-2728.patch

Patch that supports both cases.


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878819#action_12878819 ] 

Todd Lipcon commented on HBASE-2728:
------------------------------------

Here's a thought. What if we did something like:
FileSystem fs = FileSystem.get(conf);
fs.close();
fs = FileSystem.get(conf); // now we know we have a fresh instance?

Or, if we're certain we always get the filesystem instance from this var, we could use FileSystem.newInstance?

My only nervousness is that this might break again and we wouldn't notice (since we don't seem to have any good unit tests for it)


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879540#action_12879540 ] 

stack commented on HBASE-2728:
------------------------------

OK.  Sounds good.  I'm +1 on applying this to the branch (after making above changes).


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878813#action_12878813 ] 

Todd Lipcon commented on HBASE-2728:
------------------------------------

Only question: are we sure that this FileSystem.get is the first instantiation of the filesystem? If the filesystem is already in the Cache, then setting the conf var at this point won't do anything.
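
To illustrate the concern, here is a minimal sketch of a cached lookup. It is a toy model and not Hadoop's FileSystem code: FsCacheSketch, FakeFs and the "hdfs://example" URI are hypothetical names made up for the example, and the only detail taken from this thread is the fs.automatic.close setting. It assumes the client reads its settings once, when it is constructed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Hadoop's FileSystem. All class names are hypothetical.
final class FsCacheSketch {
    /** Stand-in for a filesystem client that reads its settings once, at construction. */
    static final class FakeFs {
        final boolean automaticClose;

        FakeFs(Map<String, String> conf) {
            this.automaticClose =
                Boolean.parseBoolean(conf.getOrDefault("fs.automatic.close", "true"));
        }
    }

    private static final Map<String, FakeFs> CACHE = new HashMap<>();

    /** Cached lookup: conf is only consulted when no instance exists for the URI yet. */
    static synchronized FakeFs get(String uri, Map<String, String> conf) {
        return CACHE.computeIfAbsent(uri, k -> new FakeFs(conf));
    }

    /** Bypasses the cache, so the conf passed in is always read. */
    static FakeFs newInstance(String uri, Map<String, String> conf) {
        return new FakeFs(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        FakeFs first = get("hdfs://example", conf);   // cached with the default setting
        conf.put("fs.automatic.close", "false");      // set after the first instantiation
        FakeFs second = get("hdfs://example", conf);  // returns the cached instance

        System.out.println(first == second);          // true: same object
        System.out.println(second.automaticClose);    // true: the late setting was ignored
        System.out.println(newInstance("hdfs://example", conf).automaticClose); // false
    }
}
```

In this model the setting only takes effect if it is in the conf before the first get for that URI, or if the caller asks for a fresh instance.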


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879558#action_12879558 ] 

stack commented on HBASE-2728:
------------------------------

+1


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878815#action_12878815 ] 

Jean-Daniel Cryans commented on HBASE-2728:
-------------------------------------------

I did look around and this is currently the case. This is in the very first steps of RS instantiation.


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879187#action_12879187 ] 

Jean-Daniel Cryans commented on HBASE-2728:
-------------------------------------------

I tested my patch against both (apache and cloudera hadoop) on a fully distributed 1-node setup, not with unit tests.


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879185#action_12879185 ] 

stack commented on HBASE-2728:
------------------------------

We could but it'd be a lot of work to do on branch.  In trunk we do this for the newer tests.

In fact my comment is a little confused on revisit.  This is for branch, not for trunk, so reviewing j-d's patch in that light, I don't think it's so bad.  Todd's suggestion of closing the fs then reopening it to be sure we have a good FS instance will for sure mangle tests.  Did you test it against apache and cloudera hadoop, j-d?




[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879252#action_12879252 ] 

stack commented on HBASE-2728:
------------------------------

I looked at this patch again:

{code}
-      Runtime.getRuntime().removeShutdownHook(hdfsClientFinalizer);
+      boolean registered =
+          Runtime.getRuntime().removeShutdownHook(hdfsClientFinalizer);
+      if (!registered) {
+        LOG.info("The HDFS shutdown hook isn't where we expect it, " +
+            "will call close during shutdown");
+        hdfsSupportsAutoCloseDisabling = true;
+      }
{code}

In the above, I'd say you should give a better explanation in the log message: mention that you are going to presume fs.automatic.close is in place.

Do you think it would pay to do better introspection earlier in this method, looking for hdfsClientFinalizer explicitly in Cache? Then you'd know you have an hdfs w/ hadoop-4829 in place.  Maybe not.  Maybe that's what I should do on trunk and this is good enough for branch, presuming all tests pass on both apache and cloudera?
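
For reference, the check in the diff relies on the boolean returned by Runtime.removeShutdownHook: true when the given thread was a registered hook and has now been removed, false when it was not registered. A minimal standalone sketch of that behaviour follows. It is not part of the patch, and ShutdownHookProbe and tryRemove are hypothetical names used only here.

```java
// Standalone demo of the JDK call used in the diff; not HBase code.
final class ShutdownHookProbe {
    /** Returns true if the hook was registered and has now been removed. */
    static boolean tryRemove(Thread hook) {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        // Stand-in for a client finalizer thread registered as a shutdown hook.
        Thread hook = new Thread(() -> System.out.println("closing filesystem"));
        Runtime.getRuntime().addShutdownHook(hook);

        System.out.println(tryRemove(hook));                 // true: it was registered
        System.out.println(tryRemove(hook));                 // false: already removed
        System.out.println(tryRemove(new Thread(() -> {}))); // false: never registered
    }
}
```

A false result only says that this particular thread object was not a registered hook. It says nothing about why, which is the reason the log message should spell out the assumption being made.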


[jira] Updated: (HBASE-2728) Support for HADOOP-4829

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2728:
--------------------------------------

    Attachment: HBASE-2728-v2.patch

Patch that works on both releases; I will commit it once 0.20.5 gets released (soon I hope).


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879168#action_12879168 ] 

Todd Lipcon commented on HBASE-2728:
------------------------------------

For tests, can't we use FileSystem.newInstance so each one gets its own?


[jira] Commented: (HBASE-2728) Support for HADOOP-4829

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879491#action_12879491 ] 

Jean-Daniel Cryans commented on HBASE-2728:
-------------------------------------------

Running against cloudera's release (the same would happen with any patched hdfs), TestRegionRebalancing fails. The issue is that we do something very dirty in HRS for tests:

{code} 
  /** 
   * Set the hdfs shutdown thread to run on exit. Pass null to disable 
   * running of the shutdown test. Needed by tests. 
   * @param t Thread to run. Pass null to disable tests. 
   * @return Previous occupant of the shutdown thread position. 
   */ 
  public Thread setHDFSShutdownThreadOnExit(final Thread t) { 
    Thread old = this.hdfsShutdownThread; 
    this.hdfsShutdownThread = t; 
    return old; 
  } 
{code} 

So the tests pass t=null, but if we don't use the thread then we still shut down HDFS. The clean solution is to call shutdownHDFS.set(false) in that method; the check already in place will do the work.
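
A rough sketch of what that could look like, reduced to a toy class so it runs on its own. This is not the committed patch: ShutdownSketch, shutdown() and the fsClosed field are hypothetical stand-ins, and clearing the flag only when t is null is an assumption on my part, since the comment above does not say whether the call is conditional.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy stand-in for the region server; only the flag handling is of interest.
final class ShutdownSketch {
    private final AtomicBoolean shutdownHDFS = new AtomicBoolean(true);
    private Thread hdfsShutdownThread;
    boolean fsClosed = false; // hypothetical: records whether the FS would be closed

    /** Same shape as the quoted method, plus the proposed flag update. */
    Thread setHDFSShutdownThreadOnExit(final Thread t) {
        Thread old = this.hdfsShutdownThread;
        this.hdfsShutdownThread = t;
        if (t == null) {
            // Assumption: tests pass null to mean "leave HDFS alone on exit".
            shutdownHDFS.set(false);
        }
        return old;
    }

    /** Hypothetical shutdown path: the existing check decides whether to close the FS. */
    void shutdown() {
        if (shutdownHDFS.get()) {
            fsClosed = true; // stands in for closing the filesystem
        }
    }
}
```

With this shape a test that calls setHDFSShutdownThreadOnExit(null) keeps the shared FS open when one RS stops, while the default path still closes it.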
