Posted to dev@hbase.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2010/02/18 01:23:28 UTC

[jira] Created: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Roll Hlog if any datanode in the write pipeline dies
----------------------------------------------------

                 Key: HBASE-2234
                 URL: https://issues.apache.org/jira/browse/HBASE-2234
             Project: Hadoop HBase
          Issue Type: Improvement
          Components: regionserver
            Reporter: dhruba borthakur


HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.
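A minimal sketch of the detection side, assuming HDFS-826's getNumCurrentReplicas() is the probe (the helper names and the fallback behavior here are hypothetical, not the committed patch). The method is looked up reflectively so the code still loads against HDFS jars that predate HDFS-826:

```java
import java.io.OutputStream;
import java.lang.reflect.Method;

public class PipelineProbe {
    // HDFS-826 adds getNumCurrentReplicas() to the DFS client's output stream.
    // Look it up reflectively so this compiles and loads against older HDFS jars.
    // Returns null when the running HDFS client does not expose the method.
    static Integer currentReplicas(OutputStream os) {
        try {
            Method m = os.getClass().getMethod("getNumCurrentReplicas");
            m.setAccessible(true);
            return (Integer) m.invoke(os);
        } catch (ReflectiveOperationException e) {
            return null;  // no HDFS-826; caller falls back to size-based rolling
        }
    }

    // Roll the log when the pipeline has lost a datanode, i.e. fewer live
    // replicas than the configured replication factor.
    static boolean shouldRoll(OutputStream os, int configuredReplication) {
        Integer replicas = currentReplicas(os);
        return replicas != null && replicas < configuredReplication;
    }
}
```

With this shape, the region server can call shouldRoll() after each sync and close/reopen the HLog when it returns true; streams without HDFS-826 simply never trigger a replica-based roll.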

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-2234:
---------------------------------------

    Attachment: HBASE-2234-20.4.patch

Here is a patch with an associated unit test.  I was trying to figure out whether to test at the HLog or the HRegionServer level.  I wrote the same tests at both levels, but I submitted the HRegionServer one.  Let me know if you need the other.  Note that this also includes our group-commit code and syncFs() modifications, so it won't work straight off the trunk.

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Updated: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-2234:
---------------------------------------

    Attachment: HBASE-2234-20.4-1.patch

...and then I clicked the wrong button on the Attachment License

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4-1.patch, HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Updated: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-2234:
---------------------------------------

    Attachment: HBASE-2234-20.4-1.patch

Updates to address comments from this jira & internal review.  Notable changes:
1. added checks to ensure clients with HDFS-826 or append support would not be negatively affected
2. simplified rebindWriterFunc(); it now only happens inside rollWriter()
3. asserts in TestLogRolling to detect the presence of HDFS-826 & append support.  You can easily change the asserts to LOG.info() depending on your default HDFS jar distro.

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4-1.patch, HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Commented: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839499#action_12839499 ] 

ryan rawson commented on HBASE-2234:
------------------------------------

would it be necessary or possible to move this function to a static context:
+  private void rebindWriterFunc(SequenceFile.Writer writer) throws IOException {

I guess since there is only one hlog instance per RS, maybe it's ok?

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Assigned: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-2234:
----------------------------

    Assignee: stack

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Updated: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HBASE-2234:
---------------------------------------

    Attachment:     (was: HBASE-2234-20.4-1.patch)

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4-1.patch, HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Assigned: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-2234:
----------------------------

    Assignee: Nicolas Spiegelberg  (was: stack)

Assigning Nicolas since he's doing the work (Made N a contributor).

@Nicolas, a few comments and solicitation of opinion.

+ We need to update the hadoop we bundle.  We'll want to ship with hadoop 0.20.2.  It has, at a minimum, the hdfs-127 fix.  We should probably apply hdfs-826 to the hadoop we ship too, since it's a client-side-only change.  If we included hdfs-200, that'd make it so the test you've included actually gets exercised, so we should apply it too?

+ In fact, it looks like this test fails if 826 and 200 are not in place, is that right?  You probably don't want that.  Maybe skip the test if they are not in place rather than failing, I'd say.

+ Your test is great.

+ FYI, we try not to reference log4j explicitly -- i.e. the logger implementation -- but I think in this case you have no choice, going by the commons dictum that logger configuration is outside its scope (I was reading under "Configuring the Underlying Logging System" in http://commons.apache.org/logging/apidocs/org/apache/commons/logging/package-summary.html).

+ I like the comments you've added to HLog.java.

+ The log message says hadoop-4379 if hdfs-200 is found... maybe add or change it to mention hdfs-200.

Patch looks good otherwise.
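The skip-instead-of-fail idea can be sketched as a reflective capability guard (the method names probed and the helper class are illustrative, not from the patch): detect whether the stream exposes HDFS-826's getNumCurrentReplicas() and an HDFS-200-style syncFs(), and have the test return early rather than assert when either is missing.

```java
import java.io.OutputStream;

public class AppendSupportCheck {
    // True when the object exposes a public no-arg method with this name.
    static boolean hasMethod(Object o, String name) {
        try {
            o.getClass().getMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Hypothetical guard: both HDFS-826 (replica count probe) and
    // HDFS-200-style sync support must be present for the roll test to run.
    static boolean supportsPipelineProbe(OutputStream os) {
        return hasMethod(os, "getNumCurrentReplicas") && hasMethod(os, "syncFs");
    }
}
```

In the test body this becomes: if !supportsPipelineProbe(out) { LOG.info("skipping: no append support"); return; } instead of a hard assert, so the suite still passes on a stock 0.20 HDFS.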

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4-1.patch, HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Commented: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839241#action_12839241 ] 

stack commented on HBASE-2234:
------------------------------

@ Nicolas -- Sweet.  Patch looks good.

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Resolved: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-2234.
--------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.21.0)
     Hadoop Flags: [Reviewed]

Resolving.  Thanks for the patch Nicolas.

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: Nicolas Spiegelberg
>            Priority: Blocker
>             Fix For: 0.20.4
>
>         Attachments: HBASE-2234-20.4-1.patch, HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Updated: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2234:
-------------------------

    Priority: Blocker  (was: Major)

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: Nicolas Spiegelberg
>            Priority: Blocker
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4-1.patch, HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Commented: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "Nicolas Spiegelberg (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839522#action_12839522 ] 

Nicolas Spiegelberg commented on HBASE-2234:
--------------------------------------------

@ryan I could go a couple of ways with rebindWriterFunc().  I either need the "this" pointer to set "this.hdfs_out", in which case I don't need to explicitly pass in any params since it's always "this.writer".  Or I could return an OutputStream and make the function static.  I'm new to Java, coming from an embedded C++ background, but to me it looks like six of one, half a dozen of the other.  I have a couple of internal comments about this diff I may need to apply as well, so I'll roll that change into my next diff.
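The two shapes being weighed can be made concrete with a toy stand-in (FakeWriter and both method bodies are hypothetical illustrations, not HBase code): the instance style mutates a field via "this", while Ryan's static alternative is a pure function whose result the caller assigns.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

public class RebindStyles {
    // Stand-in for SequenceFile.Writer and the HDFS stream it wraps.
    static class FakeWriter {
        final OutputStream out = new ByteArrayOutputStream();
    }

    FakeWriter writer = new FakeWriter();
    OutputStream hdfs_out;

    // Instance style: no params needed, reads this.writer, sets this.hdfs_out.
    void rebindWriterFunc() {
        this.hdfs_out = this.writer.out;
    }

    // Static style: pure function of the writer; caller does the assignment,
    // e.g. this.hdfs_out = extractOutputStream(this.writer) inside rollWriter().
    static OutputStream extractOutputStream(FakeWriter w) {
        return w.out;
    }
}
```

Functionally equivalent with a single HLog per region server; the static form is a touch easier to unit-test in isolation since it has no hidden state.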

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2234-20.4.patch
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.



[jira] Updated: (HBASE-2234) Roll Hlog if any datanode in the write pipeline dies

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-2234:
-------------------------

    Fix Version/s: 0.21.0
                   0.20.4

Adding into 0.20.4 and 0.21.  Also assigned it to myself.  Take it back if you want to do it, Dhruba, but I figure you have enough going on on the hdfs side.

> Roll Hlog if any datanode in the write pipeline dies
> ----------------------------------------------------
>
>                 Key: HBASE-2234
>                 URL: https://issues.apache.org/jira/browse/HBASE-2234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: dhruba borthakur
>            Assignee: stack
>             Fix For: 0.20.4, 0.21.0
>
>
> HDFS does not replicate the last block of a file that is being written to. This means that if datanodes in the write pipeline die, the data blocks in the transaction log experience reduced redundancy. It would be good if the region server could detect datanode death in the write pipeline while writing to the transaction log and, if this happens, close the current log and open a new one. This depends on HDFS-826.
