Posted to common-issues@hadoop.apache.org by "Chris Douglas (JIRA)" <ji...@apache.org> on 2010/05/22 03:58:17 UTC

[jira] Updated: (HADOOP-6450) Enhance FSDataOutputStream to allow retrieving the current number of replicas of current block

     [ https://issues.apache.org/jira/browse/HADOOP-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas updated HADOOP-6450:
----------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Won't Fix

Marking as wontfix. Please reopen if required.

> Enhance FSDataOutputStream to allow retrieving the current number of replicas of current block
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6450
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6450
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: Replicable.txt, Replicable.txt
>
>
> The current HDFS implementation has a limitation: it does not replicate the last partial block of a file that is being written until the file is closed. Some long-running applications (e.g. HBase) write transaction logs into HDFS. If datanodes in the write pipeline die, the application has no knowledge of it until all of them have failed and the application gets an IO error.
> These applications would benefit a lot if they could determine the number of live replicas of the current block to which they are writing data. For example, an application can decide that when one of the datanodes in the write pipeline fails, it will close the file and start writing to a new file.
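
The roll-on-failure policy described in the issue could look roughly like the sketch below. It is not the attached patch: the interface ReplicaAwareStream, its method getNumCurrentReplicas(), the in-memory FakeStream, and RollingLogWriter are all hypothetical names made up for illustration, and no real Hadoop class is used. It only assumes that the output stream can somehow report how many replicas of the current block are still live.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical capability: a stream that reports live replicas of its current block. */
interface ReplicaAwareStream {
    int getNumCurrentReplicas();
    void write(String record);
    void close();
}

/** In-memory stand-in for an HDFS output stream (illustration only, not a Hadoop class). */
class FakeStream implements ReplicaAwareStream {
    private int liveReplicas;
    private boolean closed = false;
    final List<String> records = new ArrayList<>();

    FakeStream(int liveReplicas) { this.liveReplicas = liveReplicas; }

    /** Simulates the loss of one datanode in the write pipeline. */
    void failOneDatanode() { if (liveReplicas > 0) liveReplicas--; }

    public int getNumCurrentReplicas() { return liveReplicas; }
    public void write(String record) { records.add(record); }
    public void close() { closed = true; }
    boolean isClosed() { return closed; }
}

/** Closes the current log and opens a new one when the pipeline has lost a replica. */
class RollingLogWriter {
    private final int expectedReplicas;
    private final Supplier<ReplicaAwareStream> factory;
    private ReplicaAwareStream current;
    private int rolls = 0;

    RollingLogWriter(int expectedReplicas, Supplier<ReplicaAwareStream> factory) {
        this.expectedReplicas = expectedReplicas;
        this.factory = factory;
        this.current = factory.get();
    }

    void append(String record) {
        // Check before every write: fewer live replicas than expected means
        // at least one datanode in the pipeline has failed.
        if (current.getNumCurrentReplicas() < expectedReplicas) {
            current.close();          // closing lets HDFS re-replicate the finished file
            current = factory.get();  // continue in a new file with a fresh pipeline
            rolls++;
        }
        current.write(record);
    }

    int getRolls() { return rolls; }
}
```

With a factory that hands out streams of three replicas each, a writer that sees one datanode fail between two appends would close the first file and place the second record in a new one. Checking before each write is one possible choice; an application might instead check on a timer or after each sync.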

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.