Posted to common-dev@hadoop.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2007/10/11 01:22:50 UTC

[jira] Issue Comment Edited: (HADOOP-1707) DFS client can allow user to write data to the next block while uploading previous block to HDFS

    [ https://issues.apache.org/jira/browse/HADOOP-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533901 ] 

rangadi edited comment on HADOOP-1707 at 10/10/07 4:21 PM:
----------------------------------------------------------------

The jira description only talks about parallel writes to the datanodes. It does not require removing the temp file on the client.

How about just storing the block at the client, as we do now, and replaying the data if there is an error? This still lets the client write in parallel, and it does not need any changes or improvements to the datanode protocol. Yes, removing the temp file would be better, but this is no worse than the current implementation.
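A minimal sketch of that idea, with purely illustrative names (this is not the actual DFSClient code): the client keeps a local copy of the block while streaming it out, and if the transfer fails it simply replays the whole block from that copy.

```java
import java.util.Arrays;

// Hypothetical illustration of the comment's approach: keep the block
// in a local staging copy while streaming it, and replay from that
// copy if the transfer fails partway through.
public class ReplayableBlockSender {

    // Sends one block; the StringBuilder stands in for the receiving datanode.
    public static String send(byte[] block, boolean failFirstAttempt) {
        byte[] staged = Arrays.copyOf(block, block.length); // local temp copy
        StringBuilder received = new StringBuilder();
        try {
            stream(staged, received, failFirstAttempt);
        } catch (RuntimeException e) {
            received.setLength(0);           // receiver discards the partial block
            stream(staged, received, false); // replay the whole block from the copy
        }
        return received.toString();
    }

    // Simulated transfer that can drop the link mid-block.
    private static void stream(byte[] data, StringBuilder sink, boolean fail) {
        for (int i = 0; i < data.length; i++) {
            if (fail && i == data.length / 2) {
                throw new RuntimeException("link dropped");
            }
            sink.append((char) data[i]);
        }
    }
}
```

The point is that recovery needs no extra datanode protocol support: the failed transfer is retried in full from the client's copy.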

> DFS client can allow user to write data to the next block while uploading previous block to HDFS
> ------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1707
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1707
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> The DFS client currently uses a staging file on local disk to cache all user writes to a file. When the staging file accumulates one block worth of data, its contents are flushed to an HDFS datanode. These operations occur sequentially.
> A simple optimization of allowing the user to write to another staging file while simultaneously uploading the contents of the first staging file to HDFS will improve file-upload performance.
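The proposed optimization is essentially double buffering. A hedged sketch, with hypothetical names rather than the real DFSClient internals: when one staging buffer fills, its upload is handed to a background thread while the user immediately continues writing into a fresh buffer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of double-buffered staging: the user fills one
// buffer while the previous full block uploads in the background.
// The StringBuilder stands in for HDFS; names are illustrative only.
public class DoubleBufferedStager {
    private static final int BLOCK_SIZE = 4; // tiny block size for the demo

    private final ExecutorService uploader = Executors.newSingleThreadExecutor();
    private final StringBuilder uploaded = new StringBuilder();
    private byte[] current = new byte[BLOCK_SIZE];
    private int pos = 0;
    private Future<?> pendingUpload = null;

    public synchronized void write(byte b) {
        current[pos++] = b;
        if (pos == BLOCK_SIZE) {
            waitForPending(); // at most one block is in flight at a time
            final byte[] full = current;
            pendingUpload = uploader.submit(() -> uploaded.append(new String(full)));
            current = new byte[BLOCK_SIZE]; // user keeps writing here immediately
            pos = 0;
        }
    }

    public synchronized String close() {
        waitForPending();
        if (pos > 0) {
            uploaded.append(new String(current, 0, pos)); // flush partial last block
        }
        uploader.shutdown();
        return uploaded.toString();
    }

    private void waitForPending() {
        if (pendingUpload == null) return;
        try {
            pendingUpload.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

With sequential staging, the user blocks for the full upload of every block; with this scheme the upload of block N overlaps the writing of block N+1, which is the file-upload speedup the issue describes.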

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.