Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2007/10/19 18:54:51 UTC

[jira] Commented: (HADOOP-1700) Append to files in HDFS

    [ https://issues.apache.org/jira/browse/HADOOP-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536263 ] 

Doug Cutting commented on HADOOP-1700:
--------------------------------------

"The Client notices if any Datanodes in the pipeline encountered an error [... ]".

We need a bit more discussion of how this will work.  The typical error will not be a nice message sent back up the pipeline, but rather a timeout that breaks the pipeline.  We must be careful that timeouts do not cascade, perhaps by decreasing the timeout for each stage added to the pipeline.  Removing a node from the pipeline then means building a new pipeline from at least the node before the fracture onwards.  Is this the sort of mechanism you have in mind?
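
A minimal sketch of the staggered-timeout idea, purely illustrative and not DFSClient code: each sender waits a bit longer for its downstream ack than the node after it does, so the node adjacent to the break reports the failure before its upstream neighbours give up.  The class name and constants below are invented for the example.

    // Purely illustrative; not actual DFSClient code or real HDFS configuration.
    public class PipelineTimeoutSketch {

      // Invented values for the example; not real HDFS configuration keys.
      private static final long LAST_STAGE_TIMEOUT_MS = 30000L;
      private static final long PER_STAGE_INCREMENT_MS = 5000L;

      /**
       * Timeout used by the sender at 'position' (0 = client, 1 = first
       * datanode, ...) while waiting for the ack from the next node
       * downstream.  Senders nearer the tail of the pipeline give up sooner,
       * so a break at the tail is detected there first rather than cascading
       * back up as a chain of timeouts.
       */
      static long ackTimeoutMs(int position, int pipelineLength) {
        int hopsDownstream = pipelineLength - position;
        return LAST_STAGE_TIMEOUT_MS + PER_STAGE_INCREMENT_MS * (hopsDownstream - 1);
      }

      public static void main(String[] args) {
        int pipelineLength = 3;  // client -> DN1 -> DN2 -> DN3
        for (int pos = 0; pos < pipelineLength; pos++) {
          System.out.println("sender at position " + pos + " waits "
              + ackTimeoutMs(pos, pipelineLength) + " ms for its downstream ack");
        }
      }
    }

The exact increments matter less than the ordering: as long as each upstream sender waits longer than everything downstream of it, only one timeout fires per failure.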


> Append to files in HDFS
> -----------------------
>
>                 Key: HADOOP-1700
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1700
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: stack
>         Attachments: Appends-1.xhtml, Appends.doc, Appends.htm
>
>
> Request for being able to append to files in HDFS has been raised a couple of times on the list of late.  For one example, see http://www.nabble.com/HDFS%2C-appending-writes-status-tf3848237.html#a10916193.  Other mail describes folks' workarounds because this feature is lacking: e.g. http://www.nabble.com/Loading-data-into-HDFS-tf4200003.html#a12039480 (later on that thread, Jim Kellerman re-raises the HBase need for this feature).  HADOOP-337 'DFS files should be appendable' mentions file append, but it was opened early in the life of HDFS when the focus was more on implementing the basics than on adding new features, and interest fizzled.  Because HADOOP-337 is also a bit of a grab-bag -- it includes truncation and being able to concurrently read/write -- rather than try to breathe new life into it, here is a new issue focused on file append.  Ultimately, being able to do as the Google GFS paper describes -- multiple concurrent clients making 'Atomic Record Append' to a single file -- would be sweet, but for a first cut at this feature, IMO, a single client appending to a single HDFS file, with the application managing access, would be sufficient.
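
For concreteness, a sketch of the single-writer case the description asks for, assuming the feature were exposed as an append(Path) call on FileSystem returning an FSDataOutputStream (an assumed API shape for this example, not something in the current release):

    // Illustrative only: assumes a hypothetical FileSystem#append(Path).
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SingleWriterAppend {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path logFile = new Path("/user/demo/events.log");  // example path

        // One client re-opens the existing file and appends a record; the
        // application, not HDFS, is responsible for serialising writers.
        FSDataOutputStream out = fs.append(logFile);
        try {
          out.write("one more record\n".getBytes("UTF-8"));
          out.flush();
        } finally {
          out.close();  // closing ends the single-writer append session
        }
      }
    }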

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.