Posted to common-dev@hadoop.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2007/01/25 15:51:49 UTC

[jira] Created: (HADOOP-931) Make writes to S3FileSystem world visible only on completion

Make writes to S3FileSystem world visible only on completion
------------------------------------------------------------

                 Key: HADOOP-931
                 URL: https://issues.apache.org/jira/browse/HADOOP-931
             Project: Hadoop
          Issue Type: Bug
          Components: fs
            Reporter: Tom White


Currently, files written to S3 are visible to other processes as soon as the first block has been written. This differs from DFS, which makes files world visible only after the stream writing to the file has closed (see FSNamesystem.completeFile).

We could implement this by having a piece of inode metadata that indicates the visibility of the file.
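
As a rough illustration of the inode-metadata idea, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: VisibilitySketch, Inode and the complete flag are hypothetical names, not the actual S3FileSystem classes, and a real implementation would persist the flag with the inode in S3 rather than in a map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for an inode store; not the real S3FileSystem code. */
class VisibilitySketch {

    /** Hypothetical inode: block ids plus one piece of visibility metadata. */
    static final class Inode {
        final List<Long> blockIds = new ArrayList<>();
        boolean complete = false; // set to true only when the writer closes the stream
    }

    private final Map<String, Inode> inodes = new TreeMap<>();

    /** Start writing: the inode is stored straight away but flagged as incomplete. */
    public synchronized void create(String path) {
        inodes.put(path, new Inode());
    }

    /** Record a written block; assumes create(path) was called first. */
    public synchronized void addBlock(String path, long blockId) {
        inodes.get(path).blockIds.add(blockId);
    }

    /** Closing the stream flips the flag, which makes the file world visible. */
    public synchronized void close(String path) {
        inodes.get(path).complete = true;
    }

    /** Readers only see inodes whose flag says the write has completed. */
    public synchronized boolean exists(String path) {
        Inode inode = inodes.get(path);
        return inode != null && inode.complete;
    }

    /** Listing filters out files that are still being written. */
    public synchronized List<String> listPaths() {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Inode> entry : inodes.entrySet()) {
            if (entry.getValue().complete) {
                visible.add(entry.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        VisibilitySketch fs = new VisibilitySketch();
        fs.create("/data/part-0");
        fs.addBlock("/data/part-0", 1L);
        System.out.println("visible while writing: " + fs.exists("/data/part-0"));
        fs.close("/data/part-0");
        System.out.println("visible after close: " + fs.exists("/data/part-0"));
    }
}
```

With this approach the blocks can still be written to the store as they are produced; only the flag on the inode decides whether other processes see the file.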

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-931) Make writes to S3FileSystem world visible only on completion

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467527 ] 

Andrzej Bialecki  commented on HADOOP-931:
------------------------------------------

While we're at it, Nutch users have often requested that DFS automatically close a partial file if the process writing it exits abruptly. Currently partial files are deleted, even in cases where they would still be usable.


[jira] Commented: (HADOOP-931) Make writes to S3FileSystem world visible only on completion

Posted by "Milind Bhandarkar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467521 ] 

Milind Bhandarkar commented on HADOOP-931:
------------------------------------------

I would prefer that DFS do the right thing, i.e. list the file being created in listPaths, but not allow it to be opened for reading while it is being written.
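
A minimal sketch of that behaviour, under stated assumptions: ListButNoReadSketch, complete and openForRead are hypothetical names chosen for illustration and do not correspond to actual DFS code. The file appears in the listing as soon as it is created, but an attempt to open it for reading fails until the writer has finished.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical namespace that lists files under construction but refuses to open them. */
class ListButNoReadSketch {

    // Maps each path to true while it is still being written, false once complete.
    private final Map<String, Boolean> underConstruction = new LinkedHashMap<>();

    synchronized void create(String path) {
        underConstruction.put(path, Boolean.TRUE);
    }

    /** Marks the file as finished; does nothing if the path is unknown. */
    synchronized void complete(String path) {
        underConstruction.replace(path, Boolean.FALSE);
    }

    /** Lists every file, including those still being written. */
    synchronized List<String> listPaths() {
        return new ArrayList<>(underConstruction.keySet());
    }

    /** Succeeds silently for completed files; throws for missing or unfinished ones. */
    synchronized void openForRead(String path) throws IOException {
        Boolean writing = underConstruction.get(path);
        if (writing == null) {
            throw new FileNotFoundException(path);
        }
        if (writing) {
            throw new IOException("File is still being written: " + path);
        }
    }

    public static void main(String[] args) throws IOException {
        ListButNoReadSketch fs = new ListButNoReadSketch();
        fs.create("/out/part-0");
        System.out.println("listed: " + fs.listPaths());
        try {
            fs.openForRead("/out/part-0");
        } catch (IOException e) {
            System.out.println("refused: " + e.getMessage());
        }
        fs.complete("/out/part-0");
        fs.openForRead("/out/part-0"); // no exception once the write has completed
    }
}
```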


[jira] Commented: (HADOOP-931) Make writes to S3FileSystem world visible only on completion

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467511 ] 

Doug Cutting commented on HADOOP-931:
-------------------------------------

Hmm.  Seeing partial files could be considered a feature.  LocalFileSystem also makes partial files visible, no?  Is this breaking something?  If not, I'd leave things as-is.


[jira] Commented: (HADOOP-931) Make writes to S3FileSystem world visible only on completion

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467519 ] 

Tom White commented on HADOOP-931:
----------------------------------

No, it's not breaking anything that I know of. I wanted to make S3FileSystem consistent with DFS but, as you rightly point out, LocalFileSystem also makes partial files visible.

It would be nice to improve the documentation of FileSystem to make it clearer what the contract permits; this could be combined with creating a set of common unit tests for different implementations. However, this feels like a longer-term goal, so I won't pursue it further at the moment.
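
To make the idea of common tests concrete, here is a rough sketch of one shared check run against two toy implementations. All names (ContractSketch, MiniFileSystem, EagerFs, OnCloseFs) are hypothetical, and the contract it encodes is an assumption: a file must be visible after close, while visibility during the write is left unspecified, so both behaviours discussed in this thread pass.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of one contract check shared by several implementations. */
class ContractSketch {

    /** Hypothetical minimal interface; far smaller than the real FileSystem API. */
    interface MiniFileSystem {
        void create(String path);
        void close(String path);
        boolean exists(String path);
    }

    /** Toy implementation where a file is visible as soon as it is created. */
    static final class EagerFs implements MiniFileSystem {
        private final Set<String> files = new HashSet<>();
        public void create(String path) { files.add(path); }
        public void close(String path) { /* nothing to do: already visible */ }
        public boolean exists(String path) { return files.contains(path); }
    }

    /** Toy implementation where a file becomes visible only once it is closed. */
    static final class OnCloseFs implements MiniFileSystem {
        private final Set<String> open = new HashSet<>();
        private final Set<String> files = new HashSet<>();
        public void create(String path) { open.add(path); }
        public void close(String path) {
            if (open.remove(path)) {
                files.add(path);
            }
        }
        public boolean exists(String path) { return files.contains(path); }
    }

    /** Shared check: true if the implementation satisfies the assumed contract. */
    static boolean visibleAfterClose(MiniFileSystem fs) {
        String path = "/contract/file";
        if (fs.exists(path)) {
            return false; // the file must not exist before it is created
        }
        fs.create(path);
        // Visibility at this point is deliberately not checked:
        // the assumed contract permits either behaviour.
        fs.close(path);
        return fs.exists(path);
    }

    public static void main(String[] args) {
        System.out.println("EagerFs passes: " + visibleAfterClose(new EagerFs()));
        System.out.println("OnCloseFs passes: " + visibleAfterClose(new OnCloseFs()));
    }
}
```

Writing the check against an interface means each new implementation only needs to be plugged in; the assertions themselves document what callers may rely on.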

I'll close this issue.


[jira] Resolved: (HADOOP-931) Make writes to S3FileSystem world visible only on completion

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White resolved HADOOP-931.
------------------------------

    Resolution: Won't Fix

Closing since nothing is broken. Any DFS changes should go in a new issue.
