Posted to common-dev@hadoop.apache.org by "Ahad Rana (JIRA)" <ji...@apache.org> on 2007/09/01 02:19:18 UTC

[jira] Commented: (HADOOP-1783) keyToPath in Jets3tFileSystemStore needs to return absolute path

    [ https://issues.apache.org/jira/browse/HADOOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524238 ] 

Ahad Rana commented on HADOOP-1783:
-----------------------------------

Hi Tom,

I will try to produce some stack traces for you. Ultimately, though, if you look at the DistributedFileSystem implementation of listPaths, it clearly creates fully qualified paths using the DfsPath(DFSFileInfo, FileSystem) constructor. In the s3 implementation, listPaths, as I mentioned, returns sub-paths without the scheme or the bucket name (authority). If the default file system is not s3, the Hadoop library then returns improper results, because it tries to resolve the returned sub-path against the default FileSystem (the scheme is missing from the Path object).
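To illustrate the resolution problem, here is a minimal sketch. It uses java.net.URI as a stand-in for Hadoop's Path so that it runs without Hadoop on the classpath; the class name, method name, host name, and bucket name are all made up for the example, and the real qualification logic in Hadoop may differ in detail.

```java
import java.net.URI;

// Hypothetical stand-in: java.net.URI plays the role of Hadoop's Path here.
final class SchemeResolutionSketch {

    // Resolve a path against the default file system URI, roughly the way a
    // scheme-less path ends up being qualified against the default FileSystem.
    static String qualify(String defaultFs, String path) {
        return URI.create(defaultFs).resolve(URI.create(path)).toString();
    }

    public static void main(String[] args) {
        String defaultFs = "hdfs://namenode:9000/"; // assumed fs.default.name value

        // A sub-path without scheme or bucket silently picks up the default file system.
        System.out.println(qualify(defaultFs, "/user/data/part-0"));

        // A fully qualified path keeps its own scheme and authority.
        System.out.println(qualify(defaultFs, "s3://my-bucket/user/data/part-0"));
    }
}
```

The first call lands on the default file system instead of s3, which is the wrong result I am describing; the second is left untouched.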

I am working on enabling map-reduce functionality for scenarios where both, or at least one, of the file specifications (map input and reduce output) in a map-reduce spec point to the s3 file system. The above-mentioned bug breaks the code in a couple of different places. When I implement keyToPath in Jets3tFileSystemStore as follows, everything works:

private Path keyToPath(String key) {
  return new Path("s3://" + bucket.getName() + key);
}
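
As a rough sketch of the alternative suggested in the issue description (passing the bucket identifier in as a parameter), something along these lines might work. This is not the actual Jets3tFileSystemStore code: it uses java.net.URI instead of Hadoop's Path so that it is self-contained, and the class name, the bucketName parameter, and the handling of keys without a leading slash are my assumptions.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch; java.net.URI stands in for org.apache.hadoop.fs.Path.
final class KeyToPathSketch {

    // Build a fully qualified s3 URI from a bucket name and an object key.
    static URI keyToPath(String bucketName, String key) {
        // Assumption: keys are stored with a leading slash; add one if missing,
        // because a URI with an authority requires an absolute path.
        String absoluteKey = key.startsWith("/") ? key : "/" + key;
        try {
            // scheme = "s3", authority = bucket, path = key, no query or fragment
            return new URI("s3", bucketName, absoluteKey, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build path for key: " + key, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(keyToPath("my-bucket", "/dir/file"));
    }
}
```

Using the multi-argument URI constructor rather than string concatenation means special characters in the key get quoted instead of producing a malformed path.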

Suffice it to say, there are other (performance-related) issues that I am also looking at in order to enable satisfactory use of s3 as a potential input/output for a map-reduce job. But this bug is by far the most critical one.

Sorry about the lack of stack traces. I just need to recreate a proper test environment to get you these, and hopefully I will be able to submit something to you next week. 

Thanks,

Ahad.

> keyToPath in Jets3tFileSystemStore needs to return absolute path
> ----------------------------------------------------------------
>
>                 Key: HADOOP-1783
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1783
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 0.1.0, 0.1.1, 0.2.0, 0.2.1, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0, 0.6.0, 0.6.1, 0.6.2, 0.7.0, 0.7.1, 0.7.2, 0.8.0, 0.9.0, 0.9.1, 0.9.2, 0.10.0, 0.10.1, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.12.1, 0.12.2, 0.12.3, 0.13.0, 0.13.1, 0.14.0
>         Environment: hadoop 0.14.0 running under ec2 with s3 filesystem
>            Reporter: Ahad Rana
>
> The keyToPath method probably needs to:
> 1. take the bucket identifier as a parameter.
> 2. set the returned Path object's protocol plus authority (bucket). Currently, APIs such as listSubPaths return relative paths (for a directory listing). This in turn breaks map reduce operations if the default file system is set to be something other than S3 (via fs.default.name, for example).
>  
