Posted to common-dev@hadoop.apache.org by "Craig Macdonald (JIRA)" <ji...@apache.org> on 2008/03/19 19:40:24 UTC

[jira] Updated: (HADOOP-4) tool to mount dfs on linux

     [ https://issues.apache.org/jira/browse/HADOOP-4?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Craig Macdonald updated HADOOP-4:
---------------------------------

    Attachment: fuse_dfs.c

Hi Pete,

Have you had a chance to look at FUSE readaheads? I have attached a patched version of fuse_dfs.c that reads 10MB chunks from DFS and caches them in a struct held in the file handle.
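
For anyone following along without the attachment, here is a minimal sketch of the kind of chunk cache described above. It is not the code from the attached fuse_dfs.c: the names ra_cache, backend_pread_fn, mem_backend and mem_pread are hypothetical, and the backend callback stands in for whatever positional read the DFS client offers (libhdfs is not called here). The in-memory backend exists only so the sketch can be exercised on its own.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical callback standing in for a positional read against DFS.
 * Returns bytes read, 0 at end of file, or -1 on error. */
typedef ssize_t (*backend_pread_fn)(void *ctx, off_t offset, void *buf, size_t len);

/* Per-file-handle readahead state (hypothetical layout). */
typedef struct {
    char  *buf;               /* the cached chunk */
    size_t capacity;          /* chunk size, e.g. 10MB */
    off_t  start;             /* file offset of buf[0] */
    size_t valid;             /* number of valid bytes in buf */
    backend_pread_fn pread_fn;
    void  *ctx;
} ra_cache;

int ra_cache_init(ra_cache *c, size_t capacity, backend_pread_fn fn, void *ctx)
{
    c->buf = malloc(capacity);
    if (c->buf == NULL)
        return -1;
    c->capacity = capacity;
    c->start = 0;
    c->valid = 0;
    c->pread_fn = fn;
    c->ctx = ctx;
    return 0;
}

void ra_cache_destroy(ra_cache *c)
{
    free(c->buf);
    c->buf = NULL;
    c->valid = 0;
}

/* Serve a read from the cached chunk, refilling it from the backend
 * whenever the requested position falls outside the cached range. */
ssize_t ra_cache_read(ra_cache *c, off_t offset, void *out, size_t len)
{
    size_t done = 0;
    while (done < len) {
        off_t pos = offset + (off_t)done;
        if (c->valid == 0 || pos < c->start || pos >= c->start + (off_t)c->valid) {
            ssize_t n = c->pread_fn(c->ctx, pos, c->buf, c->capacity);
            if (n < 0)
                return done > 0 ? (ssize_t)done : -1;
            c->start = pos;
            c->valid = (size_t)n;
            if (n == 0)
                break;        /* end of file */
        }
        size_t avail = c->valid - (size_t)(pos - c->start);
        size_t take = (len - done < avail) ? (len - done) : avail;
        memcpy((char *)out + done, c->buf + (pos - c->start), take);
        done += take;
    }
    return (ssize_t)done;
}

/* In-memory backend used only to exercise the sketch. */
typedef struct {
    const char *data;
    size_t size;
    int calls;                /* how many times the backend was hit */
} mem_backend;

ssize_t mem_pread(void *ctx, off_t offset, void *buf, size_t len)
{
    mem_backend *m = ctx;
    m->calls++;
    if (offset < 0)
        return -1;
    if ((size_t)offset >= m->size)
        return 0;
    size_t n = m->size - (size_t)offset;
    if (n > len)
        n = len;
    memcpy(buf, m->data + offset, n);
    return (ssize_t)n;
}
```

In a FUSE read handler, a struct like this would presumably hang off the file handle, so that sequential small reads are served from memory and the backend is hit once per chunk. A real version would also need locking if the same handle can be read concurrently.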

I'm seeing some improvement (down to 1m 20s, compared to "bin/hadoop dfs -cat file > /dev/null", which takes about 50 seconds). Increasing the buffer size shows some improvement [I only did some quick tests]. I tried up to 30MB, but I don't think there's much improvement over 5-10MB.

Do you think we're reaching the limit, such that the overheads of JNI are making it impossible to go any faster? I.e., where do we go from here?

Another comment I have is that the configure/makefile asks for a dfs_home. It might be easier to ask for the Hadoop home, then build the appropriate paths from there (${hadoop_home}/libhdfs and ${hadoop_home}/src/c++/libhdfs). Hadoop has no include/linux folders etc. Finally, we need a way to detect whether to use i386 or amd64 to find jvm.so.
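
As a rough illustration of the two suggestions above, the sketch below builds the libhdfs paths from a single Hadoop home and maps the machine name reported by uname() to an i386 or amd64 directory name. The function names (libhdfs_paths, jvm_arch_dir, host_jvm_arch_dir) are hypothetical, and the assumption that the JVM library lives under a directory named after one of those two strings is mine, not something the current configure script does.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/* Build the two libhdfs paths mentioned above from one Hadoop home.
 * Returns 0 on success, -1 if either buffer is too small. */
int libhdfs_paths(const char *hadoop_home,
                  char *lib_dir, size_t lib_len,
                  char *src_dir, size_t src_len)
{
    int a = snprintf(lib_dir, lib_len, "%s/libhdfs", hadoop_home);
    int b = snprintf(src_dir, src_len, "%s/src/c++/libhdfs", hadoop_home);
    if (a < 0 || (size_t)a >= lib_len || b < 0 || (size_t)b >= src_len)
        return -1;
    return 0;
}

/* Map a uname() machine string to the JVM architecture directory name.
 * Returns NULL for machines this sketch does not recognise. */
const char *jvm_arch_dir(const char *machine)
{
    if (strcmp(machine, "x86_64") == 0 || strcmp(machine, "amd64") == 0)
        return "amd64";
    /* i386, i486, i586, i686 */
    if (strlen(machine) == 4 && machine[0] == 'i' && strcmp(machine + 2, "86") == 0)
        return "i386";
    return NULL;
}

/* Same mapping, applied to the machine the program is running on. */
const char *host_jvm_arch_dir(void)
{
    struct utsname u;
    if (uname(&u) != 0)
        return NULL;
    return jvm_arch_dir(u.machine);
}
```

One caveat: uname() reports the kernel's architecture, so a 32-bit JVM installed on a 64-bit kernel would be misdetected. Checking which directory actually exists under the Java home may be the more robust test.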

Craig

> tool to mount dfs on linux
> --------------------------
>
>                 Key: HADOOP-4
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.5.0
>         Environment: linux only
>            Reporter: John Xing
>            Assignee: Pete Wyckoff
>         Attachments: fuse-dfs.tar.gz, fuse-dfs.tar.gz, fuse-dfs.tar.gz, fuse-dfs.tar.gz, fuse-dfs.tar.gz, fuse-hadoop-0.1.0_fuse-j.2.2.3_hadoop.0.5.0.tar.gz, fuse-hadoop-0.1.0_fuse-j.2.4_hadoop.0.5.0.tar.gz, fuse-hadoop-0.1.1.tar.gz, fuse-j-hadoopfs-03.tar.gz, fuse_dfs.c, fuse_dfs.c, fuse_dfs.c, fuse_dfs.c, fuse_dfs.c, fuse_dfs.sh, Makefile
>
>
> tool to mount dfs on linux

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.