Posted to common-commits@hadoop.apache.org by zs...@apache.org on 2008/09/16 01:47:31 UTC

svn commit: r695689 - /hadoop/core/trunk/src/contrib/fuse-dfs/README

Author: zshao
Date: Mon Sep 15 16:47:31 2008
New Revision: 695689

URL: http://svn.apache.org/viewvc?rev=695689&view=rev
Log:
HADOOP-4076.  fuse-dfs README updated.

Modified:
    hadoop/core/trunk/src/contrib/fuse-dfs/README

Modified: hadoop/core/trunk/src/contrib/fuse-dfs/README
URL: http://svn.apache.org/viewvc/hadoop/core/trunk/src/contrib/fuse-dfs/README?rev=695689&r1=695688&r2=695689&view=diff
==============================================================================
--- hadoop/core/trunk/src/contrib/fuse-dfs/README (original)
+++ hadoop/core/trunk/src/contrib/fuse-dfs/README Mon Sep 15 16:47:31 2008
@@ -8,71 +8,64 @@
 # implied.  See the License for the specific language governing
 # permissions and limitations under the License.
 
-This is a FUSE module for Hadoop's HDFS.
+Fuse-DFS
 
-It allows one to mount HDFS as a Unix filesystem and optionally export
-that mount point to other machines.
+Supports reads, writes, and directory operations (e.g., cp, ls, more, cat, find, less, rm, mkdir, mv, rmdir).  Things like touch, chmod, chown, and permissions are in the works. Fuse-dfs currently shows all files as owned by nobody.
 
-cp, write, rmdir, mv, mkdir, rm are all supported. But permissions are not.
+Contributing
 
-BUILDING:
+It's pretty straightforward to add functionality to fuse-dfs, as fuse makes things relatively simple. Some tasks also require augmenting libhdfs to expose more hdfs functionality to C. See [http://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&pid=12310240&sorter/order=DESC&sorter/field=priority&resolution=-1&component=12312376  contrib/fuse-dfs JIRAs]
 
-Requirements:
+Requirements
 
-   1. a Linux kernel > 2.6.9 or a kernel module from FUSE - i.e., you
-   compile it yourself and then modprobe it. Better off with the
-   former option if possible.  (Note for now if you use the kernel
-   with fuse included, it doesn't allow you to export this through NFS
-   so be warned. See the FUSE email list for more about this.)
+ * Hadoop with compiled libhdfs.so
+ * Linux kernel > 2.6.9 with fuse built in (the default), or Fuse 2.7.x/2.8.x installed. See: [http://fuse.sourceforge.net/]
+ * `modprobe fuse` to load the kernel module
+ * fuse-dfs executable (see below)
+ * fuse_dfs_wrapper.sh installed in /bin or other appropriate location (see below)
 
-   2. FUSE should be installed in /usr/local or FUSE_HOME ant
-   environment variable
 
-To build:
-
-   1. in HADOOP_HOME: ant compile-contrib -Dcompile.c++=1 -Dfusedfs=1
+BUILDING
 
+   1. in HADOOP_HOME: `ant compile-libhdfs -Dlibhdfs=1`
+   2. in HADOOP_HOME: `ant package` to deploy libhdfs
+   3. in HADOOP_HOME: `ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1`
 
 NOTE: for amd64 architecture, libhdfs will not compile unless you edit
 the Makefile in src/c++/libhdfs/Makefile and set OS_ARCH=amd64
-(probably the same for others too).
+(probably the same for others too). See [https://issues.apache.org/jira/browse/HADOOP-3344 HADOOP-3344]
+
+Common build problems include not finding the libjvm.so in JAVA_HOME/jre/lib/OS_ARCH/server or not finding fuse in FUSE_HOME or /usr/local.
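A quick sanity check for both (the paths are assumptions for a typical Linux JDK layout; adjust JAVA_HOME and FUSE_HOME for your machine):

```shell
# Map uname's arch name to Java's directory name (x86_64 -> amd64).
OS_ARCH=$(uname -m | sed -e 's/x86_64/amd64/' -e 's/i686/i386/')

# libhdfs needs to link against libjvm.so at build time:
ls "$JAVA_HOME/jre/lib/$OS_ARCH/server/libjvm.so" 2>/dev/null \
  || echo "libjvm.so not found -- check JAVA_HOME and OS_ARCH"

# fuse is searched for in FUSE_HOME, falling back to /usr/local:
ls "${FUSE_HOME:-/usr/local}/include/fuse.h" 2>/dev/null \
  || echo "fuse.h not found -- check FUSE_HOME"
```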
 
---------------------------------------------------------------------------------
 
-CONFIGURING:
+CONFIGURING
 
-Look at all the paths in fuse_dfs_wrapper.sh and either correct them
-or set them in your environment before running. (note for automount
-and mount as root, you probably cannnot control the environment, so
-best to set them in the wrapper)
+Look at all the paths in fuse_dfs_wrapper.sh and either correct them or set them in your environment before running. (note for automount and mount as root, you probably cannot control the environment, so best to set them in the wrapper)
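As a sketch only (every path below is an assumption; correct them for your installation), the wrapper typically ends up setting something like:

```shell
# Hypothetical locations -- fix these for your machine.
export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/usr/local/hadoop
export OS_ARCH=amd64                      # i386 on 32-bit machines

# The shared libraries fuse_dfs needs at run time:
export LD_LIBRARY_PATH=$HADOOP_HOME/build/libhdfs:$JAVA_HOME/jre/lib/$OS_ARCH/server:/usr/local/lib

# CLASSPATH must include hadoop and its dependency jars:
for jar in "$HADOOP_HOME"/*.jar "$HADOOP_HOME"/lib/*.jar; do
  CLASSPATH=$CLASSPATH:$jar
done
export CLASSPATH
```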
 
-INSTALLING:
+INSTALLING
 
-1. mkdir /mnt/dfs (or wherever you want to mount it)
+1. `mkdir /export/hdfs` (or wherever you want to mount it)
 
-2. fuse_dfs_wrapper.sh dfs://hadoop_server1.foo.com:9000 /mnt/dfs -d
-; and from another terminal, try ls /mnt/dfs
+2. `fuse_dfs_wrapper.sh dfs://hadoop_server1.foo.com:9000 /export/hdfs -d` and from another terminal, try `ls /export/hdfs`
 
 If step 2 works, try again without the debug flag (i.e., drop -d)
 
-(note - common problems are that you don't have libhdfs.so or
-libjvm.so or libfuse.so on your LD_LIBRARY_PATH, and your CLASSPATH
-does not contain hadoop and other required jars.)
+(note - common problems are that you don't have libhdfs.so or libjvm.so or libfuse.so on your LD_LIBRARY_PATH, and your CLASSPATH does not contain hadoop and other required jars.)
 
---------------------------------------------------------------------------------
+Also note, fuse-dfs writes error/warn messages to the syslog, typically /var/log/messages.
 
+You can use fuse-dfs to mount multiple hdfs instances by just changing the server/port name and directory mount point above.
 
-DEPLOYING:
+DEPLOYING
 
 in a root shell do the following:
 
-1. add the following to /etc/fstab -
-  fuse_dfs#dfs://hadoop_server.foo.com:9000 /mnt/dfs fuse
-  -oallow_other,rw,-ousetrash 0 0
-
-2. mount /mnt/dfs Expect problems with not finding fuse_dfs. You will
-   need to probably add this to /sbin and then problems finding the
-   above 3 libraries. Add these using ldconfig.
+1. add the following to /etc/fstab
+
+fuse_dfs#dfs://hadoop_server.foo.com:9000 /export/hdfs fuse -oallow_other,rw,-ousetrash 0 0
+
+
+2. Mount using: `mount /export/hdfs`. If fuse_dfs is not found, you will probably need to copy it to /sbin; if the 3 libraries above are not found, add their directories using ldconfig.
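The two fixes mentioned in step 2 can be sketched as follows (the library paths are assumptions carried over from the build steps above):

```shell
# Put the executable where mount can find it:
cp "$HADOOP_HOME"/contrib/fuse-dfs/fuse_dfs /sbin/

# Make libhdfs.so and libjvm.so visible to the dynamic loader:
echo "$HADOOP_HOME/build/libhdfs"         >  /etc/ld.so.conf.d/fuse-dfs.conf
echo "$JAVA_HOME/jre/lib/$OS_ARCH/server" >> /etc/ld.so.conf.d/fuse-dfs.conf
ldconfig

mount /export/hdfs
```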
 
 
 Fuse DFS takes the following mount options (i.e., on the command line or in the comma-separated list of options in /etc/fstab):
@@ -100,40 +93,32 @@
 notrash
 private = 0
 
---------------------------------------------------------------------------------
-
-
-EXPORTING:
+EXPORTING
 
 Add the following to /etc/exports:
 
-  /mnt/hdfs *.foo.com(no_root_squash,rw,fsid=1,sync)
+/export/hdfs *.foo.com(no_root_squash,rw,fsid=1,sync)
 
 NOTE - you cannot export this with a FUSE module built into the kernel
 - e.g., kernel 2.6.17. For info on this, refer to the FUSE wiki.
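To activate the export and mount it from another machine, the usual NFS steps apply (the client hostname and mount point here are made up for illustration):

```shell
# On the machine running fuse-dfs, after editing /etc/exports:
exportfs -ra

# On an NFS client in *.foo.com:
mkdir -p /mnt/hdfs
mount nfs_server.foo.com:/export/hdfs /mnt/hdfs
```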
---------------------------------------------------------------------------------
 
-ADVANCED:
 
-you may want to ensure certain directories cannot be deleted from the
-shell until the FS has permissions. You can set this in the build.xml
-file in src/contrib/fuse-dfs/build.xml
+RECOMMENDATIONS
 
---------------------------------------------------------------------------------
+1. From /bin, `ln -s $HADOOP_HOME/contrib/fuse-dfs/fuse_dfs* .`
 
-RECOMMENDATIONS:
-
-1. From /bin, ln -s $HADOOP_HOME/contrib/fuse-dfs/fuse_dfs* .
 2. Always start with debug on so you can see if you are missing a classpath or something like that.
+
 3. use -obig_writes
 
---------------------------------------------------------------------------------
 
-PERFORMANCE:
+KNOWN ISSUES 
 
-1. if you alias ls to ls --color=auto and try listing a directory with lots (over thousands) of files, expect it to be slow and at 10s of thousands, expect it to be very very slow.  This is because --color=auto causes ls to stat every file in the directory. Since fuse-dfs does not cache attribute entries when doing a readdir, this is very slow. see https://issues.apache.org/jira/browse/HADOOP-3797 
+1. If you alias `ls` to `ls --color=auto` and list a directory with thousands of files, expect it to be slow; at tens of thousands, expect it to be very, very slow. This is because `--color=auto` causes ls to stat every file in the directory, and fuse-dfs does not cache attribute entries when doing a readdir. See [https://issues.apache.org/jira/browse/HADOOP-3797 HADOOP-3797]
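A simple workaround, if your shell has such an alias, is to bypass it when listing large DFS directories (/export/hdfs is the mount point used in the examples above):

```shell
# The leading backslash suppresses the alias, so ls is run without
# --color=auto and never stats each file in the directory.
\ls /export/hdfs
```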
 
-2. Writes are approximately 33% slower than the DFSClient. TBD how to optimize this. see: https://issues.apache.org/jira/browse/HADOOP-3805 - try using -obig_writes and if on a >2.6.26 kernel, should perform much better since bigger writes implies less context switching.
+2. Writes are approximately 33% slower than the DFSClient. TBD how to optimize this. See [https://issues.apache.org/jira/browse/HADOOP-3805 HADOOP-3805]. Try using -obig_writes; on a >2.6.26 kernel it should perform much better, since bigger writes imply less context switching.
 
 3. Reads are ~20-30% slower even with the read buffering. 
 
+4. fuse-dfs and the underlying libhdfs have no support for permissions. See [https://issues.apache.org/jira/browse/HADOOP-3536 HADOOP-3536]