Posted to general@hadoop.apache.org by Roshan Karki <ro...@nepasoft.com.np> on 2009/11/18 10:54:46 UTC
FW: how to run the mapreduce job from local filesystem but in hadoop cluster (fully distributed mode)
From: Roshan Karki
Sent: Wednesday, November 18, 2009 3:34 PM
To: 'common-user@hadoop.apache.org'
Subject: how to run the mapreduce job from local filesystem but in hadoop
cluster (fully distributed mode)
Hi,
I have set up Hadoop in fully distributed mode (a two-node cluster), with
HDFS as the default file system. The MapReduce job works fine when I give
the input file from an HDFS location, and the output is also generated in
HDFS when running the WordCount example. My hadoop-site.xml looks like
this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://server1.example.com</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>server1.example.com:9001</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/admin/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/admin/hadoop/dfs/name</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hadoop/mapred/system</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
The problem is that I don't want to give an HDFS location for the file
input; I want the output generated on the local file system, with the
input file also coming from local disk, while keeping the cluster in
fully distributed mode (keeping the Hadoop cluster alive).
Setting the default file system to anything other than HDFS prevents the
cluster from running, so is there any way to achieve this?
I also have an NFS server running. All the nodes in the Hadoop cluster
are also NFS clients, so they can share the exported /home/admin
directory. Does NFS help in this regard?
Do I have to change my hadoop-site.xml? If so, what would it look like?
Any help will be heartily appreciated.
I am looking forward to your kind response.
Re: FW: how to run the mapreduce job from local filesystem but in
hadoop cluster (fully distributed mode)
Posted by Steve Loughran <st...@apache.org>.
Roshan Karki wrote:
>
> The problem is that I don't want to give hdfs location for fileinput and
> want to generated output in localfilesystem also giving the inputfile
> from local but I want to do this in hadoop fully distributed mode
> configuration(keeping the hadoop cluster alive).
>
Either upload the data files to the HDFS filestore before the run, or
use file: URLs and make sure that your files are visible on every task
tracker by mounting the same network drive on the same path, such as
file:/mount/nfs/server1/work/data
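For readers following the archive, the two options above might be sketched as commands like the following. This is only an illustration: the examples jar name and all of the paths are assumptions, not taken from the thread, and the file: variant only works if every tasktracker sees the same path (e.g. via the NFS mount the original poster describes).

```shell
# Option 1: stage local input into HDFS before the run,
# then pull the output back to the local file system afterwards.
hadoop fs -put /home/admin/input /user/admin/input
hadoop jar hadoop-examples.jar wordcount /user/admin/input /user/admin/output
hadoop fs -get /user/admin/output /home/admin/output

# Option 2: keep the default file system as HDFS, but point the job's
# input and output at file: URLs on a path mounted identically on every
# node (here, the shared NFS export).
hadoop jar hadoop-examples.jar wordcount \
    file:///home/admin/input file:///home/admin/output
```

With option 2 the cluster configuration (and hadoop-site.xml) stays untouched; only the per-job paths change, which is what keeps the cluster alive while reading and writing local/NFS data.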