Posted to hdfs-user@hadoop.apache.org by rab ra <ra...@gmail.com> on 2013/08/21 14:21:10 UTC
running map task in remote node
Hello,
Here is the newbie question of the day.
For one of my use cases, I want to use Hadoop MapReduce without HDFS.
Here, I will have a text file containing a list of file names to process.
Assume that I have 10 lines (10 files to process) in the input text file
and I wish to generate 10 map tasks and execute them in parallel on 10
nodes. I started with the basic Hadoop tutorial, set up a single-node
Hadoop cluster, and successfully tested the wordcount code.
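To make the intent concrete, here is a minimal plain-Java sketch of the behaviour I am after: one task per line of the input list, with the tasks running in parallel. It does not use Hadoop at all. The class name OneTaskPerLine, the method names, and the "processed:" result are hypothetical placeholders for illustration, and a local thread pool merely stands in for the 10 nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OneTaskPerLine {

    // Hypothetical stand-in for the work a single map task would do on one file.
    static String processFile(String fileName) {
        return "processed:" + fileName;
    }

    // Creates one task per non-empty line and runs the tasks on a pool of
    // `workers` threads (the threads play the role of cluster nodes here).
    // Results are returned in the same order as the input lines.
    static List<String> runOneTaskPerLine(List<String> lines, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String line : lines) {
                final String name = line.trim();
                if (name.isEmpty()) {
                    continue; // skip blank lines in the list file
                }
                Callable<String> task = () -> processFile(name);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // wait for each task to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed input: 10 file names, one per line of the list file.
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            lines.add("file" + i + ".dat");
        }
        for (String result : runOneTaskPerLine(lines, 10)) {
            System.out.println(result);
        }
    }
}
```

In the real job, each line would become the input of one map task and each task would ideally be scheduled on a different node.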
Now I have taken two machines, A (master) and B (slave), and applied the
configuration below on both of them to set up a two-node cluster.
hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/tmp/hadoop-bala/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/tmp/hadoop-bala/dfs/data</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>A:9001</value>
  </property>
</configuration>
mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>A:9001</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://A:9000</value>
  </property>
</configuration>
On both A and B, I have a file named ‘slaves’ containing the entry ‘B’ and
another file named ‘masters’ containing the entry ‘A’.
I have kept my input file on A. I see the map method process the input file
line by line, but all the lines are processed on A. Ideally, I would expect
that processing to take place on B.
Can anyone point out where I am going wrong?
regards
rab