Posted to mapreduce-user@hadoop.apache.org by venkata subbarayudu <av...@gmail.com> on 2009/11/05 13:36:13 UTC

Re: Hadoop : Too many fetch failures -- Reducer doesn't start

> Hi All,
>
> I have set up a single-node Hadoop cluster (hadoop version 0.20.0) on
> localhost by following the instructions from
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)
> (i.e. the master and slave nodes are both 'localhost').
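>
> For reference, the core settings from that tutorial look roughly like the
> sketch below (a minimal pseudo-distributed configuration; the port numbers
> are the ones the tutorial suggests -- adjust them to your installation):
>
>            *core-site.xml*
>
>            <property>
>              <name>fs.default.name</name>
>              <value>hdfs://localhost:54310</value>
>            </property>
>
>            *mapred-site.xml*
>
>            <property>
>              <name>mapred.job.tracker</name>
>              <value>localhost:54311</value>
>            </property>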
>
> I was able to run MapReduce jobs with no problems on that standalone
> system, and now I am trying to set up the same thing on a different system
> that has two IP addresses. The new setup looks fine (all the Hadoop
> processes -- datanode, namenode, secondarynamenode, tasktracker,
> jobtracker -- started), and I am able to run Hadoop jobs that have only
> Mapper tasks. But for jobs that have both Map and Reduce tasks, the Map
> tasks fail with '*Too many fetch failures*'.
>
>
> Can somebody please give some insights on how this problem can be resolved?
>
> ---------------------------------------------------------------------------------------------------------------------------------------------------
> I think the reason for this error is that the Reducer task is somehow not
> able to read the output of the Mapper tasks, and that this communication
> relies on the host name resolved through the following properties in the
> Hadoop site configuration [please see the config snippets below].
>
>            *hdfs-site.xml*
>
>            <property>
>              <name>dfs.datanode.dns.interface</name>
>              <value>default</value>
>              <description>The name of the Network Interface from which
>                a data node should report its IP address.</description>
>            </property>
>
>            <property>
>              <name>dfs.datanode.dns.nameserver</name>
>              <value>default</value>
>              <description>The host name or IP address of the name server (DNS)
>                which a DataNode should use to determine the host name used by
>                the NameNode for communication and display purposes.
>              </description>
>            </property>
>
>         *mapred-site.xml*
>
>         <property>
>           <name>mapred.tasktracker.dns.interface</name>
>           <value>default</value>
>           <description>The name of the Network Interface from which
>             a task tracker should report its IP address.</description>
>         </property>
>
>         <property>
>           <name>mapred.tasktracker.dns.nameserver</name>
>           <value>default</value>
>           <description>The host name or IP address of the name server (DNS)
>             which a TaskTracker should use to determine the host name used by
>             the JobTracker for communication and display purposes.
>           </description>
>         </property>
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Our server has 2 IP addresses, and the hadoop user can only access the
> system via the 2nd IP (IP2). I believe that, because the value specified
> above is "default", Hadoop picks up the localhost host name and uses the
> corresponding address (IP1) for further communication. Since the hadoop
> user is not allowed to access the system via IP1, the communication fails
> and the Map/Reduce tasks cannot report or read their data, which causes
> Hadoop to throw 'Too many fetch failures'.
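>
> If this explanation is on the right track, one fix I am considering is to
> stop relying on "default" and point these properties at the network
> interface that carries IP2. A minimal sketch, assuming that interface is
> eth1 (a placeholder -- substitute the actual interface name on our server),
> with the interface properties overridden in hdfs-site.xml and
> mapred-site.xml respectively:
>
>            <!-- hdfs-site.xml: report the address of the reachable interface -->
>            <property>
>              <name>dfs.datanode.dns.interface</name>
>              <value>eth1</value>
>            </property>
>
>            <!-- mapred-site.xml: likewise for the TaskTracker -->
>            <property>
>              <name>mapred.tasktracker.dns.interface</name>
>              <value>eth1</value>
>            </property>
>
> Would that be the right way to force Hadoop to use IP2, or is there a
> better setting for this?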
>
> Please correct me if the above explanation is incorrect; a quick reply is
> much appreciated.
>
> Thanks,
> Rayudu.
>