Posted to general@hadoop.apache.org by venkata subbarayudu <av...@gmail.com> on 2009/11/05 13:27:57 UTC

Hadoop : Too many fetch failures -- Reducer doesn't start

Hi All
- I have set up a single-node Hadoop cluster (hadoop-0.20.0) on localhost by
following the instructions at
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)
(i.e. the master and slave nodes are both 'localhost').

I was able to run MapReduce jobs with no problems on that standalone system,
and now I am trying to set up the same thing on a different system. This
system has two IP addresses. The new setup looks fine (all the Hadoop
processes, i.e. datanode, namenode, secondarynamenode, tasktracker and
jobtracker, start up), and I am able to run Hadoop jobs that have only mapper
tasks. But for jobs that have both map and reduce tasks, the map tasks fail
with a '*Too many fetch failures*' error.


Can somebody please give some insight into how this problem can be resolved?
---------------------------------------------------------------------------------------------------------------------------------------------------
I think the reason for this exception is that the reducer task is somehow not
able to read the output of the map tasks, and that this communication relies
on the hostname/DNS settings given by the following properties in the Hadoop
configuration [please see the config snippets below].

           *hdfs-site.xml*

           <property>
               <name>dfs.datanode.dns.interface</name>
               <value>default</value>
               <description>The name of the Network Interface from which a data node
               should report its IP address.</description>
           </property>

           <property>
               <name>dfs.datanode.dns.nameserver</name>
               <value>default</value>
               <description>The host name or IP address of the name server (DNS)
               which a DataNode should use to determine the host name used by the
               NameNode for communication and display purposes.</description>
           </property>

           *mapred-site.xml*

           <property>
               <name>mapred.tasktracker.dns.interface</name>
               <value>default</value>
               <description>The name of the Network Interface from which a
               task tracker should report its IP address.</description>
           </property>

           <property>
               <name>mapred.tasktracker.dns.nameserver</name>
               <value>default</value>
               <description>The host name or IP address of the name server (DNS)
               which a TaskTracker should use to determine the host name used by
               the JobTracker for communication and display purposes.</description>
           </property>
-----------------------------------------------------------------------------------------------------------------------------------------------------------
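
For reference, a quick way to check which host name and address the machine
resolves to by default (i.e. what the 'default' values above end up picking)
is something like the following; the exact output is left out here:

           hostname            # short host name
           hostname -f         # fully qualified host name
           host `hostname`     # DNS lookup (if the 'host' utility is installed)
           cat /etc/hosts      # local hostname-to-IP mappings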

Our server has two IP addresses, and the hadoop user can only reach the
machine via the second IP (IP2). I believe that, because the values above are
"default", Hadoop picks up the localhost hostname, which resolves to the first
IP (IP1), and uses that for further communication. Since the hadoop user is
not allowed to access the system via IP1, this communication fails, the
map/reduce tasks cannot report/read the HDFS data, and Hadoop ends up throwing
the 'Too many fetch failures' error.
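
If that is what is happening, one possible fix (only a sketch; 'eth1' below is
just a placeholder for whichever interface carries the reachable second IP)
would be to map the hostname to the second IP in /etc/hosts, or to pin the
interface explicitly instead of 'default', e.g.:

           *hdfs-site.xml / mapred-site.xml*

           <property>
               <name>dfs.datanode.dns.interface</name>
               <value>eth1</value>
           </property>

           <property>
               <name>mapred.tasktracker.dns.interface</name>
               <value>eth1</value>
           </property>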

Please correct me if the above explanation is incorrect; a quick reply would
be much appreciated.

Thanks,
Rayudu.
