Posted to hdfs-user@hadoop.apache.org by Geoffry Roberts <ge...@gmail.com> on 2009/09/10 01:07:06 UTC

hadoop 0.20.0 jobtracker.info could only be replicated to 0 nodes

All,

If I'm on the wrong list, please redirect me.

I'm setting up my first full Hadoop cluster.  I did the Cygwin setup and
everything worked there; it's the cluster I'm having problems with.

The cluster is five nodes of matched hardware running Ubuntu 8.04.  I
believe I have ssh working properly. The master node is named hbase1, but
I'm not doing anything with hbase.

I run start-dfs.sh; Jps shows NameNode running, and the logs are free of
errors.  The datanodes, however, keep complaining: "Retrying
connect to server: hbase1"
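(On Ubuntu, this symptom is often caused by the installer mapping the hostname to 127.0.1.1 in /etc/hosts, which makes the master bind only to loopback so remote datanodes can never reach it.  A minimal check, sketched against a sample hosts file -- the helper name is mine, and on a real node you would point it at /etc/hosts:

```shell
# Hypothetical helper: does the hosts file map this hostname to the
# Ubuntu-style 127.0.1.1 loopback entry?
is_loopback_mapped() {
  grep -qE "^127\.0\.1\.1[[:space:]].*\b$1\b" "$2"
}

# Demonstrated on a sample file; on a real node check /etc/hosts itself.
printf '127.0.0.1 localhost\n127.0.1.1 hbase1\n' > /tmp/hosts.sample
if is_loopback_mapped hbase1 /tmp/hosts.sample; then
  echo "hbase1 resolves to loopback; remote datanodes cannot connect"
fi
```

If the 127.0.1.1 line shows up on the master, replacing it with the machine's real LAN address and restarting the daemons is the usual fix.)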

I run start-mapred.sh, Jps shows NameNode and JobTracker running.

The namenode log says, "jobtracker.info could only be replicated to 0 nodes,
instead of 1".

The jobtracker log says two things of significance:

1. "It might be because the JobTracker failed to read/write system files
(hdfs://hbase1:30000/hdfs/mapred/system/jobtracker.info /
hdfs://hbase1:30000/hdfs/mapred/system/jobtracker.info.recover) or the
system  file hdfs://hbase1:30000/hdfs/mapred/system/jobtracker.info is
missing!"

How can this be? If I do bin/hadoop fs -lsr /, jobtracker.info shows up.

2.  "Problem binding to hbase1/127.0.1.1:30001 : Address already in use"

How can this be in use by anything other than the JobTracker?  I have nothing
else on the machine that could be conflicting, and I have verified that no
previous Hadoop processes are hanging around.
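(A quick way to see whether anything already holds the JobTracker port before start-mapred.sh runs is to check the listening sockets.  A small sketch -- the port 30001 comes from mapred-site.xml below; the helper name is mine, and it tries netstat first and falls back to ss:

```shell
# Hypothetical helper: is anything already listening on the given TCP port?
port_in_use() {
  { netstat -tln 2>/dev/null || ss -tln 2>/dev/null; } | grep -q ":$1 "
}

if port_in_use 30001; then
  echo "port 30001 is taken"    # find the owner with e.g. lsof -i :30001
else
  echo "port 30001 is free"
fi
```

Also note that the bind error itself shows hbase1 resolving to 127.0.1.1, a loopback address -- so the hbase1 entry in /etc/hosts is worth checking too.)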

I have included my config files just in case.

Many thanks for any help.

****************
core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hbase1:30000</value>
  </property>
</configuration>

*****************
hdfs-site.xml

<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/hdfs/name1</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hdfs/data1</value>
  </property>
  <property>
   <name>fs.checkpoint.dir</name>
   <value>/hdfs/check1</value>
 </property>
 <property>
    <name>dfs.replication</name>
    <value>5</value>
 </property>
</configuration>

***********************
mapred-site.xml

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hbase1:30001</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/hdfs/mapred/system</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/hdfs/mapred/local</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks</name>
    <value>35</value>
  </property>
</configuration>

Re: hadoop 0.20.0 jobtracker.info could only be replicated to 0 nodes

Posted by Bill Yu <yu...@gmail.com>.
You can try deleting the filesystem and temporary directories, and
then reformatting the namenode.
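(Spelled out as a sketch, with the directory values taken from the poster's hdfs-site.xml -- the reset_hdfs name is mine, and this destroys all HDFS data, so it is only sensible on a fresh cluster:

```shell
# Hypothetical helper wrapping the suggested reset.  DESTROYS all HDFS data.
reset_hdfs() {
  local name_dir=$1 data_dir=$2 check_dir=$3
  rm -rf "$name_dir" "$data_dir" "$check_dir"
  rm -rf /tmp/hadoop-*              # default hadoop.tmp.dir leftovers
  hadoop namenode -format
}

# Stop the daemons first (stop-all.sh), then on the master:
# reset_hdfs /hdfs/name1 /hdfs/data1 /hdfs/check1
# and clear the data/tmp directories on every datanode as well.
```
)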

Re: hadoop 0.20.0 jobtracker.info could only be replicated to 0 nodes

Posted by 喻旭日 <xu...@gmail.com>.
I got the same error.  I tried formatting the namenode again and changing the
port in <mapred.job.tracker>, but the error is still there.






-- 
Best Regards,
Michael Yu