Posted to common-user@hadoop.apache.org by Rohit Pandey <ro...@gmail.com> on 2012/05/27 19:21:05 UTC

Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)

Hello Hadoop community,

I have been trying to set up a two-node Hadoop cluster (following
the instructions in -
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/)
and am very close to running it apart from one small glitch - when I
start the dfs (using start-dfs.sh), it says:

10.63.88.53: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-pandro51-datanode-ubuntu.out
10.63.88.109: starting datanode, logging to
/usr/local/hadoop/bin/../logs/hadoop-pandro51-datanode-pandro51-OptiPlex-960.out
10.63.88.109: starting secondarynamenode, logging to
/usr/local/hadoop/bin/../logs/hadoop-pandro51-secondarynamenode-pandro51-OptiPlex-960.out
starting jobtracker, logging to
/usr/local/hadoop/bin/../logs/hadoop-pandro51-jobtracker-pandro51-OptiPlex-960.out
10.63.88.109: starting tasktracker, logging to
/usr/local/hadoop/bin/../logs/hadoop-pandro51-tasktracker-pandro51-OptiPlex-960.out
10.63.88.53: starting tasktracker, logging to
/usr/local/hadoop/bin/../logs/hadoop-pandro51-tasktracker-ubuntu.out

which looks like it's been successful in starting all the nodes.
However, when I check them out by running 'jps', this is what I see:
27531 SecondaryNameNode
27879 Jps

As you can see, there is no datanode and no namenode. I have been
racking my brains at this for quite a while now, and have checked all the
inputs and everything. Does anyone know what the problem might be?

-- 

Thanks in advance,

Rohit

Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)

Posted by sandeep <sa...@gmail.com>.
Can you check the logs for the NN and DN?

Sent from my iPhone

On May 27, 2012, at 1:21 PM, Rohit Pandey <ro...@gmail.com> wrote:

> [original message quoted above, snipped]

Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)

Posted by samir das mohapatra <sa...@gmail.com>.
*Step-wise details (Ubuntu 10.x): go through them carefully and run one
by one; it should solve your problem. (You can change the paths, IPs, and host
names as you like.)*
---------------------------------------------------------------------------------------------------------
1. Start the terminal

2. Disable ipv6 on all machines
pico /etc/sysctl.conf

3. Add these lines at the EOF:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Then reboot the system:
sudo reboot

4. Install java
sudo apt-get install openjdk-6-jdk openjdk-6-jre

5. Check if ssh is installed, if not do so:
sudo apt-get install openssh-server openssh-client

6. Create a group and user called hadoop
sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop

7. Assign all the permissions to the hadoop user
sudo visudo
Add the following line in the file:
hadoop ALL=(ALL) ALL

8. Check that the hadoop user has passwordless ssh set up
su hadoop
ssh-keygen -t rsa -P ""
Press Enter when asked.
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
Copy the server's RSA public key from the server into the
authorized_keys file on all nodes, as shown in the step above.

9. Make the hadoop installation directory:
sudo mkdir /usr/local/hadoop


10. Download and install hadoop:
cd /usr/local/hadoop
sudo wget -c http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u2.tar.gz


11. Unzip the tar
sudo tar -zxvf /usr/local/hadoop/hadoop-0.20.2-cdh3u2.tar.gz

12. Change permissions on hadoop folder by granting all to hadoop
sudo chown -R hadoop:hadoop /usr/local/hadoop
sudo chmod 750 -R /usr/local/hadoop

13. Create the HDFS directory
sudo mkdir hadoop-datastore // inside the usr local hadoop folder
sudo mkdir hadoop-datastore/hadoop-hadoop

14. Add the binaries path and hadoop home in the environment file
sudo pico /etc/environment
// set the bin path as well as the hadoop home path
source /etc/environment
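For illustration only (the exact values depend on where you unpacked the tar),
the lines added to /etc/environment might look like this, assuming the
cdh3u2 install path from the steps above:

```
HADOOP_HOME="/usr/local/hadoop/hadoop-0.20.2-cdh3u2"
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/hadoop/hadoop-0.20.2-cdh3u2/bin"
```

Note that /etc/environment takes plain KEY="value" lines; it is not a shell
script, so don't rely on variable expansion there.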

15. Configure the hadoop env.sh file

cd /usr/local/hadoop/hadoop-0.20.2-cdh3u2/

sudo pico conf/hadoop-env.sh
//add the following line in there:
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export JAVA_HOME="/usr/lib/jvm/java-6-openjdk"


16. Configuring the core-site.xml

<?xml version="1.0"?> <?xml-stylesheet type="text/xsl"
href="configuration.xsl"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/hadoop-datastore/hadoop-${user.name}</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://<IP of namenode>:54310</value>
<description>Location of the Namenode</description>
</property>
</configuration>

17. Configuring the hdfs-site.xml
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl"
href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.</description>
</property>
</configuration>

18. Configuring the mapred-site.xml
<?xml version="1.0"?> <?xml-stylesheet type="text/xsl"
href="configuration.xsl"?>
<configuration>
<property>
<name>mapred.job.tracker</name>
<value><IP of job tracker>:54311</value>
<description>Host and port of the jobtracker.
</description>
</property>
</configuration>

19. Add all the IP addresses in the conf/slaves file
sudo pico /usr/local/hadoop/hadoop-0.20.2-cdh3u2/conf/slaves
 Add the list of IP addresses that will host data nodes, in this file
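For example, with the two machines visible in the start-dfs.sh output earlier
in this thread (IPs taken from there), conf/slaves would contain one address
per line:

```
10.63.88.109
10.63.88.53
```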

---------------------------------------------------------------------------------------------------------------------------------------------

*Hadoop Commands: Now restart the hadoop cluster*
start-all.sh/stop-all.sh
start-dfs.sh/stop-dfs.sh
start-mapred.sh/stop-mapred.sh
hadoop dfs -ls /<virtual dfs path>
hadoop dfs -copyFromLocal <local path> <dfs path>
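After restarting, `jps` on the master should list the master daemons. As a
rough sketch of checking that mechanically (the jps output below is a stub
for illustration, not from a real run), you could do:

```shell
# Illustrative stand-in for `jps` output on a healthy master node:
jps_out='27531 SecondaryNameNode
27600 NameNode
27700 JobTracker
27879 Jps'

# Report any expected master daemon that is absent from the list
for daemon in NameNode SecondaryNameNode JobTracker; do
  if ! printf '%s\n' "$jps_out" | grep -q "$daemon"; then
    echo "MISSING: $daemon"
  fi
done
echo "daemon check done"
```

In the situation from the original mail, this loop would have flagged
NameNode (and, on the slave, DataNode) as missing.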

Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)

Posted by shashwat shriparv <dw...@gmail.com>.
Please send your conf file contents and hosts file contents too.


On Tue, May 29, 2012 at 11:08 PM, Harsh J <ha...@cloudera.com> wrote:

> Rohit,
>
> The SNN may start and run infinitely without doing any work. The NN
> and DN have probably not started cause the NN has an issue (perhaps NN
> name directory isn't formatted) and the DN can't find the NN (or has
> data directory issues as well).
>
> So this isn't a glitch but a real issue you'll have to take a look at
> your logs for.
>
> On Sun, May 27, 2012 at 10:51 PM, Rohit Pandey <ro...@gmail.com>
> wrote:
> > [original message quoted above, snipped]
>
>
>
> --
> Harsh J
>



-- 


∞
Shashwat Shriparv

Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)

Posted by Harsh J <ha...@cloudera.com>.
Rohit,

The SNN may start and run indefinitely without doing any work. The NN
and DN have probably not started because the NN has an issue (perhaps the NN
name directory isn't formatted) and the DN can't find the NN (or has
data directory issues as well).

So this isn't a glitch but a real issue you'll have to look at
your logs for.
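To make that concrete: look at the daemon .log files (not the .out files)
under the Hadoop logs directory on each box. A sketch of what to look for,
with an assumed sample line standing in for a real log entry (the exact
wording in your log may differ):

```shell
# Sample NN log line, assumed for illustration; the real file would be
# something like /usr/local/hadoop/logs/hadoop-<user>-namenode-<host>.log
sample='ERROR namenode.FSNamesystem: java.io.IOException: NameNode is not formatted.'

# grep -c prints the number of matching lines in its input
printf '%s\n' "$sample" | grep -c 'not formatted'
# If the real log shows this, run "hadoop namenode -format" on the master
# (this destroys existing HDFS metadata, so only do it on a fresh cluster),
# then restart with start-dfs.sh.
```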

On Sun, May 27, 2012 at 10:51 PM, Rohit Pandey <ro...@gmail.com> wrote:
> [original message quoted above, snipped]



-- 
Harsh J

Re: Small glitch with setting up two node cluster...only secondary node starts (datanode and namenode don't show up in jps)

Posted by samir das mohapatra <sa...@gmail.com>.
 In your log details I could not find the NN starting.

 It is a problem with the NN itself.

 Harsh also suggested the same.

On Sun, May 27, 2012 at 10:51 PM, Rohit Pandey <ro...@gmail.com>wrote:

> [original message quoted above, snipped]