Posted to common-user@hadoop.apache.org by Dhaya007 <mg...@gmail.com> on 2008/01/02 06:10:18 UTC

Not able to start Data Node

I am new to Hadoop, so if anything here is wrong please correct me.
I have configured a single/multi-node cluster using the following link:
http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29.
I have followed the link, but I am not able to start Hadoop in the multi-node
environment.
The problems I am facing are as follows:
1. I have configured the master and slave nodes with passphrase-less SSH, but if I
run start-dfs.sh it still prompts for the passwords of the master and slave machines.
(I have copied the master's .ssh/id_rsa.pub key into the slave's authorized_keys
file.) A sketch of this setup is shown after this list.

2. After giving the passwords, the datanode, namenode, jobtracker and tasktracker
start successfully on the master, but the datanode does not start on the slave.

3. Sometimes step 2 works and sometimes it says permission denied.

4. I have checked the datanode log file on the slave; it reports an incompatible
node. I then formatted the slave and the master and started the DFS with
start-dfs.sh, but I am still getting the error. (See the sketch after the
configuration below for one way to reset the storage directories.)
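For reference, here is a minimal sketch of the passwordless SSH setup step 1 assumes.
It is illustrative only: the user name hdusr and the hostnames master/slave are taken
from this post, and ssh-copy-id is assumed to be available (it ships with OpenSSH on
Ubuntu).

# on the master, as the Hadoop user: generate a key pair with an empty passphrase
ssh-keygen -t rsa -P "" -f $HOME/.ssh/id_rsa

# authorize the key locally so that "ssh master" works without a password
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

# append the public key to the slave's authorized_keys
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hdusr@slave

# tighten permissions on both machines; sshd ignores keys when the directory
# or file is group- or world-writable
chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys

# both of these should now log in without prompting for a password
ssh master
ssh slave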


The host entries in /etc/hosts on both master and slave are:
master
slave

conf/masters contains:
master

conf/slaves contains:
master
slave
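For what it is worth, /etc/hosts on both machines would normally map those names to
real addresses and keep a working localhost entry; the datanode and tasktracker logs
later in this thread show connection attempts to localhost/127.0.0.1:54310 and an
"unknown host: localhost" error, which usually points at this file. The addresses
below are only an illustration based on the ones that appear in the logs
(master/172.16.0.25, slave/172.16.0.58); adjust them to your network:

127.0.0.1    localhost
172.16.0.25  master
172.16.0.58  slave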

The hadoop-site.xml  for both master/slave 
<?xml version="1.0"?> 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> 

<!-- Put site-specific property overrides in this file. --> 

<configuration> 
<property> 
  <name>hadoop.tmp.dir</name> 
  <value>/home/hdusr/hadoop-${user.name}</value> 
  <description>A base for other temporary directories.</description> 
</property> 
  
<property> 
  <name>fs.default.name</name> 
  <value>hdfs://master:54310</value> 
  <description>The name of the default file system.  A URI whose 
  scheme and authority determine the FileSystem implementation.  The 
  uri's scheme determines the config property (fs.SCHEME.impl) naming 
  the FileSystem implementation class.  The uri's authority is used to 
  determine the host, port, etc. for a filesystem.</description> 
</property> 
  
<property> 
  <name>mapred.job.tracker</name> 
  <value>master:54311</value> 
  <description>The host and port that the MapReduce job tracker runs 
  at.  If "local", then jobs are run in-process as a single map 
  and reduce task. 
  </description> 
</property> 
  
<property> 
  <name>dfs.replication</name> 
  <value>2</value> 
  <description>Default block replication. 
  The actual number of replications can be specified when the file is
created. 
  The default is used if replication is not specified in create time. 
  </description> 
</property> 

<property> 
  <name>mapred.map.tasks</name> 
  <value>20</value> 
  <description>As a rule of thumb, use 10x the number of slaves (i.e.,
number of tasktrackers). 
  </description> 
</property> 

<property> 
  <name>mapred.reduce.tasks</name> 
  <value>4</value> 
  <description>As a rule of thumb, use 2x the number of slave processors
(i.e., number of tasktrackers). 
  </description> 
</property> 
</configuration> 
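The "directory is not writable", "storage directory does not exist" and "incompatible
node" messages discussed later in this thread all concern the storage directories
derived from hadoop.tmp.dir. Note that the logs quoted below complain about
/tmp/hadoop-hdpusr/..., which is Hadoop's built-in default, not the
/home/hdusr/hadoop-${user.name} value configured here, so it is worth checking that
every daemon really reads this hadoop-site.xml and runs as the expected user. Below is
a rough sketch of resetting the storage directories, assuming this configuration, the
user hdusr and Hadoop 0.15.x (warning: this erases all HDFS data):

# on every node: make sure the base directory exists and is owned by the Hadoop user
mkdir -p /home/hdusr/hadoop-hdusr
chown -R hdusr:hdusr /home/hdusr/hadoop-hdusr

# on every node: remove stale DFS state, otherwise datanodes may keep a
# namespaceID from an earlier format and report themselves as incompatible
rm -rf /home/hdusr/hadoop-hdusr/dfs

# on the master only: reformat the namenode, then start HDFS and MapReduce
bin/hadoop namenode -format
bin/start-dfs.sh
bin/start-mapred.sh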

Please help me resolve this, or else point me to any other tutorial for a
multi-node cluster setup. I am eagerly waiting for the tutorials.


Thanks 

-- 
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Not able to start Data Node

Posted by Khalil Honsali <k....@gmail.com>.
Here's the reference I wanted to send you:
http://www.ibm.com/developerworks/eserver/library/es-ssh/index.html
In particular, go here:
http://www.ibm.com/developerworks/eserver/library/es-ssh/index.html#figure12

Hope it solves it.

On 04/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>
>
> Thanks For your good solution
> I ill check and update the same
>
>
>
>
> Khalil Honsali wrote:
> >
> > I also note that for non-root passwordless ssh,  you must chmod
> > authorized_keys file to 655,
> >
> > On 03/01/2008, Miles Osborne <mi...@inf.ed.ac.uk> wrote:
> >>
> >> You need to make sure that each slave node has a copy of the authorised
> >> keys
> >> you generated on the master node.
> >>
> >> Miles
> >>
> >> On 03/01/2008, Dhaya007 <mg...@gmail.com> wrote:
> >> >
> >> >
> >> > Thanks Arun,
> >> >
> >> > I am able to run the datanode on the slave (as per the solution you gave
> >> > about the listening port).
> >> >
> >> > But it still asks for the password while starting the DFS and MapReduce.
> >> >
> >> > First I generated an RSA key with an empty passphrase, as follows:
> >> >
> >> > ssh-keygen -t rsa -P ""
> >> > cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
> >> > ssh master
> >> > ssh slave
> >> > I started the DFS on the master as follows:
> >> > /bin/start-dfs.sh
> >> > It asks for the password.
> >> > Please help me to resolve this (I don't know whether I am doing the SSH
> >> > part right).
> >> >
> >> >
> >> >
> >> > Dhaya007 wrote:
> >> > >
> >> > >
> >> > >
> >> > > Arun C Murthy wrote:
> >> > >>
> >> > >> What version of Hadoop are you running?
> >> > >> Dhaya007:hadoop-0.15.1
> >> > >>
> >> > >> http://wiki.apache.org/lucene-hadoop/Help
> >> > >>
> >> > >> Dhaya007 wrote:
> >> > >>  > ..datanode-slave.log
> >> > >>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode:
> >> Invalid
> >> > >>> directory in dfs.data.dir: directory is not writable:
> >> > >>> /tmp/hadoop-hdpusr/dfs/data
> >> > >>> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
> >> > >>> directories in dfs.data.dir are invalid.
> >> > >>
> >> > >> Did you check that directory?
> >> > >> Dhaya007: Yes, I have checked the folder; there is no file saved in it.
> >> > >>
> >> > >> DataNode is complaining that it doesn't have any 'valid'
> directories
> >> to
> >> > >> store data in.
> >> > >>
> >> > >>> Tasktracker_slav.log
> >> > >>> 2008-01-02 15:10:34,419 ERROR
> org.apache.hadoop.mapred.TaskTracker:
> >> > Can
> >> > >>> not
> >> > >>> start task tracker because java.net.UnknownHostException: unknown
> >> > host:
> >> > >>> localhost
> >> > >>>     at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java
> >> :136)
> >> > >>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java
> :532)
> >> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:471)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >> > >>>     at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion
> (Unknown
> >> > Source)
> >> > >>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
> >> > >>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
> >> > >>>     at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java
> >> :427)
> >> > >>>     at org.apache.hadoop.mapred.TaskTracker.<init>(
> TaskTracker.java
> >> > :717)
> >> > >>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java
> >> > :1880)
> >> > >>>
> >> > >>
> >> > >> That probably means that the TaskTracker's hadoop-site.xml says
> that
> >> > >> 'localhost' is the JobTracker which isn't true...
> >> > >>
> >> > >> hadoop-site.xml is as follows
> >> > >> <?xml version="1.0"?>
> >> > >> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> >> > >>
> >> > >> <!-- Put site-specific property overrides in this file. -->
> >> > >>
> >> > >> <configuration>
> >> > >> <property>
> >> > >>   <name>hadoop.tmp.dir</name>
> >> > >>   <value>/home/hdusr/hadoop-${user.name}</value>
> >> > >>   <description>A base for other temporary
> directories.</description>
> >> > >> </property>
> >> > >>
> >> > >> <property>
> >> > >>   <name>fs.default.name</name>
> >> > >>   <value>hdfs://master:54310</value>
> >> > >>   <description>The name of the default file system.  A URI whose
> >> > >>   scheme and authority determine the FileSystem
> implementation.  The
> >> > >>   uri's scheme determines the config property (fs.SCHEME.impl)
> >> naming
> >> > >>   the FileSystem implementation class.  The uri's authority is
> used
> >> to
> >> > >>   determine the host, port, etc. for a filesystem.</description>
> >> > >> </property>
> >> > >>
> >> > >> <property>
> >> > >>   <name>mapred.job.tracker</name>
> >> > >>   <value>master:54311</value>
> >> > >>   <description>The host and port that the MapReduce job tracker
> runs
> >> > >>   at.  If "local", then jobs are run in-process as a single map
> >> > >>   and reduce task.
> >> > >>   </description>
> >> > >> </property>
> >> > >>
> >> > >> <property>
> >> > >>   <name>dfs.replication</name>
> >> > >>   <value>2</value>
> >> > >>   <description>Default block replication.
> >> > >>   The actual number of replications can be specified when the file
> >> is
> >> > >> created.
> >> > >>   The default is used if replication is not specified in create
> >> time.
> >> > >>   </description>
> >> > >> </property>
> >> > >>
> >> > >> <property>
> >> > >>   <name>mapred.map.tasks</name>
> >> > >>   <value>20</value>
> >> > >>   <description>As a rule of thumb, use 10x the number of slaves (
> i.e
> >> .,
> >> > >> number of tasktrackers).
> >> > >>   </description>
> >> > >> </property>
> >> > >>
> >> > >> <property>
> >> > >>   <name>mapred.reduce.tasks</name>
> >> > >>   <value>4</value>
> >> > >>   <description>As a rule of thumb, use 2x the number of slave
> >> > processors
> >> > >> (i.e., number of tasktrackers).
> >> > >>   </description>
> >> > >> </property>
> >> > >> </configuration>
> >> > >>
> >> > >>  > namenode-master.log
> >> > >>  > 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage:
> >> Storage
> >> > >>  > directory /tmp/hadoop-hdpusr/dfs/name does not exist.
> >> > >>  > 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server:
> >> Stopping
> >> > >> server
> >> > >>  > on 54310
> >> > >>  > 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
> >> > >>  > org.apache.hadoop.dfs.InconsistentFSStateException: Directory
> >> > >>  > /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state:
> storage
> >> > >> directory
> >> > >>  > does not exist or is not accessible.
> >> > >>
> >> > >> That means that, /tmp/hadoop-hdpusr/dfs/name doesn't exist or
> isn't
> >> > >> accessible.
> >> > >>
> >> > >> Dhaya007: I have checked the name folder, but I cannot find any folder in
> >> > >> the specified dir.
> >> > >> -*-*-
> >> > >>
> >> > >> Overall, this looks like an acute case of
> wrong-configuration-itis.
> >> > >> Dhaya007: Please provide a correct configuration example for a multi-node
> >> > >> cluster other than
> >> > >> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
> >> > >> because I followed the same.
> >> > >>
> >> > >> Have you got the same hadoop-site.xml on all your nodes?
> >> > >> Dhaya007:Yes
> >> > >>
> >> > >> More info here:
> >> > >> http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html
> >> > >> Dhaya007: I followed the same site you have mentioned but no
> >> solution
> >> > >>
> >> > >> Arun
> >> > >>
> >> > >>
> >> > >>> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker
> :
> >> > >>> SHUTDOWN_MSG:
> >> > >>> /************************************************************
> >> > >>> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
> >> > >>> ************************************************************/
> >> > >>>
> >> > >>>
> >> > >>> And all the ports are running
> >> > >>> Some time it asks password and some time it wont while starting
> the
> >> > dfs
> >> > >>>
> >> > >>> Master logs
> >> > >>> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode:
> >> > >>> SHUTDOWN_MSG:
> >> > >>> /************************************************************
> >> > >>> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
> >> > >>> ************************************************************/
> >> > >>>
> >> > >>> Datanode-master.log
> >> > >>> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
> >> > >>> localhost/127.0.0.1:54310 not available yet, Zzzzz...
> >> > >>> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
> >> > >>> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
> >> > >>> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
> >> > >>> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
> >> > >>> ***********************************************
> >> > >>> Jobtracker_master.log
> >> > >>> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
> >> > >>> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker:
> >> > >>> problem
> >> > >>> cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
> >> > >>> java.net.ConnectException: Connection refused
> >> > >>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
> >> > >>>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java
> :333)
> >> > >>>     at java.net.PlainSocketImpl.connectToAddress(
> >> PlainSocketImpl.java
> >> > :195)
> >> > >>>     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> >> > >>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> >> > >>>     at java.net.Socket.connect(Socket.java:520)
> >> > >>>     at
> >> > >>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(
> Client.java
> >> > :152)
> >> > >>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java
> :542)
> >> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:471)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >> > >>>     at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown
> >> > Source)
> >> > >>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
> >> > >>>     at
> >> org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java
> >> > :147)
> >> > >>>     at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
> >> > >>>     at
> >> > >>> org.apache.hadoop.dfs.DistributedFileSystem.initialize(
> >> > DistributedFileSystem.java:65)
> >> > >>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
> >> > >>>     at
> >> org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
> >> > >>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
> >> > >>>     at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java
> >> :683)
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java
> >> :120)
> >> > >>>     at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java
> >> :2052)
> >> > >>> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC
> >> Server
> >> > >>> handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283
> :
> >> > >>> error:
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(
> >> JobTracker.java
> >> > :1475)
> >> > >>>     at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown
> Source)
> >> > >>>     at
> >> > >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
> >> > DelegatingMethodAccessorImpl.java:25)
> >> > >>>     at java.lang.reflect.Method.invoke(Method.java:585)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >> > >>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >> > >>> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC
> >> Server
> >> > >>> handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293
> :
> >> > >>> error:
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(
> >> JobTracker.java
> >> > :1475)
> >> > >>>     at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown
> Source)
> >> > >>>     at
> >> > >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
> >> > DelegatingMethodAccessorImpl.java:25)
> >> > >>>     at java.lang.reflect.Method.invoke(Method.java:585)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >> > >>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >> > >>> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
> >> > >>> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC
> >> Server
> >> > >>> handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304
> :
> >> > >>> error:
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(
> >> JobTracker.java
> >> > :1475)
> >> > >>>     at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown
> Source)
> >> > >>>     at
> >> > >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
> >> > DelegatingMethodAccessorImpl.java:25)
> >> > >>>     at java.lang.reflect.Method.invoke(Method.java:585)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >> > >>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >> > >>> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
> >> > >>> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
> >> > >>> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
> >> > >>> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
> >> > >>> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
> >> > >>> SHUTDOWN_MSG:
> >> > >>> /************************************************************
> >> > >>> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
> >> > >>> ************************************************************/
> >> > >>>
> >> > >>> Tasktracker_master.log
> >> > >>> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client:
> Retrying
> >> > >>> connect
> >> > >>> to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
> >> > >>> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker
> :
> >> > >>> STARTUP_MSG:
> >> > >>> /************************************************************
> >> > >>> STARTUP_MSG: Starting TaskTracker
> >> > >>> STARTUP_MSG:   host = master/172.16.0.25
> >> > >>> STARTUP_MSG:   args = []
> >> > >>> ************************************************************/
> >> > >>> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential:
> Checking
> >> > >>> Resource
> >> > >>> aliases
> >> > >>> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
> >> > >>> Jetty/5.1.4
> >> > >>> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
> >> > >>> org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
> >> > >>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> >> > >>> WebApplicationContext[/,/]
> >> > >>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> >> > >>> HttpContext[/logs,/logs]
> >> > >>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> >> > >>> HttpContext[/static,/static]
> >> > >>> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener:
> >> Started
> >> > >>> SocketListener on 0.0.0.0:50060
> >> > >>> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
> >> > >>> org.mortbay.jetty.Server@1431340
> >> > >>> 2008-01-02 16:28:35,383 INFO
> >> org.apache.hadoop.metrics.jvm.JvmMetrics:
> >> > >>> Initializing JVM Metrics with processName=TaskTracker, sessionId=
> >> > >>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker
> :
> >> > >>> TaskTracker up at: /127.0.0.1:49599
> >> > >>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker
> :
> >> > >>> Starting
> >> > >>> tracker tracker_master:/127.0.0.1:49599
> >> > >>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC
> >> Server
> >> > >>> listener on 49599: starting
> >> > >>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC
> >> Server
> >> > >>> handler 0 on 49599: starting
> >> > >>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC
> >> Server
> >> > >>> handler 1 on 49599: starting
> >> > >>> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker
> :
> >> > >>> Starting
> >> > >>> thread: Map-events fetcher for all reduce tasks on
> >> > >>> tracker_master:/127.0.0.1:49599
> >> > >>> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker
> :
> >> > Lost
> >> > >>> connection to JobTracker
> [localhost/127.0.0.1:54311].  Retrying...
> >> > >>> org.apache.hadoop.ipc.RemoteException:
> >> > >>>
> >> org.apache.hadoop.mapred.JobTracker$IllegalStateException:FileSystem
> >> > >>> object
> >> > >>> not available yet
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(
> >> JobTracker.java
> >> > :1475)
> >> > >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> >> > >>>     at
> >> > >>> sun.reflect.NativeMethodAccessorImpl.invoke(
> >> > NativeMethodAccessorImpl.java:39)
> >> > >>>     at
> >> > >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(
> >> > DelegatingMethodAccessorImpl.java:25)
> >> > >>>     at java.lang.reflect.Method.invoke(Method.java:585)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> >> > >>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> >> > >>>
> >> > >>>     at org.apache.hadoop.ipc.Client.call(Client.java:482)
> >> > >>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> >> > >>>     at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown
> >> > Source)
> >> > >>>     at
> >> > >>> org.apache.hadoop.mapred.TaskTracker.offerService(
> TaskTracker.java
> >> > :773)
> >> > >>>     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java
> >> :1179)
> >> > >>>     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java
> >> > :1880)
> >> > >>> *******************************************
> >> > >>>
> >> > >>> Please help me to resolve the same.
> >> > >>>
> >> > >>>
> >> > >>> Khalil Honsali wrote:
> >> > >>>
> >> > >>>>Hi,
> >> > >>>>
> >> > >>>>I think you need to post more information, for example an excerpt
> >> of
> >> > the
> >> > >>>>failing datanode log. Also, please clarify the issue of
> >> connectivity:
> >> > >>>>- are you able to ssh passwordless (from master to slave, slave
> to
> >> > master,
> >> > >>>>slave to slave, master to master)? You shouldn't be entering a
> >> > >>>>password every time...
> >> > >>>>- are you able to telnet (not necessary but preferred)?
> >> > >>>>- have you verified that the ports are listening using the netstat command?
> >> > >>>>
> >> > >>>>besides, the tasktracker starts ok but not the datanode?
> >> > >>>>
> >> > >>>>K. Honsali
> >> > >>>>
> >> > >>>>
> >> > >>>>
> >> > >>>
> >> > >>
> >> > >>
> >> > >>
> >> > >
> >> > >
> >> >
> >> > --
> >> > View this message in context:
> >> >
> >>
> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14594256.html
> >> > Sent from the Hadoop Users mailing list archive at Nabble.com.
> >> >
> >> >
> >>
> >
> >
> >
> > --
> > ---------------------------------------------------------
> > شهر مبارك كريم
> > كل عام و أنتم بخير
> > ---------------------------------------------------------
> > Honsali Khalil − 本査理 カリル
> > Academic>Japan>NIT>Grad. Sc. Eng.>Dept. CS>Matsuo&Tsumura Lab.
> > http://www.matlab.nitech.ac.jp/~k-hon/
> > +81 (zero-)eight-zero 5134 8119
> > k.honsali@ezweb.ne.jp (instant reply mail)
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14613800.html
> Sent from the Hadoop Users mailing list archive at Nabble.com.
>
>


-- 
---------------------------------------------------------
A blessed and generous month
Best wishes for the year ahead
---------------------------------------------------------
Honsali Khalil − 本査理 カリル
Academic>Japan>NIT>Grad. Sc. Eng.>Dept. CS>Matsuo&Tsumura Lab.
http://www.matlab.nitech.ac.jp/~k-hon/
+81 (zero-)eight-zero 5134 8119
k.honsali@ezweb.ne.jp (instant reply mail)

Re: Not able to start Data Node

Posted by Dhaya007 <mg...@gmail.com>.
Thanks for your good solution.
I will check and update on the same.





-- 
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14613800.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Not able to start Data Node

Posted by Khalil Honsali <k....@gmail.com>.
I also note that for non-root passwordless SSH, you must chmod the
authorized_keys file to 655.
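For reference, a hedged sketch of the permission tightening this refers to. Guides
differ on the exact mode (600 is more common than 655 for authorized_keys); what
matters for sshd's StrictModes check is that neither ~/.ssh nor authorized_keys is
group- or world-writable:

chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys
chmod 600 $HOME/.ssh/id_rsa
chmod 644 $HOME/.ssh/id_rsa.pub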

On 03/01/2008, Miles Osborne <mi...@inf.ed.ac.uk> wrote:
>
> You need to make sure that each slave node has a copy of the authorised
> keys
> you generated on the master node.
>
> Miles
>
> On 03/01/2008, Dhaya007 <mg...@gmail.com> wrote:
> >
> >
> > Thanks Arun,
> >
> > I am able to riun the datanode in slave (As per the solution given by
> You
> > (listinig port ))
> >
> > But still it asks the pasword while starting the dfs ans mapreduce
> >
> > First i generated rsa as password less as follws
> >
> > ssh-keygen -t rsa -P ""
> > cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
> > ssh master
> > ssh slave
> > I started the dfs in master as follows
> > /bin/start-dfs.sh
> > it asks the passowrd
> > Please help me to resolve the same (I dont know i am doing right in the
> > case
> > of ssh)
> >
> >
> >
> > [...]
>



-- 
---------------------------------------------------------
A blessed and generous month
May every year find you well
---------------------------------------------------------
Honsali Khalil − 本査理 カリル
Academic>Japan>NIT>Grad. Sc. Eng.>Dept. CS>Matsuo&Tsumura Lab.
http://www.matlab.nitech.ac.jp/~k-hon/
+81 (zero-)eight-zero 5134 8119
k.honsali@ezweb.ne.jp (instant reply mail)

Re: Not able to start Data Node

Posted by Miles Osborne <mi...@inf.ed.ac.uk>.
You need to make sure that each slave node has a copy of the authorised keys
you generated on the master node.
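
For example, a rough sketch of doing that from the master, using the host names from conf/masters and conf/slaves (ssh-copy-id is assumed to be installed; the cat form works where it is not, and the same hadoop user is assumed to exist on both hosts):

# run on the master as the hadoop user
ssh-copy-id -i $HOME/.ssh/id_rsa.pub slave
# or, without ssh-copy-id:
cat $HOME/.ssh/id_rsa.pub | ssh slave 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
# verify: this should print the slave's hostname without asking for a password
ssh slave hostname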

Miles

On 03/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>
>
> Thanks Arun,
>
> I am able to riun the datanode in slave (As per the solution given by You
> (listinig port ))
>
> But still it asks the pasword while starting the dfs ans mapreduce
>
> First i generated rsa as password less as follws
>
> ssh-keygen -t rsa -P ""
> cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
> ssh master
> ssh slave
> I started the dfs in master as follows
> /bin/start-dfs.sh
> it asks the passowrd
> Please help me to resolve the same (I dont know i am doing right in the
> case
> of ssh)
>
>
>
> [...]
>
>

Re: Not able to start Data Node

Posted by Dhaya007 <mg...@gmail.com>.
Thanks Arun,

I am able to run the datanode on the slave (as per the solution you gave about
the listening port).

But it still asks for the password while starting the dfs and mapreduce.

First I generated the rsa key with an empty passphrase as follows:

ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh master
ssh slave
I started the dfs on the master as follows:
/bin/start-dfs.sh
It asks for the password.
Please help me to resolve the same (I don't know if I am doing this right in the
case of ssh).
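
(As a quick check, the sketch below makes ssh fail instead of prompting, so it shows which hop still wants a password; the host names are assumed from conf/masters and conf/slaves, and BatchMode is a standard OpenSSH option:)

for h in master slave; do
  ssh -o BatchMode=yes "$h" hostname || echo "$h still prompts for a password"
done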



Dhaya007 wrote:
> 
> 
> 
> Arun C Murthy wrote:
>> 
>> What version of Hadoop are you running?
>> Dhaya007:hadoop-0.15.1
>> 
>> http://wiki.apache.org/lucene-hadoop/Help
>> 
>> Dhaya007 wrote:
>>  > ..datanode-slave.log
>>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
>>> directory in dfs.data.dir: directory is not writable:
>>> /tmp/hadoop-hdpusr/dfs/data
>>> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
>>> directories in dfs.data.dir are invalid.
>> 
>> Did you check that directory?
>> Daya007:Yes, i have checked the folder in which there is no file saved.
>> 
>> DataNode is complaining that it doesn't have any 'valid' directories to 
>> store data in.
>> 
>>> Tasktracker_slav.log
>>> 2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>>> not
>>> start task tracker because java.net.UnknownHostException: unknown host:
>>> localhost
>>> 	at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
>>> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
>>> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
>>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> 	at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
>>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
>>> 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
>>> 	at
>>> org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
>>> 	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
>>> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>>> 
>> 
>> That probably means that the TaskTracker's hadoop-site.xml says that 
>> 'localhost' is the JobTracker which isn't true...
>> 
>> hadoop-site.xml is as follows
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>> 
>> <!-- Put site-specific property overrides in this file. -->
>> 
>> <configuration>
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/home/hdusr/hadoop-${user.name}</value>
>>   <description>A base for other temporary directories.</description>
>> </property>
>>  
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://master:54310</value>
>>   <description>The name of the default file system.  A URI whose
>>   scheme and authority determine the FileSystem implementation.  The
>>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>>   the FileSystem implementation class.  The uri's authority is used to
>>   determine the host, port, etc. for a filesystem.</description>
>> </property>
>>  
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>master:54311</value>
>>   <description>The host and port that the MapReduce job tracker runs
>>   at.  If "local", then jobs are run in-process as a single map
>>   and reduce task.
>>   </description>
>> </property>
>>  
>> <property>
>>   <name>dfs.replication</name>
>>   <value>2</value>
>>   <description>Default block replication.
>>   The actual number of replications can be specified when the file is
>> created.
>>   The default is used if replication is not specified in create time.
>>   </description>
>> </property>
>> 
>> <property>
>>   <name>mapred.map.tasks</name>
>>   <value>20</value>
>>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>> number of tasktrackers).
>>   </description>
>> </property>
>> 
>> <property>
>>   <name>mapred.reduce.tasks</name>
>>   <value>4</value>
>>   <description>As a rule of thumb, use 2x the number of slave processors
>> (i.e., number of tasktrackers).
>>   </description>
>> </property>
>> </configuration>
>> 
>>  > namenode-master.log
>>  > 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage
>>  > directory /tmp/hadoop-hdpusr/dfs/name does not exist.
>>  > 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping 
>> server
>>  > on 54310
>>  > 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
>>  > org.apache.hadoop.dfs.InconsistentFSStateException: Directory
>>  > /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage 
>> directory
>>  > does not exist or is not accessible.
>> 
>> That means that, /tmp/hadoop-hdpusr/dfs/name doesn't exist or isn't 
>> accessible.
>> 
>> Dhaya007 I have checked the name folder but i wont find any folder in the
>> specified dir
>> -*-*-
>> 
>> Overall, this looks like an acute case of wrong-configuration-itis.
>> Please provid the corect configuration site example for multi node
>> cluster other than 
>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>> because i followed the same
>> 
>> Have you got the same hadoop-site.xml on all your nodes?
>> Dhaya007:Yes
>> 
>> More info here: 
>> http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html
>> Dhaya007: I followed the same site you have mentioned but no solution
>> 
>> Arun
>> 
>> 
>>> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker:
>>> SHUTDOWN_MSG: 
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
>>> ************************************************************/
>>> 
>>> 
>>> And all the ports are running 
>>> Some time it asks password and some time it wont while starting the dfs
>>> 
>>> Master logs
>>> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode:
>>> SHUTDOWN_MSG: 
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
>>> ************************************************************/
>>> 
>>> Datanode-master.log
>>> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
>>> localhost/127.0.0.1:54310 not available yet, Zzzzz...
>>> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>>> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>>> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>>> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>>> ***********************************************
>>> Jobtracker_master.log
>>> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
>>> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker:
>>> problem
>>> cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
>>> java.net.ConnectException: Connection refused
>>> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
>>> 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>>> 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>>> 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>>> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>>> 	at java.net.Socket.connect(Socket.java:520)
>>> 	at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
>>> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
>>> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
>>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> 	at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
>>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>>> 	at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
>>> 	at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
>>> 	at
>>> org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
>>> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
>>> 	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
>>> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
>>> 	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
>>> 	at
>>> org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
>>> 	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
>>> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283:
>>> error:
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> 	at
>>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> 	at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293:
>>> error:
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> 	at
>>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> 	at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>>> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304:
>>> error:
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> 	at
>>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>>> 	at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>>> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>>> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>>> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
>>> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
>>> SHUTDOWN_MSG: 
>>> /************************************************************
>>> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
>>> ************************************************************/
>>> 
>>> Tasktracker_master.log
>>> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying
>>> connect
>>> to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
>>> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker:
>>> STARTUP_MSG: 
>>> /************************************************************
>>> STARTUP_MSG: Starting TaskTracker
>>> STARTUP_MSG:   host = master/172.16.0.25
>>> STARTUP_MSG:   args = []
>>> ************************************************************/
>>> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking
>>> Resource
>>> aliases
>>> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
>>> Jetty/5.1.4
>>> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
>>> org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
>>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>>> WebApplicationContext[/,/]
>>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>>> HttpContext[/logs,/logs]
>>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>>> HttpContext[/static,/static]
>>> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started
>>> SocketListener on 0.0.0.0:50060
>>> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
>>> org.mortbay.jetty.Server@1431340
>>> 2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>> Initializing JVM Metrics with processName=TaskTracker, sessionId=
>>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
>>> TaskTracker up at: /127.0.0.1:49599
>>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting
>>> tracker tracker_master:/127.0.0.1:49599
>>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> listener on 49599: starting
>>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 0 on 49599: starting
>>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>>> handler 1 on 49599: starting
>>> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker:
>>> Starting
>>> thread: Map-events fetcher for all reduce tasks on
>>> tracker_master:/127.0.0.1:49599
>>> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost
>>> connection to JobTracker [localhost/127.0.0.1:54311].  Retrying...
>>> org.apache.hadoop.ipc.RemoteException:
>>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>>> object
>>> not available yet
>>> 	at
>>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> 	at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> 	at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>>> 
>>> 	at org.apache.hadoop.ipc.Client.call(Client.java:482)
>>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>>> 	at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
>>> 	at
>>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
>>> 	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
>>> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>>> *******************************************
>>> 
>>> Please help me to resolve the same.
>>> 
>>> 
>>> Khalil Honsali wrote:
>>> 
>>>>Hi,
>>>>
>>>>I think you need to post more information, for example an excerpt of the
>>>>failing datanode log. Also, please clarify the issue of connectivity:
>>>>- are you able to ssh passwordless (from master to slave, slave to
master,
>>>>slave to slave, master to master), you shouldn't be entering passwrd
>>>>everytime...
>>>>- are you able to telnet (not necessary but preferred)
>>>>- have you verified the ports as RUNNING on using netstat command?
>>>>
>>>>besides, the tasktracker starts ok but not the datanode?
>>>>
>>>>K. Honsali
>>>>
>>>>On 02/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>>>>
>>>>>
>>>>>I am new to hadoop if any think wrong please correct me ....
>>>>>I Have configured a single/multi node cluster using following link
>>>>>
>>>>>http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>>>>>.
>>>>>I have followed the link but i am not able to start the haoop in multi
>>>>>node
>>>>>environment
>>>>>The problems i am facing are as Follows:
>>>>>1.I have configured master and slave nodes with ssh less pharase if try
>>>>>to
>>>>>run the start-dfs.sh it prompt the password for master:slave
machines.(I
>>>>>have copied the .ssh/id_rsa.pub key of master in to slaves
autherized_key
>>>>>file)
>>>>>
>>>>>2.After giving password datanode,namenode,jobtracker,tasktraker started
>>>>>successfully in master but datanode is started in slave.
>>>>>
>>>>>
>>>>>3.Some time step 2 works and some time it says that permission denied.
>>>>>
>>>>>4.I have checked the log file in the slave for datanode it says that
>>>>>incompatible node, then i have formated the slave, master and start the
>>>>>dfs
>>>>>by start-dfs.sh still i am getting the error
>>>>>
>>>>>
>>>>>The host entry in etc/hosts are both master/slave
>>>>>master
>>>>>slave
>>>>>conf/masters
>>>>>master
>>>>>conf/slaves
>>>>>master
>>>>>slave
>>>>>
>>>>>The hadoop-site.xml  for both master/slave
>>>>><?xml version="1.0"?>
>>>>><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>
>>>>><!-- Put site-specific property overrides in this file. -->
>>>>>
>>>>><configuration>
>>>>><property>
>>>>>  <name>hadoop.tmp.dir</name>
>>>>>  <value>/home/hdusr/hadoop-${user.name}</value>
>>>>>  <description>A base for other temporary directories.</description>
>>>>></property>
>>>>>
>>>>><property>
>>>>>  <name>fs.default.name</name>
>>>>>  <value>hdfs://master:54310</value>
>>>>>  <description>The name of the default file system.  A URI whose
>>>>>  scheme and authority determine the FileSystem implementation.  The
>>>>>  uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>>>  the FileSystem implementation class.  The uri's authority is used to
>>>>>  determine the host, port, etc. for a filesystem.</description>
>>>>></property>
>>>>>
>>>>><property>
>>>>>  <name>mapred.job.tracker</name>
>>>>>  <value>master:54311</value>
>>>>>  <description>The host and port that the MapReduce job tracker runs
>>>>>  at.  If "local", then jobs are run in-process as a single map
>>>>>  and reduce task.
>>>>>  </description>
>>>>></property>
>>>>>
>>>>><property>
>>>>>  <name>dfs.replication</name>
>>>>>  <value>2</value>
>>>>>  <description>Default block replication.
>>>>>  The actual number of replications can be specified when the file is
>>>>>created.
>>>>>  The default is used if replication is not specified in create time.
>>>>>  </description>
>>>>></property>
>>>>>
>>>>><property>
>>>>>  <name>mapred.map.tasks</name>
>>>>>  <value>20</value>
>>>>>  <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>>>>>number of tasktrackers).
>>>>>  </description>
>>>>></property>
>>>>>
>>>>><property>
>>>>>  <name>mapred.reduce.tasks</name>
>>>>>  <value>4</value>
>>>>>  <description>As a rule of thumb, use 2x the number of slave
>>>>> processors
>>>>>(i.e., number of tasktrackers).
>>>>>  </description>
>>>>></property>
>>>>></configuration>
>>>>>
>>>>>Please help me to reslove the same. Or else provide any other tutorial
>>>>>for
>>>>>multi node cluster setup.I am egarly waiting for the tutorials.
>>>>>
>>>>>
>>>>>Thanks
>>>>>
>>>>>--
>>>>>View this message in context:
>>>>>http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>>>>>Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>
>>>>
>>> 
>> 
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14594256.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Not able to start Data Node

Posted by Dhaya007 <mg...@gmail.com>.


Arun C Murthy wrote:
> 
> What version of Hadoop are you running?
> Dhaya007: hadoop-0.15.1
> 
> http://wiki.apache.org/lucene-hadoop/Help
> 
> Dhaya007 wrote:
>  > ..datanode-slave.log
>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
>> directory in dfs.data.dir: directory is not writable:
>> /tmp/hadoop-hdpusr/dfs/data
>> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
>> directories in dfs.data.dir are invalid.
> 
> Did you check that directory?
> Dhaya007: Yes, I have checked the folder; there is no file saved in it.
> 
> DataNode is complaining that it doesn't have any 'valid' directories to 
> store data in.
> 
>> Tasktracker_slav.log
>> 2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not
>> start task tracker because java.net.UnknownHostException: unknown host:
>> localhost
>> 	at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
>> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> 	at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
>> 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
>> 	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
>> 	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
>> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>> 
> 
> That probably means that the TaskTracker's hadoop-site.xml says that 
> 'localhost' is the JobTracker which isn't true...
> 
> hadoop-site.xml is as follows
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
> <!-- Put site-specific property overrides in this file. -->
> 
> <configuration>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/hdusr/hadoop-${user.name}</value>
>   <description>A base for other temporary directories.</description>
> </property>
>  
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://master:54310</value>
>   <description>The name of the default file system.  A URI whose
>   scheme and authority determine the FileSystem implementation.  The
>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>   the FileSystem implementation class.  The uri's authority is used to
>   determine the host, port, etc. for a filesystem.</description>
> </property>
>  
> <property>
>   <name>mapred.job.tracker</name>
>   <value>master:54311</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
>  
> <property>
>   <name>dfs.replication</name>
>   <value>2</value>
>   <description>Default block replication.
>   The actual number of replications can be specified when the file is
> created.
>   The default is used if replication is not specified in create time.
>   </description>
> </property>
> 
> <property>
>   <name>mapred.map.tasks</name>
>   <value>20</value>
>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
> number of tasktrackers).
>   </description>
> </property>
> 
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>4</value>
>   <description>As a rule of thumb, use 2x the number of slave processors
> (i.e., number of tasktrackers).
>   </description>
> </property>
> </configuration>
> 
>  > namenode-master.log
>  > 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage
>  > directory /tmp/hadoop-hdpusr/dfs/name does not exist.
>  > 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping 
> server
>  > on 54310
>  > 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
>  > org.apache.hadoop.dfs.InconsistentFSStateException: Directory
>  > /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage 
> directory
>  > does not exist or is not accessible.
> 
> That means that, /tmp/hadoop-hdpusr/dfs/name doesn't exist or isn't 
> accessible.
> 
> Dhaya007: I have checked for the name folder, but I cannot find any such folder in the
> specified dir.
> -*-*-
> 
> Overall, this looks like an acute case of wrong-configuration-itis.
> Please provide a correct configuration example for a multi-node cluster
> other than 
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
> because I have already followed that one.
> 
> Have you got the same hadoop-site.xml on all your nodes?
> Dhaya007:Yes
> 
> More info here: 
> http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html
> Dhaya007: I followed the same site you mentioned, but it did not solve the problem.
> 
> Arun
> 
> 
>> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker:
>> SHUTDOWN_MSG: 
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
>> ************************************************************/
>> 
>> 
>> And all the ports are running 
>> Some time it asks password and some time it wont while starting the dfs
>> 
>> Master logs
>> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode:
>> SHUTDOWN_MSG: 
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
>> ************************************************************/
>> 
>> Datanode-master.log
>> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
>> localhost/127.0.0.1:54310 not available yet, Zzzzz...
>> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>> ***********************************************
>> Jobtracker_master.log
>> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
>> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker: problem
>> cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
>> java.net.ConnectException: Connection refused
>> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
>> 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>> 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>> 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>> 	at java.net.Socket.connect(Socket.java:520)
>> 	at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
>> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> 	at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>> 	at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
>> 	at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
>> 	at
>> org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
>> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
>> 	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
>> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
>> 	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
>> 	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
>> 	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
>> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283: error:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293: error:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304: error:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
>> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
>> SHUTDOWN_MSG: 
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
>> ************************************************************/
>> 
>> Tasktracker_master.log
>> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
>> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker:
>> STARTUP_MSG: 
>> /************************************************************
>> STARTUP_MSG: Starting TaskTracker
>> STARTUP_MSG:   host = master/172.16.0.25
>> STARTUP_MSG:   args = []
>> ************************************************************/
>> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking
>> Resource
>> aliases
>> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
>> Jetty/5.1.4
>> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
>> org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>> WebApplicationContext[/,/]
>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>> HttpContext[/logs,/logs]
>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>> HttpContext[/static,/static]
>> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started
>> SocketListener on 0.0.0.0:50060
>> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
>> org.mortbay.jetty.Server@1431340
>> 2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>> Initializing JVM Metrics with processName=TaskTracker, sessionId=
>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker up at: /127.0.0.1:49599
>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting
>> tracker tracker_master:/127.0.0.1:49599
>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>> listener on 49599: starting
>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 49599: starting
>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 49599: starting
>> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting
>> thread: Map-events fetcher for all reduce tasks on
>> tracker_master:/127.0.0.1:49599
>> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost
>> connection to JobTracker [localhost/127.0.0.1:54311].  Retrying...
>> org.apache.hadoop.ipc.RemoteException:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:482)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> 	at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
>> 	at
>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
>> 	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
>> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>> *******************************************
>> 
>> Please help me to resolve the same.
>> 
>> 
>> Khalil Honsali wrote:
>> 
>>>Hi,
>>>
>>>I think you need to post more information, for example an excerpt of the
>>>failing datanode log. Also, please clarify the issue of connectivity:
>>>- are you able to ssh passwordless (from master to slave, slave to
master,
>>>slave to slave, master to master), you shouldn't be entering passwrd
>>>everytime...
>>>- are you able to telnet (not necessary but preferred)
>>>- have you verified the ports as RUNNING on using netstat command?
>>>
>>>besides, the tasktracker starts ok but not the datanode?
>>>
>>>K. Honsali
>>>
>>>On 02/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>>>
>>>>
>>>>I am new to hadoop if any think wrong please correct me ....
>>>>I Have configured a single/multi node cluster using following link
>>>>
>>>>http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>>>>.
>>>>I have followed the link but i am not able to start the haoop in multi
>>>>node
>>>>environment
>>>>The problems i am facing are as Follows:
>>>>1.I have configured master and slave nodes with ssh less pharase if try
>>>>to
>>>>run the start-dfs.sh it prompt the password for master:slave machines.(I
>>>>have copied the .ssh/id_rsa.pub key of master in to slaves
autherized_key
>>>>file)
>>>>
>>>>2.After giving password datanode,namenode,jobtracker,tasktraker started
>>>>successfully in master but datanode is started in slave.
>>>>
>>>>
>>>>3.Some time step 2 works and some time it says that permission denied.
>>>>
>>>>4.I have checked the log file in the slave for datanode it says that
>>>>incompatible node, then i have formated the slave, master and start the
>>>>dfs
>>>>by start-dfs.sh still i am getting the error
>>>>
>>>>
>>>>The host entry in etc/hosts are both master/slave
>>>>master
>>>>slave
>>>>conf/masters
>>>>master
>>>>conf/slaves
>>>>master
>>>>slave
>>>>
>>>>The hadoop-site.xml  for both master/slave
>>>><?xml version="1.0"?>
>>>><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>
>>>><!-- Put site-specific property overrides in this file. -->
>>>>
>>>><configuration>
>>>><property>
>>>>  <name>hadoop.tmp.dir</name>
>>>>  <value>/home/hdusr/hadoop-${user.name}</value>
>>>>  <description>A base for other temporary directories.</description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>fs.default.name</name>
>>>>  <value>hdfs://master:54310</value>
>>>>  <description>The name of the default file system.  A URI whose
>>>>  scheme and authority determine the FileSystem implementation.  The
>>>>  uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>>  the FileSystem implementation class.  The uri's authority is used to
>>>>  determine the host, port, etc. for a filesystem.</description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>mapred.job.tracker</name>
>>>>  <value>master:54311</value>
>>>>  <description>The host and port that the MapReduce job tracker runs
>>>>  at.  If "local", then jobs are run in-process as a single map
>>>>  and reduce task.
>>>>  </description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>dfs.replication</name>
>>>>  <value>2</value>
>>>>  <description>Default block replication.
>>>>  The actual number of replications can be specified when the file is
>>>>created.
>>>>  The default is used if replication is not specified in create time.
>>>>  </description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>mapred.map.tasks</name>
>>>>  <value>20</value>
>>>>  <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>>>>number of tasktrackers).
>>>>  </description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>mapred.reduce.tasks</name>
>>>>  <value>4</value>
>>>>  <description>As a rule of thumb, use 2x the number of slave processors
>>>>(i.e., number of tasktrackers).
>>>>  </description>
>>>></property>
>>>></configuration>
>>>>
>>>>Please help me to reslove the same. Or else provide any other tutorial
>>>>for
>>>>multi node cluster setup.I am egarly waiting for the tutorials.
>>>>
>>>>
>>>>Thanks
>>>>
>>>>--
>>>>View this message in context:
>>>>http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>>>>Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14577669.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Not able to start Data Node

Posted by Arun C Murthy <ar...@yahoo-inc.com>.
What version of Hadoop are you running?

http://wiki.apache.org/lucene-hadoop/Help

Dhaya007 wrote:
 > ..datanode-slave.log
> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
> directory in dfs.data.dir: directory is not writable:
> /tmp/hadoop-hdpusr/dfs/data
> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
> directories in dfs.data.dir are invalid.

Did you check that directory?

DataNode is complaining that it doesn't have any 'valid' directories to 
store data in.
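For example, running something like this on the slave (just a sketch; I'm assuming the
daemons run as the 'hdpusr' user, since that's the user in the path from your log) would
tell you whether the directory exists and is writable:

  # as the user that starts the datanode, on the slave
  ls -ld /tmp/hadoop-hdpusr/dfs/data
  # if it is missing or owned by root, recreate it with the right owner:
  mkdir -p /tmp/hadoop-hdpusr/dfs/data
  chown -R hdpusr /tmp/hadoop-hdpusr
  chmod -R u+rwx /tmp/hadoop-hdpusr/dfs/data
  # sanity check: this should succeed without an error
  touch /tmp/hadoop-hdpusr/dfs/data/probe && rm /tmp/hadoop-hdpusr/dfs/data/probe

Also note that your hadoop-site.xml points hadoop.tmp.dir at /home/hdusr/hadoop-${user.name},
while the log is complaining about /tmp/hadoop-hdpusr/..., so it looks like the slave may not
be picking up that config file at all.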

> Tasktracker_slav.log
> 2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
> start task tracker because java.net.UnknownHostException: unknown host:
> localhost
> 	at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> 	at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
> 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
> 	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
> 	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
> 

That probably means that the TaskTracker's hadoop-site.xml says that 
'localhost' is the JobTracker which isn't true...
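A quick way to check both things on the slave (hypothetical paths; substitute wherever
your conf directory actually lives):

  # does 'localhost' resolve at all on the slave?
  grep localhost /etc/hosts        # expect a line like: 127.0.0.1   localhost
  # what JobTracker address is the tasktracker really reading?
  grep -A1 mapred.job.tracker /path/to/hadoop/conf/hadoop-site.xml

The 'unknown host: localhost' in the tasktracker log suggests the slave's /etc/hosts is
missing the 127.0.0.1 localhost line entirely, which is worth fixing regardless of what
the config says.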

 > namenode-master.log
 > 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage
 > directory /tmp/hadoop-hdpusr/dfs/name does not exist.
 > 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping 
server
 > on 54310
 > 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
 > org.apache.hadoop.dfs.InconsistentFSStateException: Directory
 > /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage 
directory
 > does not exist or is not accessible.

That means that /tmp/hadoop-hdpusr/dfs/name doesn't exist or isn't 
accessible.
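If the directory really is missing, recreating the parent and re-formatting the namenode
normally clears this up. Sketch only (re-formatting wipes any existing HDFS metadata, and
I'm assuming the path from your log and that you run this as the hadoop user on the master):

  mkdir -p /tmp/hadoop-hdpusr/dfs
  chown -R hdpusr /tmp/hadoop-hdpusr
  bin/hadoop namenode -format      # from the hadoop install directory
  bin/start-dfs.sh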

-*-*-

Overall, this looks like an acute case of wrong-configuration-itis.

Have you got the same hadoop-site.xml on all your nodes?
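If in doubt, compare checksums and push the master's copy out (assuming hadoop is installed
at the same path on both machines; adjust the path as needed):

  # run from the hadoop install directory on the master
  md5sum conf/hadoop-site.xml
  ssh slave md5sum /path/to/hadoop/conf/hadoop-site.xml
  # if the two sums differ, copy the master's version over:
  scp conf/hadoop-site.xml slave:/path/to/hadoop/conf/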

More info here: 
http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html

Arun


> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker:
> SHUTDOWN_MSG: 
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
> ************************************************************/
> 
> 
> And all the ports are running 
> Some time it asks password and some time it wont while starting the dfs
> 
> Master logs
> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG: 
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
> ************************************************************/
> 
> Datanode-master.log
> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
> localhost/127.0.0.1:54310 not available yet, Zzzzz...
> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
> ***********************************************
> Jobtracker_master.log
> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker: problem
> cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
> java.net.ConnectException: Connection refused
> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> 	at java.net.Socket.connect(Socket.java:520)
> 	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> 	at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
> 	at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
> 	at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
> 	at
> org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
> 	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
> 	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
> 	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
> 	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283: error:
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> 	at
> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> 	at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293: error:
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> 	at
> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> 	at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304: error:
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> 	at
> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> 	at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
> SHUTDOWN_MSG: 
> /************************************************************
> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
> ************************************************************/
> 
> Tasktracker_master.log
> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker:
> STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting TaskTracker
> STARTUP_MSG:   host = master/172.16.0.25
> STARTUP_MSG:   args = []
> ************************************************************/
> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking Resource
> aliases
> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
> Jetty/5.1.4
> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
> org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> WebApplicationContext[/,/]
> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> HttpContext[/logs,/logs]
> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
> HttpContext[/static,/static]
> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started
> SocketListener on 0.0.0.0:50060
> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
> org.mortbay.jetty.Server@1431340
> 2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=TaskTracker, sessionId=
> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
> TaskTracker up at: /127.0.0.1:49599
> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker: Starting
> tracker tracker_master:/127.0.0.1:49599
> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> listener on 49599: starting
> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 0 on 49599: starting
> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
> handler 1 on 49599: starting
> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker: Starting
> thread: Map-events fetcher for all reduce tasks on
> tracker_master:/127.0.0.1:49599
> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost
> connection to JobTracker [localhost/127.0.0.1:54311].  Retrying...
> org.apache.hadoop.ipc.RemoteException:
> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
> not available yet
> 	at
> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> 
> 	at org.apache.hadoop.ipc.Client.call(Client.java:482)
> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> 	at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
> 	at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
> 	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
> *******************************************
> 
> Please help me to resolve the same.
> 
> 
> Khalil Honsali wrote:
> 
>>Hi,
>>
>>I think you need to post more information, for example an excerpt of the
>>failing datanode log. Also, please clarify the issue of connectivity:
>>- are you able to ssh passwordless (from master to slave, slave to master,
>>slave to slave, master to master), you shouldn't be entering passwrd
>>everytime...
>>- are you able to telnet (not necessary but preferred)
>>- have you verified the ports as RUNNING on using netstat command?
>>
>>besides, the tasktracker starts ok but not the datanode?
>>
>>K. Honsali
>>
>>On 02/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>>
>>>
>>>I am new to hadoop if any think wrong please correct me ....
>>>I Have configured a single/multi node cluster using following link
>>>
>>>http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>>>.
>>>I have followed the link but i am not able to start the haoop in multi
>>>node
>>>environment
>>>The problems i am facing are as Follows:
>>>1.I have configured master and slave nodes with ssh less pharase if try
>>>to
>>>run the start-dfs.sh it prompt the password for master:slave machines.(I
>>>have copied the .ssh/id_rsa.pub key of master in to slaves autherized_key
>>>file)
>>>
>>>2.After giving password datanode,namenode,jobtracker,tasktraker started
>>>successfully in master but datanode is started in slave.
>>>
>>>
>>>3.Some time step 2 works and some time it says that permission denied.
>>>
>>>4.I have checked the log file in the slave for datanode it says that
>>>incompatible node, then i have formated the slave, master and start the
>>>dfs
>>>by start-dfs.sh still i am getting the error
>>>
>>>
>>>The host entry in etc/hosts are both master/slave
>>>master
>>>slave
>>>conf/masters
>>>master
>>>conf/slaves
>>>master
>>>slave
>>>
>>>The hadoop-site.xml  for both master/slave
>>><?xml version="1.0"?>
>>><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>><!-- Put site-specific property overrides in this file. -->
>>>
>>><configuration>
>>><property>
>>>  <name>hadoop.tmp.dir</name>
>>>  <value>/home/hdusr/hadoop-${user.name}</value>
>>>  <description>A base for other temporary directories.</description>
>>></property>
>>>
>>><property>
>>>  <name>fs.default.name</name>
>>>  <value>hdfs://master:54310</value>
>>>  <description>The name of the default file system.  A URI whose
>>>  scheme and authority determine the FileSystem implementation.  The
>>>  uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>  the FileSystem implementation class.  The uri's authority is used to
>>>  determine the host, port, etc. for a filesystem.</description>
>>></property>
>>>
>>><property>
>>>  <name>mapred.job.tracker</name>
>>>  <value>master:54311</value>
>>>  <description>The host and port that the MapReduce job tracker runs
>>>  at.  If "local", then jobs are run in-process as a single map
>>>  and reduce task.
>>>  </description>
>>></property>
>>>
>>><property>
>>>  <name>dfs.replication</name>
>>>  <value>2</value>
>>>  <description>Default block replication.
>>>  The actual number of replications can be specified when the file is
>>>created.
>>>  The default is used if replication is not specified in create time.
>>>  </description>
>>></property>
>>>
>>><property>
>>>  <name>mapred.map.tasks</name>
>>>  <value>20</value>
>>>  <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>>>number of tasktrackers).
>>>  </description>
>>></property>
>>>
>>><property>
>>>  <name>mapred.reduce.tasks</name>
>>>  <value>4</value>
>>>  <description>As a rule of thumb, use 2x the number of slave processors
>>>(i.e., number of tasktrackers).
>>>  </description>
>>></property>
>>></configuration>
>>>
>>>Please help me to reslove the same. Or else provide any other tutorial
>>>for
>>>multi node cluster setup.I am egarly waiting for the tutorials.
>>>
>>>
>>>Thanks
>>>
>>>--
>>>View this message in context:
>>>http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>>>Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>
>>>
>>
>>
> 


Re: Not able to start Data Node

Posted by Dhaya007 <mg...@gmail.com>.
Thanks for your reply. I am using passwordless ssh from master to slave,
and the following are the logs (slave):
..datanode-slave.log
2007-12-19 19:30:55,237 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave/172.16.0.58
STARTUP_MSG:   args = []
************************************************************/
2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
directory in dfs.data.dir: directory is not writable:
/tmp/hadoop-hdpusr/dfs/data
2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
directories in dfs.data.dir are invalid.
2007-12-19 19:30:55,582 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.58
************************************************************/

Tasktracker_slave.log
2008-01-02 15:10:10,634 INFO org.apache.hadoop.mapred.TaskTracker:
STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = slave/172.16.0.58
STARTUP_MSG:   args = []
************************************************************/
2008-01-02 15:10:32,024 INFO org.mortbay.util.Credential: Checking Resource
aliases
2008-01-02 15:10:32,368 INFO org.mortbay.http.HttpServer: Version
Jetty/5.1.4
2008-01-02 15:10:33,853 INFO org.mortbay.util.Container: Started
org.mortbay.jetty.servlet.WebApplicationHandler@1ce784b
2008-01-02 15:10:34,039 INFO org.mortbay.util.Container: Started
WebApplicationContext[/,/]
2008-01-02 15:10:34,039 INFO org.mortbay.util.Container: Started
HttpContext[/logs,/logs]
2008-01-02 15:10:34,040 INFO org.mortbay.util.Container: Started
HttpContext[/static,/static]
2008-01-02 15:10:34,052 INFO org.mortbay.http.SocketListener: Started
SocketListener on 0.0.0.0:50060
2008-01-02 15:10:34,052 INFO org.mortbay.util.Container: Started
org.mortbay.jetty.Server@1827284
2008-01-02 15:10:34,101 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=TaskTracker, sessionId=
2008-01-02 15:10:34,235 INFO org.apache.hadoop.mapred.TaskTracker:
TaskTracker up at: /127.0.0.1:32772
2008-01-02 15:10:34,235 INFO org.apache.hadoop.mapred.TaskTracker: Starting
tracker tracker_slave:/127.0.0.1:32772
2008-01-02 15:10:34,246 INFO org.apache.hadoop.ipc.Server: IPC Server
listener on 32772: starting
2008-01-02 15:10:34,247 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 32772: starting
2008-01-02 15:10:34,248 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 32772: starting
2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
start task tracker because java.net.UnknownHostException: unknown host:
localhost
	at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
	at org.apache.hadoop.ipc.Client.call(Client.java:471)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
	at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)

2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker:
SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
************************************************************/


And all the ports are running.
Sometimes it asks for a password and sometimes it does not while starting the dfs.
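For reference, this is roughly how the key was set up (a sketch, assuming the default
~/.ssh paths), and the loop at the end is what I can run to see whether any direction
still prompts for a password:

  # on the master, as the hadoop user
  ssh-keygen -t rsa -P ""                                   # only once, default file
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys           # master -> master
  cat ~/.ssh/id_rsa.pub | ssh slave 'mkdir -p ~/.ssh; cat >> ~/.ssh/authorized_keys'
  chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys        # wrong permissions often cause the prompt
  ssh slave 'chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys'
  # none of these should ask for a password:
  for h in master slave; do ssh $h hostname; done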

Master logs
namenode-master.log
2008-01-02 14:44:01,017 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/172.16.0.25
STARTUP_MSG:   args = []
************************************************************/
2008-01-02 14:44:02,453 INFO org.apache.hadoop.dfs.NameNode: Namenode up at:
localhost.localdomain/127.0.0.1:54310
2008-01-02 14:44:02,458 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=NameNode, sessionId=null
2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage
directory /tmp/hadoop-hdpusr/dfs/name does not exist.
2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping server
on 54310
2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
org.apache.hadoop.dfs.InconsistentFSStateException: Directory
/tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage directory
does not exist or is not accessible.
	at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:153)
	at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:76)
	at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:221)
	at org.apache.hadoop.dfs.NameNode.init(NameNode.java:130)
	at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:168)
	at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:804)
	at org.apache.hadoop.dfs.NameNode.main(NameNode.java:813)

2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
************************************************************/

Datanode-master.log
2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
localhost/127.0.0.1:54310 not available yet, Zzzzz...
2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
***********************************************
Jobtracker_master.log
2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker: problem
cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:520)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
	at org.apache.hadoop.ipc.Client.call(Client.java:471)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
	at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
	at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
	at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
	at
org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283: error:
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
	at
org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293: error:
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
	at
org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304: error:
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
	at
org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
************************************************************/

Tasktracker_master.log
2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying connect
to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker:
STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = master/172.16.0.25
STARTUP_MSG:   args = []
************************************************************/
2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking Resource
aliases
2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
Jetty/5.1.4
2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
WebApplicationContext[/,/]
2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
HttpContext[/logs,/logs]
2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
HttpContext[/static,/static]
2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started
SocketListener on 0.0.0.0:50060
2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
org.mortbay.jetty.Server@1431340
2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
Initializing JVM Metrics with processName=TaskTracker, sessionId=
2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
TaskTracker up at: /127.0.0.1:49599
2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker: Starting
tracker tracker_master:/127.0.0.1:49599
2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
listener on 49599: starting
2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 0 on 49599: starting
2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 49599: starting
2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker: Starting
thread: Map-events fetcher for all reduce tasks on
tracker_master:/127.0.0.1:49599
2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost
connection to JobTracker [localhost/127.0.0.1:54311].  Retrying...
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object
not available yet
	at
org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

	at org.apache.hadoop.ipc.Client.call(Client.java:482)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
	at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
	at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
*******************************************
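
One thing I notice in the logs above: even though fs.default.name and
mapred.job.tracker point at master:54310 and master:54311, the daemons keep
retrying localhost/127.0.0.1, and the TaskTracker registers itself at
127.0.0.1:49599. Could it be that "master" resolves to a loopback address in
/etc/hosts? As far as I understand, the /etc/hosts entries on both machines
should look roughly like this (172.16.0.25 is the master address shown in the
STARTUP_MSG above; the slave address below is only a placeholder for the real
one):

127.0.0.1     localhost
172.16.0.25   master
172.16.0.26   slave

with no extra line mapping master or slave to 127.0.0.1 or 127.0.1.1.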

Please help me to resolve this.


Khalil Honsali wrote:
> 
> Hi,
> 
> I think you need to post more information, for example an excerpt of the
> failing datanode log. Also, please clarify the issue of connectivity:
> - are you able to ssh passwordless (from master to slave, slave to master,
> slave to slave, master to master)? You shouldn't be entering a password
> every time...
> - are you able to telnet between the nodes (not necessary, but preferred)?
> - have you verified that the ports are listening, using the netstat command?
> 
> Besides, the tasktracker starts OK but not the datanode?
> 
> K. Honsali
> 
> On 02/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>>
>>
>> I am new to hadoop if any think wrong please correct me ....
>> I Have configured a single/multi node cluster using following link
>>
>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>> .
>> I have followed the link but i am not able to start the haoop in multi
>> node
>> environment
>> The problems i am facing are as Follows:
>> 1.I have configured master and slave nodes with ssh less pharase if try
>> to
>> run the start-dfs.sh it prompt the password for master:slave machines.(I
>> have copied the .ssh/id_rsa.pub key of master in to slaves autherized_key
>> file)
>>
>> 2.After giving password datanode,namenode,jobtracker,tasktraker started
>> successfully in master but datanode is started in slave.
>>
>>
>> 3.Some time step 2 works and some time it says that permission denied.
>>
>> 4.I have checked the log file in the slave for datanode it says that
>> incompatible node, then i have formated the slave, master and start the
>> dfs
>> by start-dfs.sh still i am getting the error
>>
>>
>> The host entry in etc/hosts are both master/slave
>> master
>> slave
>> conf/masters
>> master
>> conf/slaves
>> master
>> slave
>>
>> The hadoop-site.xml  for both master/slave
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <!-- Put site-specific property overrides in this file. -->
>>
>> <configuration>
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/home/hdusr/hadoop-${user.name}</value>
>>   <description>A base for other temporary directories.</description>
>> </property>
>>
>> <property>
>>   <name>fs.default.name</name>
>>   <value>hdfs://master:54310</value>
>>   <description>The name of the default file system.  A URI whose
>>   scheme and authority determine the FileSystem implementation.  The
>>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>>   the FileSystem implementation class.  The uri's authority is used to
>>   determine the host, port, etc. for a filesystem.</description>
>> </property>
>>
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>master:54311</value>
>>   <description>The host and port that the MapReduce job tracker runs
>>   at.  If "local", then jobs are run in-process as a single map
>>   and reduce task.
>>   </description>
>> </property>
>>
>> <property>
>>   <name>dfs.replication</name>
>>   <value>2</value>
>>   <description>Default block replication.
>>   The actual number of replications can be specified when the file is
>> created.
>>   The default is used if replication is not specified in create time.
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.map.tasks</name>
>>   <value>20</value>
>>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>> number of tasktrackers).
>>   </description>
>> </property>
>>
>> <property>
>>   <name>mapred.reduce.tasks</name>
>>   <value>4</value>
>>   <description>As a rule of thumb, use 2x the number of slave processors
>> (i.e., number of tasktrackers).
>>   </description>
>> </property>
>> </configuration>
>>
>> Please help me to reslove the same. Or else provide any other tutorial
>> for
>> multi node cluster setup.I am egarly waiting for the tutorials.
>>
>>
>> Thanks
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14576700.html
Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Not able to start Data Node

Posted by Khalil Honsali <k....@gmail.com>.
Hi,

I think you need to post more information, for example an excerpt of the
failing datanode log. Also, please clarify the issue of connectivity:
- are you able to ssh passwordless (from master to slave, slave to master,
slave to slave, master to master)? You shouldn't be entering a password
every time...
- are you able to telnet between the nodes (not necessary, but preferred)?
- have you verified that the ports are listening, using the netstat command?
(See the example commands below.)

Besides, the tasktracker starts OK but not the datanode?
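
As a rough sketch (the hostnames "master" and "slave" and the ports 54310 /
54311 are taken from your hadoop-site.xml; adjust to your setup), the checks
could look like this:

# from the master, these should log in without prompting for a password
ssh master hostname
ssh slave hostname

# on the master, the NameNode and JobTracker ports should be listening
netstat -tln | grep -E '54310|54311'

# from the slave, check that the master ports are reachable
telnet master 54310
telnet master 54311

If ssh still asks for a password, check that ~/.ssh is mode 700 and
~/.ssh/authorized_keys is mode 600 on the target machine.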

K. Honsali

On 02/01/2008, Dhaya007 <mg...@gmail.com> wrote:
>
>
> I am new to hadoop if any think wrong please correct me ....
> I Have configured a single/multi node cluster using following link
>
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
> .
> I have followed the link but i am not able to start the haoop in multi
> node
> environment
> The problems i am facing are as Follows:
> 1.I have configured master and slave nodes with ssh less pharase if try to
> run the start-dfs.sh it prompt the password for master:slave machines.(I
> have copied the .ssh/id_rsa.pub key of master in to slaves autherized_key
> file)
>
> 2.After giving password datanode,namenode,jobtracker,tasktraker started
> successfully in master but datanode is started in slave.
>
>
> 3.Some time step 2 works and some time it says that permission denied.
>
> 4.I have checked the log file in the slave for datanode it says that
> incompatible node, then i have formated the slave, master and start the
> dfs
> by start-dfs.sh still i am getting the error
>
>
> The host entry in etc/hosts are both master/slave
> master
> slave
> conf/masters
> master
> conf/slaves
> master
> slave
>
> The hadoop-site.xml  for both master/slave
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/hdusr/hadoop-${user.name}</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://master:54310</value>
>   <description>The name of the default file system.  A URI whose
>   scheme and authority determine the FileSystem implementation.  The
>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>   the FileSystem implementation class.  The uri's authority is used to
>   determine the host, port, etc. for a filesystem.</description>
> </property>
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>master:54311</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
>
> <property>
>   <name>dfs.replication</name>
>   <value>2</value>
>   <description>Default block replication.
>   The actual number of replications can be specified when the file is
> created.
>   The default is used if replication is not specified in create time.
>   </description>
> </property>
>
> <property>
>   <name>mapred.map.tasks</name>
>   <value>20</value>
>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
> number of tasktrackers).
>   </description>
> </property>
>
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>4</value>
>   <description>As a rule of thumb, use 2x the number of slave processors
> (i.e., number of tasktrackers).
>   </description>
> </property>
> </configuration>
>
> Please help me to reslove the same. Or else provide any other tutorial for
> multi node cluster setup.I am egarly waiting for the tutorials.
>
>
> Thanks
>
> --
> View this message in context:
> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
> Sent from the Hadoop Users mailing list archive at Nabble.com.
>
>