Posted to common-user@hadoop.apache.org by bikash sharma <sh...@gmail.com> on 2011/02/26 16:25:53 UTC

TaskTracker not starting on all nodes

Hi,
I have a 10-node Hadoop cluster on which I am running some benchmarks for
experiments.
Surprisingly, when I start the cluster (hadoop/bin/start-mapred.sh), in many
instances only some nodes have a TaskTracker process up (as seen with jps),
while the other nodes have none. Could anyone please explain?

Thanks,
Bikash

Re: TaskTracker not starting on all nodes

Posted by Allen Wittenauer <aw...@apache.org>.

(Removing common-dev, because this isn't a dev question)

On Feb 26, 2011, at 7:25 AM, bikash sharma wrote:

> Hi,
> I have a 10 nodes Hadoop cluster, where I am running some benchmarks for
> experiments.
> Surprisingly, when I initialize the Hadoop cluster
> (hadoop/bin/start-mapred.sh), in many instances, only some nodes have
> TaskTracker process up (seen using jps), while other nodes do not have
> TaskTrackers. Could anyone please explain?

	Check your logs.

Re: TaskTracker not starting on all nodes

Posted by icebergs <hk...@gmail.com>.

Put the master's public key into the slave's authorized_keys (the key of the
machine you ssh from goes on the machine you ssh to). On the slave:

cat public_key >> ~/.ssh/authorized_keys
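
For reference, a minimal sketch of that step as a shell function. It assumes OpenSSH defaults and that the public key of the machine you connect from (the master, when start-mapred.sh is run there) has already been copied to the machine you connect to. The name install_pubkey and the default path are illustrative, not part of Hadoop or OpenSSH.

```shell
#!/usr/bin/env bash
# install_pubkey is a hypothetical helper name (an assumption for this sketch).
# It appends a public key to an authorized_keys file only once and sets the
# permissions that sshd's default StrictModes setting expects.
install_pubkey() {
    local pubkey_file="$1"
    local auth_file="${2:-$HOME/.ssh/authorized_keys}"
    local auth_dir key
    auth_dir="$(dirname "$auth_file")"
    key="$(cat "$pubkey_file")"

    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    # -x: match the whole line, -F: fixed string; skip the append if present
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}
```

On the slave this would be called as `install_pubkey master_id_rsa.pub` (the file name is a placeholder). Where ssh-copy-id is available, running `ssh-copy-id user@slave` from the master does roughly the same thing. If ssh still asks for a password afterwards, loose permissions on ~/.ssh or authorized_keys are worth checking, since sshd may ignore the file in that case.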

2011/3/4 MANISH SINGLA <co...@gmail.com>

> Hii all,
> I am trying to setup a 2 node cluster...I have configured all the
> files as specified in the tutorial I am refering to...I copied the
> public key to the slave's machine...but when I ssh to the slave from
> the master, it asks for password everytime...kindly help...

Re: TaskTracker not starting on all nodes

Posted by James Seigel <ja...@tynt.com>.

Sounds like just a bit more work on understanding ssh will get you there.

What you are looking for is getting the master's public key into authorized_keys on the slave.

James

Sent from my mobile. Please excuse the typos.

On 2011-03-04, at 2:58 AM, MANISH SINGLA <co...@gmail.com> wrote:

> Hii all,
> I am trying to setup a 2 node cluster...I have configured all the
> files as specified in the tutorial I am refering to...I copied the
> public key to the slave's machine...but when I ssh to the slave from
> the master, it asks for password everytime...kindly help...

Re: TaskTracker not starting on all nodes

Posted by MANISH SINGLA <co...@gmail.com>.

Hi all,
I am trying to set up a 2-node cluster. I have configured all the
files as specified in the tutorial I am referring to, and I copied the
public key to the slave machine, but when I ssh to the slave from
the master, it asks for a password every time. Kindly help.

On Fri, Mar 4, 2011 at 11:12 AM, icebergs <hk...@gmail.com> wrote:
> You can check the logs whose tasktracker isn't up.
> The path is "HADOOP_HOME/logs/".
> The answer may be in it.

Re: TaskTracker not starting on all nodes

Posted by icebergs <hk...@gmail.com>.

Check the logs on the nodes whose TaskTracker isn't up.
They are under "$HADOOP_HOME/logs/".
The answer may be in there.
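
A sketch of what that check could look like on a node whose TaskTracker is missing. It assumes the stock log naming used by hadoop-daemon.sh (hadoop-<user>-tasktracker-<host>.log and .out) and the default log directory; show_tasktracker_errors is a made-up helper name.

```shell
#!/usr/bin/env bash
# show_tasktracker_errors is a hypothetical helper name (an assumption).
# It prints the last few error-looking lines from each TaskTracker log file.
show_tasktracker_errors() {
    local log_dir="${1:-${HADOOP_LOG_DIR:-$HADOOP_HOME/logs}}"
    local f
    for f in "$log_dir"/hadoop-*-tasktracker-*.log \
             "$log_dir"/hadoop-*-tasktracker-*.out; do
        # an unmatched glob stays literal, so skip names that do not exist
        [ -e "$f" ] || continue
        echo "== $f"
        # "|| true" keeps the function quiet when a file has no matches
        grep -E 'ERROR|FATAL|Exception' "$f" | tail -n 20 || true
    done
}
```

Calling `show_tasktracker_errors` with no argument looks in $HADOOP_LOG_DIR or $HADOOP_HOME/logs. The .out file is worth reading even when the .log looks clean, because startup failures that happen before logging is set up tend to land there.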

2011/3/2 bikash sharma <sh...@gmail.com>

> Hi Sonal,
> Thanks. I guess you are right. ps -ef exposes such processes.
>
> -bikash

Re: TaskTracker not starting on all nodes

Posted by bikash sharma <sh...@gmail.com>.

Hi Sonal,
Thanks. I guess you are right. ps -ef exposes such processes.

-bikash
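
A small sketch of that check. jps only lists JVMs it can discover for the current user, so a TaskTracker started by another user or left over from an earlier run may not show up, while ps -ef lists every process. The filter below assumes the TaskTracker main class is org.apache.hadoop.mapred.TaskTracker, as in Hadoop 0.20; find_tasktracker_procs is an illustrative name.

```shell
#!/usr/bin/env bash
# find_tasktracker_procs is a hypothetical helper name (an assumption).
# It reads `ps -ef`-style lines on stdin and keeps those that mention the
# TaskTracker main class, dropping the grep processes themselves.
find_tasktracker_procs() {
    grep -F 'org.apache.hadoop.mapred.TaskTracker' | grep -v -w 'grep' || true
}
```

Usage on an errant node: `ps -ef | find_tasktracker_procs`. If a stale TaskTracker shows up, hadoop-daemon.sh may refuse to start a new one, so stopping or killing the old process before rerunning start-mapred.sh is worth trying.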

On Tue, Mar 1, 2011 at 1:29 PM, Sonal Goyal <so...@gmail.com> wrote:

> Bikash,
>
> I have sometimes found hanging processes which jps does not report, but a
> ps -ef shows them. Maybe you can check this on the errant nodes..
>
> Thanks and Regards,
> Sonal
> Hadoop ETL and Data Integration <https://github.com/sonalgoyal/hiho>
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>

Re: TaskTracker not starting on all nodes

Posted by bikash sharma <sh...@gmail.com>.

Hi James,
Sorry for the late response. No, the same problem persists. I reformatted
HDFS, stopped the mapred and hdfs daemons, and restarted them (using start-dfs.sh
and start-mapred.sh from the master node). But surprisingly, in a 4-node
cluster, two nodes have a TaskTracker running while the other two do not
(verified using jps). Could the issue be that I have Hadoop installed on
shared storage? Btw, how do I start the services independently on each node?

-bikash
On Sun, Feb 27, 2011 at 11:05 PM, James Seigel <ja...@tynt.com> wrote:

> .... Did you get it working?  What was the fix?

Re: TaskTracker not starting on all nodes

Posted by James Seigel <ja...@tynt.com>.

Did you get it working? What was the fix?

Sent from my mobile. Please excuse the typos.

On 2011-02-27, at 8:43 PM, Simon <gs...@gmail.com> wrote:

> Hey Bikash,
>
> Maybe you can manually start a  tasktracker on the node and see if there are
> any error messages. Also, don't forget to check your configure files for
> mapreduce and hdfs and make sure datanode can start successfully first.
> After all these steps, you can submit a job on the master node and see if
> there are any communication between these failed nodes and the master node.
> Post your error messages here if possible.
>
> HTH.
> Simon -

Re: TaskTracker not starting on all nodes

Posted by Simon <gs...@gmail.com>.

Hey Bikash,

Maybe you can manually start a tasktracker on the node and see if there are
any error messages. Also, don't forget to check your configuration files for
mapreduce and hdfs, and make sure the datanode can start successfully first.
After all these steps, you can submit a job on the master node and see if
there is any communication between the failed nodes and the master node.
Post your error messages here if possible.

HTH.
Simon -
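
A sketch of starting the daemon by hand on one node, assuming a Hadoop 0.20-style layout where bin/hadoop-daemon.sh sits under HADOOP_HOME; start_tasktracker_here is an illustrative wrapper name.

```shell
#!/usr/bin/env bash
# start_tasktracker_here is a hypothetical wrapper name (an assumption).
# It starts a TaskTracker on the local node only, without going through
# start-mapred.sh and ssh from the master.
start_tasktracker_here() {
    # ${VAR:?msg} aborts with the message if HADOOP_HOME is unset or empty
    "${HADOOP_HOME:?set HADOOP_HOME first}/bin/hadoop-daemon.sh" start tasktracker
}
```

Run it on the affected node itself; replacing `tasktracker` with `datanode` in the same command starts the DataNode. To see errors directly on the terminal, `$HADOOP_HOME/bin/hadoop tasktracker` runs the TaskTracker in the foreground in that release line.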

On Sat, Feb 26, 2011 at 10:44 AM, bikash sharma <sh...@gmail.com> wrote:

> Thanks James. Well all the config. files and shared keys are on a shared
> storage that is accessed by all the nodes in the cluster.
> At times, everything runs fine on initialization, but at other times, the
> same problem persists, so was bit confused.
> Also, checked the TaskTracker logs on those nodes, there does not seem to
> be any error.

-- 
Regards,
Simon

Re: TaskTracker not starting on all nodes

Posted by bikash sharma <sh...@gmail.com>.

Thanks James. All the config files and shared keys are on shared
storage that is accessed by all the nodes in the cluster.
At times everything runs fine on initialization, but at other times the
same problem appears, so I was a bit confused.
I also checked the TaskTracker logs on those nodes, and there does not seem to be
any error.

-bikash

On Sat, Feb 26, 2011 at 10:30 AM, James Seigel <ja...@tynt.com> wrote:

> Maybe your ssh keys aren’t distributed the same on each machine or the
> machines aren’t configured the same?
>
> J

Re: TaskTracker not starting on all nodes

Posted by James Seigel <ja...@tynt.com>.

Maybe your ssh keys aren’t distributed the same on each machine or the machines aren’t configured the same?

J
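
One way to test that theory is to try a non-interactive ssh to every host the start scripts will contact. The sketch below assumes the host list lives in conf/slaves under HADOOP_HOME, one host per line, as in Hadoop 0.20; check_slaves is an illustrative name, and trailing comments on a host line are not handled.

```shell
#!/usr/bin/env bash
# check_slaves is a hypothetical helper name (an assumption for this sketch).
# It reports, per host, whether passwordless ssh from this machine works.
check_slaves() {
    local slaves_file="${1:-$HADOOP_HOME/conf/slaves}"
    local host
    while read -r host; do
        # skip blank lines and full-line comments
        case "$host" in ''|'#'*) continue ;; esac
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # -n stops ssh from reading the rest of the slaves file on stdin
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done < "$slaves_file"
}
```

Run `check_slaves` on the master as the user that runs start-mapred.sh. Any host reported as FAILED would also be skipped by the start scripts, which matches the symptom of TaskTrackers missing on only some nodes.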

On 2011-02-26, at 8:25 AM, bikash sharma wrote:

> Hi,
> I have a 10 nodes Hadoop cluster, where I am running some benchmarks for
> experiments.
> Surprisingly, when I initialize the Hadoop cluster
> (hadoop/bin/start-mapred.sh), in many instances, only some nodes have
> TaskTracker process up (seen using jps), while other nodes do not have
> TaskTrackers. Could anyone please explain?
> 
> Thanks,
> Bikash