Posted to general@hadoop.apache.org by Xudong Du <an...@gmail.com> on 2009/03/15 14:58:32 UTC

[ask for help] Bad connection to FS. command aborted.

Hello everyone. I have run into a strange problem and am hoping for your help.

I set up a two-node Hadoop cluster on two machines running Ubuntu: one node is
both master and slave, and the other is a slave only. It seemed to be working,
but since yesterday it has stopped working.

When I run a Hadoop command such as "bin/hadoop dfs -ls", it prints "Bad
connection to FS. command aborted."
So I ran "bin/start-all.sh" to see whether DFS was down, but it showed:

(***.***.***.*** stands for the master's IP address; ###.###.###.### stands for
the other node's IP address)
starting namenode, logging to *****
***.***.***.***: datanode running as process 14800. Stop it first.
***.***.***.***: datanode running as process 9113. Stop it first.
***.***.***.***: starting secondarynamenode, logging to ***
jobtracker running as process 14946. Stop it first.
***.***.***.***: tasktracker running as process 15029. Stop it first.
###.###.###.###: tasktracker running as process 9217. Stop it first.

Then I typed "bin/hadoop dfs -ls" again, but got the same result: Bad
connection to FS. command aborted.

Then, when I ran "bin/stop-all.sh" to shut everything down, it showed:
stopping jobtracker
***.***.***.***: stopping tasktracker
###.###.###.###: stopping tasktracker
no namenode to stop  <--------this is very strange
***.***.***.***: stopping datanode
###.###.###.###: stopping datanode
***.***.***.***: no secondarynamenode to stop

When I restart it with "bin/start-all.sh", it still does not work.

Can anyone give me some ideas? Thanks a lot.

-- 
Yours Sincerely
Xudong Du
Zijing 2# 305A
Tsinghua University, Beijing, China, 100084

Re: [ask for help] Bad connection to FS. command aborted.

Posted by Vadim Zaliva <lo...@codeminders.com>.
On Mar 24, 2009, at 14:44, Grant Mackey wrote:

> Is this your first time using Hadoop, or was this cluster running
> and then broke? Were you trying to update Hadoop?


Grant,

The cluster has been running for a while, and I have quite a few tasks that I
execute regularly. It is just an intermittent problem that I observe only with
Pig on some tasks.

From googling, it seems like a deadlock bug in Java threading that people have
reported for other Java programs as well, not only Pig/Hadoop.
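
If it happens again, a JVM thread dump should confirm the diagnosis; jstack
(which ships with the JDK) reports Java-level deadlocks explicitly. A minimal
check, where <pid> stands for the process id of the hung JVM (as listed by
jps, for example):

jstack <pid>    # a detected deadlock is printed as "Found one Java-level deadlock:"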

Vadim

--
"La perfection est atteinte non quand il ne reste rien a ajouter, mais
quand il ne reste rien a enlever."  (Antoine de Saint-Exupery)




Re: Fwd: [ask for help] Bad connection to FS. command aborted.

Posted by Grant Mackey <gm...@cs.ucf.edu>.
Is this your first time using Hadoop, or was this cluster running and then
broke? Were you trying to update Hadoop?

If this is the first time you have run it, did you format the namenode? If
not, run bin/hadoop namenode -format.
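
For example, a minimal sequence on the master; note that formatting erases any
existing HDFS metadata, so only do this on a fresh or expendable install:

bin/stop-all.sh                # make sure no old daemons are left running
bin/hadoop namenode -format    # re-initializes the namenode's storage directory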

Start the cluster using bin/start-dfs.sh; that way you don't have to worry
about the job/tasktrackers.
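
That would look something like:

bin/start-dfs.sh    # starts only the namenode, secondarynamenode, and datanodes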

Log into whatever node you have established as the master and simply run
bin/hadoop namenode to see what errors are reported; generally this will give
you a better idea of what is happening. Let me know the namenode's output so I
can help further.
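
For example, assuming the standard layout where the namenode writes its log to
logs/hadoop-<user>-namenode-<hostname>.log under the Hadoop install directory:

bin/hadoop namenode                         # foreground: errors print straight to the console
tail -n 100 logs/hadoop-*-namenode-*.log    # or inspect the end of the namenode log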

thanks

  - Grant

Quoting Xudong Du <an...@gmail.com>:

> Just now I miswrote a command in my email: the command shown before the
> second output listing should be "bin/stop-all.sh", not "bin/hadoop dfs -ls".
> Sorry. Waiting for your help. Thanks a lot.
>
> ---------- Forwarded message ----------
> From: Xudong Du <an...@gmail.com>
> Date: Sun, Mar 15, 2009 at 9:58 PM
> Subject: [ask for help] Bad connection to FS. command aborted.
> To: general@hadoop.apache.org
>
>
> [...]



Grant Mackey
UCF Research Assistant
Engineering III
Rm 238 Cubicle 1



Fwd: [ask for help] Bad connection to FS. command aborted.

Posted by Xudong Du <an...@gmail.com>.
Just now I miswrote a command in my email: the command shown before the second
output listing should be "bin/stop-all.sh", not "bin/hadoop dfs -ls".
Sorry. Waiting for your help. Thanks a lot.

---------- Forwarded message ----------
From: Xudong Du <an...@gmail.com>
Date: Sun, Mar 15, 2009 at 9:58 PM
Subject: [ask for help] Bad connection to FS. command aborted.
To: general@hadoop.apache.org


[...]



-- 
Yours Sincerely
Xudong Du
Zijing 2# 305A
Tsinghua University, Beijing, China, 100084