Posted to common-user@hadoop.apache.org by Sandeep L <sa...@outlook.com> on 2013/11/27 05:59:52 UTC

Suddenly NameNode stopped responding

Hi,
A couple of hours ago the NameNode of our production cluster suddenly stopped responding, and as a result our HBase also stopped responding (as expected).
The mysterious part is that we are unable to find any reason for the NameNode interruption.
I went through all the NameNode log files and couldn't find any exception in them.
Can someone guess what the probable cause of this issue could be?
Has anyone faced a similar issue before?
We are using hbase-0.92.1 with hadoop-1.0.2.
If you need any other information, please let me know.
Thanks, Sandeep.

Re: Suddenly NameNode stopped responding

Posted by Azuryy Yu <az...@gmail.com>.
Sandeep,

Please also take a look here: http://hbase.apache.org/book.html#hadoop

PS: HDFSv2 supports HA.



On Thu, Nov 28, 2013 at 2:31 PM, Sandeep L <sa...@outlook.com> wrote:

> Hi,
> Thanks for update.
> After spending quite a bit of time on Hadoop/HBase I couldn't find any
> thing awkward in logs.
> At last what I got to know is the reason for outage is IO Error thrown by
> the one of disk in which we are storing NameNode files.
>
> One more suggestion we need is regarding NameNode HA.
> Since we are using hbase-0.94.1 which version of Hadoop we should apt for
> NameNode HA.
> We can't move away from HBase 0.94.1 in near future, and we want to adapt
> NameNode HA.
> Can someone suggest us some suitable solutions for us?
>
> Thanks,Sandeep.
>
> From: bharathv@cloudera.com
> Date: Wed, 27 Nov 2013 10:56:44 +0530
> Subject: Re: Suddenly NameNode stopped responding
> To: user@hbase.apache.org
> CC: user@hadoop.apache.org
>
> It is difficult to guess the reason behind this outage without the logs.
> Can we have a look at them? (pastebin). Did you configure HA for namenode?
> Did it failover to standby?
>
>
>
> On Wed, Nov 27, 2013 at 10:29 AM, Sandeep L <sa...@outlook.com>
> wrote:
>
>
> Hi,
>
> Couple of hours back all of sudden NameNode of our production cluster got
> stopped responding, due to this our HBase also stopped responding(as
> expected).
>
> Here mysterious thing is we unable to get any reason for NameNode
> interruption.
>
> I went through all log files of NameNode and I couldn't find any exception
> in logs.
>
> Can someone guess what could be the probable reason for this issue?
>
> Any one previously faced similar issue?
>
> We are using hbase-0.92.1 with hadoop-1.0.2
>
> If you need any other information please let me know.
>
> Thanks,Sandeep.
>
> --
> Bharath Vissapragada
>
>
>
>
>

RE: Suddenly NameNode stopped responding

Posted by Sandeep L <sa...@outlook.com>.
Hi,
Thanks for the update.
After spending quite a bit of time on Hadoop/HBase, I couldn't find anything unusual in the logs.
What I finally found is that the outage was caused by an I/O error thrown by one of the disks on which we store the NameNode files.
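
For readers hitting the same symptom (a disk I/O error that leaves no exception in the NameNode logs), a rough way to check storage directories is to attempt a small write plus fsync in each one. This is only a sketch: the function names `probe_dir` and `probe_all` are hypothetical, and the directories passed in are assumed to be whatever paths your NameNode is configured to store its files in.

```python
import os
import tempfile


def probe_dir(path):
    """Try a small write + fsync inside `path`.

    Returns None on success, or the error message if the
    operating system reports a failure (for example an I/O error).
    """
    try:
        fd, tmp = tempfile.mkstemp(dir=path, prefix=".probe-")
        try:
            os.write(fd, b"ok")
            os.fsync(fd)  # force the data to disk so device errors surface
        finally:
            os.close(fd)
            os.remove(tmp)
        return None
    except OSError as err:
        return str(err)


def probe_all(paths):
    """Map each directory to None (healthy) or an error message."""
    return {p: probe_dir(p) for p in paths}
```

A probe like this only catches errors that show up on a fresh write; the kernel log and the disk's own diagnostics remain the more reliable sources.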

We need one more suggestion, regarding NameNode HA.
Since we are using hbase-0.94.1, which version of Hadoop should we opt for to get NameNode HA?
We can't move away from HBase 0.94.1 in the near future, and we want to adopt NameNode HA.
Can someone suggest a suitable solution for us?

Thanks, Sandeep.

From: bharathv@cloudera.com
Date: Wed, 27 Nov 2013 10:56:44 +0530
Subject: Re: Suddenly NameNode stopped responding
To: user@hbase.apache.org
CC: user@hadoop.apache.org

It is difficult to guess the reason behind this outage without the logs. Can we have a look at them? (pastebin). Did you configure HA for namenode? Did it failover to standby?



On Wed, Nov 27, 2013 at 10:29 AM, Sandeep L <sa...@outlook.com> wrote:


Hi,

Couple of hours back all of sudden NameNode of our production cluster got stopped responding, due to this our HBase also stopped responding(as expected).

Here mysterious thing is we unable to get any reason for NameNode interruption.

I went through all log files of NameNode and I couldn't find any exception in logs.

Can someone guess what could be the probable reason for this issue?

Any one previously faced similar issue?

We are using hbase-0.92.1 with hadoop-1.0.2

If you need any other information please let me know.

Thanks,Sandeep.                                           

-- 
Bharath Vissapragada



 		 	   		  

Re: Suddenly NameNode stopped responding

Posted by Azuryy Yu <az...@gmail.com>.
I don't think there is HA, because Sandeep is using Hadoop-1.0.2.

@Sandeep, did you check GC logs?
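
As an illustration of what checking the GC logs can look like, here is a minimal sketch that scans a GC log for long pauses. It assumes GC logging was enabled on the NameNode JVM with detailed output (for example `-XX:+PrintGCDetails`), so that log lines carry a `real=... secs` field as HotSpot prints it. The function name `long_pauses` and the 5-second threshold are assumptions chosen for the example, not values from this thread.

```python
import re

# HotSpot detailed GC lines end with e.g. "[Times: user=0.05 sys=0.00, real=0.01 secs]"
PAUSE_RE = re.compile(r"real=(\d+(?:\.\d+)?) secs")


def long_pauses(lines, threshold_secs=5.0):
    """Return (line_number, seconds) for every GC event whose
    wall-clock time is at least `threshold_secs`."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = PAUSE_RE.search(line)
        if match:
            seconds = float(match.group(1))
            if seconds >= threshold_secs:
                hits.append((number, seconds))
    return hits
```

Usage would be along the lines of `long_pauses(open(path))`, where `path` points at the GC log. A pause of many seconds around the time of the outage would suggest the NameNode was stalled in garbage collection rather than crashed.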


On Wed, Nov 27, 2013 at 1:26 PM, Bharath Vissapragada <bharathv@cloudera.com> wrote:

> It is difficult to guess the reason behind this outage without the logs.
> Can we have a look at them? (pastebin). Did you configure HA for namenode?
> Did it failover to standby?
>
>
> On Wed, Nov 27, 2013 at 10:29 AM, Sandeep L <sandeepvreddy@outlook.com> wrote:
>
> > Hi,
> > Couple of hours back all of sudden NameNode of our production cluster got
> > stopped responding, due to this our HBase also stopped responding(as
> > expected).
> > Here mysterious thing is we unable to get any reason for NameNode
> > interruption.
> > I went through all log files of NameNode and I couldn't find any exception
> > in logs.
> > Can someone guess what could be the probable reason for this issue?
> > Any one previously faced similar issue?
> > We are using hbase-0.92.1 with hadoop-1.0.2
> > If you need any other information please let me know.
> > Thanks,Sandeep.
>
>
>
>
> --
> Bharath Vissapragada
> <http://www.cloudera.com>
>

Re: Suddenly NameNode stopped responding

Posted by Bharath Vissapragada <bh...@cloudera.com>.
It is difficult to guess the reason behind this outage without the logs.
Can we have a look at them (e.g. via pastebin)? Did you configure HA for the NameNode?
Did it fail over to the standby?


On Wed, Nov 27, 2013 at 10:29 AM, Sandeep L <sa...@outlook.com> wrote:

> Hi,
> Couple of hours back all of sudden NameNode of our production cluster got
> stopped responding, due to this our HBase also stopped responding(as
> expected).
> Here mysterious thing is we unable to get any reason for NameNode
> interruption.
> I went through all log files of NameNode and I couldn't find any exception
> in logs.
> Can someone guess what could be the probable reason for this issue?
> Any one previously faced similar issue?
> We are using hbase-0.92.1 with hadoop-1.0.2
> If you need any other information please let me know.
> Thanks,Sandeep.




-- 
Bharath Vissapragada
<http://www.cloudera.com>
