Posted to user@hbase.apache.org by "Taylor, Ronald C" <ro...@pnl.gov> on 2011/03/17 18:55:14 UTC

problem bringing Hbase back up after power outage and removal of nodes

Folks,

We had a power outage here, and we are trying to bring our Hadoop/HBase cluster back up. Hadoop has been just fine - came up smoothly. HBase has not. Our HBase master log file is filled with just one msg:

2011-03-17 10:50:08,712 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode...
2011-03-17 10:50:18,714 INFO org.apache.hadoop.hbase.util.FSUtils: Waiting for dfs to exit safe mode...

After the power outage, before we tried bringing things back up, we took several nodes off-line, as their hard drives were failing and needed replacing. Don't know if the loss of those nodes has anything to do with the error msg that we are seeing.

Could anybody give us some advice as to where to look for the cause of the HBase failure? We would very much appreciate guidance.

Ron

Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory (U.S. Dept of Energy/Battelle)
Richland, WA 99352
phone: (509) 372-6568
email: ronald.taylor@pnl.gov



Re: problem bringing Hbase back up after power outage and removal of nodes

Posted by Ryan Rawson <ry...@gmail.com>.
If you are in safe mode, it's because the namenode has not yet heard
from enough datanodes to account for the blocks it expects.  So
actually NO, your Hadoop did NOT come up properly.

Check your namenode (nn) web pages and look for any missing nodes.
They won't tell you much more than which nodes are online and which
are not.
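
The same checks can also be made from the command line. What follows
is a minimal sketch, not a definitive procedure: it assumes the stock
`hadoop` launcher is on the PATH, and the `HADOOP` variable and the
three function names are hypothetical helpers introduced only for
this example.

```shell
#!/bin/sh
# HADOOP is a hypothetical override for pointing at a specific install;
# by default the `hadoop` launcher found on PATH is used.
HADOOP="${HADOOP:-hadoop}"

# Report whether the namenode is currently in safe mode.
safemode_status() {
    "$HADOOP" dfsadmin -safemode get
}

# List the datanodes the namenode knows about, so that nodes which
# have not reported in can be spotted.
datanode_report() {
    "$HADOOP" dfsadmin -report
}

# Check the filesystem for missing, corrupt, or under-replicated blocks.
block_health() {
    "$HADOOP" fsck /
}
```

Running safemode_status first, then datanode_report, then
block_health should show whether the nodes that were taken off-line
held blocks the namenode is still waiting for.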

Good luck!
-ryan

On Thu, Mar 17, 2011 at 11:12 AM, Vishal Kapoor
<vi...@gmail.com> wrote:
> you should have more info on why dfs is in the safe mode in the logs,
> you can always leave safe mode
>
> hadoop dfsadmin -safemode leave
>
> but again, that's a symptom, not the problem.
>
> Vishal

RE: problem bringing Hbase back up after power outage and removal of nodes

Posted by "Taylor, Ronald C" <ro...@pnl.gov>.
Ryan, Vishal,

Yep, right after I sent the email we figured out that the problem was on the Hadoop side. We are tracking it down; thanks for the very quick responses.
Ron

Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory (U.S. Dept of Energy/Battelle)
Richland, WA 99352
phone: (509) 372-6568
email: ronald.taylor@pnl.gov



Re: problem bringing Hbase back up after power outage and removal of nodes

Posted by Vishal Kapoor <vi...@gmail.com>.
You should find more info in the logs on why dfs is in safe mode.
You can always force it to leave safe mode:

hadoop dfsadmin -safemode leave

But again, safe mode is a symptom, not the underlying problem.
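
One way to avoid hiding the underlying problem is to leave safe mode
only after fsck reports the filesystem as healthy. This is a rough
sketch under stated assumptions: the `HADOOP` variable and the
function name are hypothetical, and matching on the "is HEALTHY"
text in the fsck summary is an assumption about its output that
should be checked against your Hadoop version.

```shell
#!/bin/sh
# HADOOP is a hypothetical override; defaults to `hadoop` on PATH.
HADOOP="${HADOOP:-hadoop}"

# Force the namenode out of safe mode only when fsck reports a healthy
# filesystem; otherwise stay in safe mode and return non-zero.
leave_safemode_if_healthy() {
    # Assumption: the fsck summary contains "is HEALTHY" when no
    # blocks are missing or corrupt.
    if "$HADOOP" fsck / | grep -q "is HEALTHY"; then
        "$HADOOP" dfsadmin -safemode leave
    else
        echo "fsck did not report a healthy filesystem; staying in safe mode" >&2
        return 1
    fi
}
```

If the function refuses to leave safe mode, the fsck output and the
namenode logs are the place to look for the missing blocks.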

Vishal
