Posted to user@hbase.apache.org by Jack Levin <ma...@gmail.com> on 2011/01/08 03:32:34 UTC

log replay failures, how to resolve

Greetings all.  I have been observing some interesting problems that
sometimes make hbase start/restart very hard to achieve.  Here is a
situation:

Power goes out on a rack and kills some datanodes and some regionservers.

We power things back on, HDFS reports all datanodes back to normal,
and we cold-restart hbase.
Obviously we have some log files in the /hbase/.logs directory on
HDFS.  So, when the master starts, it scans that dir and attempts to
replay the logs and insert all the data into the region files; so far
so good...

Now, in some instances, we get this message:

hbase-root-master-rdag1.prod.imageshack.com.log.2011-01-02:2011-01-02
20:47:37,343 WARN org.apache.hadoop.hbase.util.FSUtils: Waited
121173ms for lease recovery on
hdfs://namenode:9000/hbase/.logs/mtae6.prod.imageshack.com,60020,1293990443946/10.103.5.6%3A60020.1294005303076:org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:
failed to create file
/hbase/.logs/mtae6.prod.imageshack.com,60020,1293990443946/10.103.5.6%3A60020.1294005303076
for DFSClient_hb_m_10.101.7.1:60000_1294029805305 on client
10.101.7.1, because this file is already being created by NN_Recovery
on 10.101.7.1

Those messages (in master.log) will spew continuously and hbase will
not start.  My understanding is that the namenode, or maybe some
datanode, is holding a lease on the file, and the master is unable to
process it.  Left by itself, the problem will not go away.  The only
way to resolve it is to shut down the master and do:

hadoop fs -cp /hbase/.logs/* /tmp/.logs
hadoop fs -rm /hbase/.logs/*
hadoop fs -mv /tmp/.logs/* /hbase/.logs/

Start the master, and things are back to normal (all logs replay, the
master starts).
So, a question: is there some sort of HDFS setting (or are we hitting
a bug) to instruct the lease to be removed automatically?  A timer
maybe?  Could the master be granted the authority to copy a file to a
new name and then replay that?  It seems silly that the master
shouldn't be able to do that; after all, it's an hbase log file anyway.
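
For what it's worth, a rough sketch of that manual workaround as a
single script, assuming the same /hbase root and a /tmp scratch
directory, and that the master has already been shut down:

#!/bin/sh
# Rewrite the WAL files so the stale HDFS leases no longer apply:
# copy everything under /hbase/.logs aside, delete the originals
# (-rmr rather than -rm, since the .logs entries are per-regionserver
# directories), then move the copies back under their original names.
set -e
hadoop fs -mkdir /tmp/.logs || true        # ignore "already exists"
hadoop fs -cp  '/hbase/.logs/*' /tmp/.logs
hadoop fs -rmr '/hbase/.logs/*'
hadoop fs -mv  '/tmp/.logs/*'  /hbase/.logs/
# Start the master afterwards and let it replay the rewritten logs.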

Next, there is this situation:

2011-01-02 20:56:58,219 WARN org.apache.hadoop.hdfs.DFSClient: Error
Recovery for block blk_-1736208949609845257_8228359 failed  because
recovery from primary datanode 10.101.1.6:50010 failed 6 times.
Pipeline was 10.101.6.1:50010, 10.103.5.8:50010, 10.103.5.6:50010,
10.101.1.6:50010. Marking primary datanode as bad.

Here /hbase/.logs/log_name exists, but the data is missing completely.
It seems this empty file persists after an hbase/hdfs crash.  The only
solution is to perform the above (cp, rm, mv), or simply to delete
those files by hand.  Now, is it possible for the master to do that?
The master should be able to detect invalid files in the .logs/ dir
and get rid of them without operator interaction; is there some sort
of design element that I am simply missing?
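
And for the by-hand deletion, a rough sketch of an operator script
that finds and removes zero-length files under /hbase/.logs.  The ls
field positions are an assumption and may differ between Hadoop
versions; also note that a file with a still-open writer can report
size 0 even though it holds data, so this should only run once the
old writers are confirmed dead:

#!/bin/sh
# Recursively list the WAL files, keep only zero-length plain files
# (skip directories; size is field 5, path is field 8), and delete them.
hadoop fs -lsr /hbase/.logs | \
  awk '$1 !~ /^d/ && $5 == 0 { print $8 }' | \
  while read path; do
    echo "removing empty log file: $path"
    hadoop fs -rm "$path"
  done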

Thanks.

-Jack

Re: log replay failures, how to resolve

Posted by Jack Levin <ma...@gmail.com>.
Sure, but this does not resolve the lease issue.  To reproduce, just restart the namenode, have the hbase hdfs clients fail, then try a cold restart of the cluster.
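
A rough outline of that reproduction sequence, assuming a stock
Hadoop/HBase layout with the usual control scripts under
$HADOOP_HOME/bin and $HBASE_HOME/bin:

# Bounce the namenode while the regionservers still hold open WAL files.
$HADOOP_HOME/bin/hadoop-daemon.sh stop namenode
$HADOOP_HOME/bin/hadoop-daemon.sh start namenode
# The regionservers' DFS clients should now fail their WAL writes; once
# they have gone down, attempt a cold restart of the whole cluster.
$HBASE_HOME/bin/stop-hbase.sh
$HBASE_HOME/bin/start-hbase.sh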

-Jack


On Jan 8, 2011, at 6:50 PM, Todd Lipcon <to...@cloudera.com> wrote:

> Hi Jack,
> 
> Do you have a rack topology script set up for HDFS?
> 
> -Todd
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera

Re: log replay failures, how to resolve

Posted by Todd Lipcon <to...@cloudera.com>.
Hi Jack,

Do you have a rack topology script set up for HDFS?
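
For context, HDFS rack awareness comes from an admin-supplied
executable named in topology.script.file.name
(net.topology.script.file.name on newer Hadoop) in core-site.xml; the
namenode passes it datanode IPs or hostnames and expects one rack path
per argument.  A minimal sketch, with the mapping itself purely an
assumption based on the 10.101.x / 10.103.x addresses in the logs above:

#!/bin/sh
# Print one rack path for each IP/hostname the namenode passes in.
# The case branches below are only an example mapping; replace them
# with the real cabling layout.
for node in "$@"; do
  case "$node" in
    10.101.*) echo /rack-101 ;;
    10.103.*) echo /rack-103 ;;
    *)        echo /default-rack ;;
  esac
done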

-Todd




-- 
Todd Lipcon
Software Engineer, Cloudera