Posted to dev@hbase.apache.org by Bob Schulze <bs...@gmx.de> on 2010/04/09 09:52:55 UTC

HBase stuck after some hours

I repeatedly hit the following problem with 0.20.3 and
dfs.datanode.socket.write.timeout=0: a region server (RS) is asked for
some data, the DFS cannot find it, and the client hangs until it times out.

Grepping the cluster logs, I can see this:

1. At some point the DFS is asked to delete a block, and the block is
deleted from the datanodes.

2. Some minutes later, an RS seems to ask for exactly this block. The DFS
says "Block blk_.. is not valid." and then "No live nodes contain
current block".
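
For anyone who wants to repeat the log check, here is a rough sketch of how the two steps above could be correlated automatically. It collects block IDs from lines that look like deletions and from lines that report a block as missing, then prints the blocks that show up in both groups. The "Deleting block" marker is my assumption about the datanode log wording (adjust it to whatever your logs actually say); the other two markers are the messages quoted above. The function name and the script layout are made up for illustration, and the script does not check that the deletion came first in time.

```python
import re
import sys
from collections import defaultdict

# Block IDs look like blk_<number>; a trailing generation stamp is ignored
# so that the same block matches across logs that print it differently.
BLOCK_ID = re.compile(r"blk_-?\d+")

# ASSUMPTION: wording of the datanode deletion message; change as needed.
DELETE_MARKER = "Deleting block"
# These two substrings are taken from the messages quoted in the post.
MISSING_MARKERS = ("is not valid", "No live nodes contain")


def correlate(lines, delete_marker=DELETE_MARKER, missing_markers=MISSING_MARKERS):
    """Return blocks that appear in both a deletion line and a 'missing' line.

    The result maps block ID -> {"deleted": [lines], "missing": [lines]}.
    Lines without a block ID are skipped; ordering in time is not checked.
    """
    events = defaultdict(lambda: {"deleted": [], "missing": []})
    for line in lines:
        ids = set(BLOCK_ID.findall(line))
        if not ids:
            continue
        if delete_marker in line:
            kind = "deleted"
        elif any(marker in line for marker in missing_markers):
            kind = "missing"
        else:
            continue
        for block_id in ids:
            events[block_id][kind].append(line.rstrip("\n"))
    return {b: e for b, e in events.items() if e["deleted"] and e["missing"]}


def read_lines(paths):
    """Yield lines from each log file, tolerating odd bytes in the logs."""
    for path in paths:
        with open(path, errors="replace") as handle:
            yield from handle


if __name__ == "__main__" and len(sys.argv) > 1:
    for block_id, found in sorted(correlate(read_lines(sys.argv[1:])).items()):
        print(block_id)
        for line in found["deleted"]:
            print("  deleted:", line)
        for line in found["missing"]:
            print("  missing:", line)
```

Run it with the datanode, namenode and region server log files as arguments. Any block it prints was both deleted and later reported missing, which is the pattern described above.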

(I have the xceiver and file descriptor limits set high, and
dfs.datanode.handler.count=10.)

More log here: http://pastebin.com/cdqsy8Ae

Any ideas?

Thx, Al