Posted to user@hbase.apache.org by 吴限 <in...@gmail.com> on 2011/07/27 17:37:55 UTC

data loss due to a region server going down

Hi everyone. I'd like to run the following data loss scenario by you to
see if we are doing something obviously wrong with our setup here.

Setup:
   - CDH3u0
   - Hadoop 0.20.2
   - HBase 0.90.1
   - 1 Master Node running as NameNode & JobTracker
   - ZooKeeper quorum
   - 2 child nodes, each running as DataNode, TaskTracker and RegionServer
   - dfs.replication is set to 1

First, I inserted some data into HBase a few hours ago.
Then, after a while, I rebooted one of the region servers and waited until
the master responded to that. However, when I checked the table using the
hbase shell (I used the "count" command), I noticed that a huge amount of
data was missing.
After I restarted the regionserver that I had rebooted and checked again,
I found that some of the missing data had come back, but some of it was
still missing.
Finally, after I disabled the table and then enabled it again, I found that
all of the data was stored in the cluster and none of it had been lost.

This is problematic since we are supposed to
replicate at x1, so at least one other node should theoretically be able to
serve the data that the downed regionserver can't.

Questions:

   - How can you guys explain this weird situation?
   - Is there a way to recover such lost data?

Any tips here are definitely appreciated. I'll be happy to provide more
information as well.