Posted to dev@hbase.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2011/01/04 00:40:56 UTC

Re: reassignment of .META. region

There are two things that influence how long it takes to recover from a full RS failure:

 - ZK session timeout
 - amount of data to split

Both are configurable, but with tradeoffs. A smaller timeout makes
you more prone to RS suicides caused by lengthy GC pauses, and
smaller HLogs mean you have to force-flush memstores more often,
generating more IO load. The rest of the processing takes a few
milliseconds.
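
As a rough illustration of how those two factors add up, here is a
back-of-the-envelope sketch. The function name, its parameters and the
numbers in the usage note are hypothetical and not taken from HBase; it
assumes the worst case where the failure is only detected once the full
ZK session timeout has elapsed, and that the logs are split
sequentially at a constant throughput.

```python
def estimate_recovery_seconds(zk_session_timeout_s, num_hlogs,
                              hlog_size_mb, split_throughput_mb_s):
    """Rough worst-case estimate of RS failure recovery time.

    All inputs are illustrative assumptions, not HBase defaults.
    """
    if split_throughput_mb_s <= 0:
        raise ValueError("split throughput must be positive")
    # Time before the master notices the dead RS (session expiry).
    detection_s = zk_session_timeout_s
    # Time to split the dead server's HLogs, one after the other.
    split_s = (num_hlogs * hlog_size_mb) / split_throughput_mb_s
    return detection_s + split_s
```

For example, estimate_recovery_seconds(60, 32, 64, 50) models a 60 s
timeout plus 2048 MB of logs split at 50 MB/s. Shrinking either input
lowers the estimate, at the cost of the tradeoffs described above.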

So don't expect it to be any faster in 0.90... the next big thing will
be distributed log splitting but it's not there yet.
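
On the client side, the sleeps in the quoted log (8000 ms on attempt 0,
128000 ms on attempt 8) look like a base pause multiplied by a backoff
table. The sketch below assumes a base pause of 8000 ms and a
multiplier table of 1, 1, 1, 2, 2, 4, 4, 8, 16, 32. Both are
assumptions that happen to match the two logged values; neither is
confirmed in this thread, and the function names are made up for
illustration.

```python
# Assumed multiplier table; consistent with the two sleeps in the log,
# but not confirmed by this thread.
ASSUMED_BACKOFF = [1, 1, 1, 2, 2, 4, 4, 8, 16, 32]


def retry_sleep_ms(pause_ms, attempt):
    """Sleep before the next retry, for a 0-based attempt number."""
    # Attempts past the end of the table reuse the last multiplier.
    idx = min(attempt, len(ASSUMED_BACKOFF) - 1)
    return pause_ms * ASSUMED_BACKOFF[idx]


def total_sleep_ms(pause_ms, attempts):
    """Sum of the sleeps following each of the first `attempts` tries."""
    return sum(retry_sleep_ms(pause_ms, a) for a in range(attempts))
```

Under these assumptions a client can spend several minutes sleeping
between retries, so what the client observes can lag well behind the
actual reassignment of .META.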

J-D

On Thu, Dec 30, 2010 at 7:24 PM, Ted Yu <yu...@gmail.com> wrote:
> In 0.20.6, we observed the following when RS hosting .META. was down:
>
> 10-12-30 09:44:32,441 INFO org.apache.zookeeper.ClientCnxn: Priming
> connection to java.nio.channels.SocketChannel[connected local=/
> 10.202.50.107:15007 remote=
> us01-ciqps1-name01.carrieriq.com/10.202.50.100:2181]
> 2010-12-30 09:44:32,446 INFO org.apache.zookeeper.ClientCnxn: Server
> connection successful
> 2010-12-30 09:44:32,606 INFO
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers:
> locateRegionInMeta attempt 0 of 10 failed; retrying after sleep of 8000
> because: Call to us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 failed
> on local exception: java.io.IOException: Connection reset by peer
> 2010-12-30 09:44:40,616 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
> us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 could not be reached
> after 1 tries, giving up.
> ...
> 2010-12-30 09:47:36,676 INFO
> org.apache.hadoop.hbase.client.HConnectionManager$TableServers:
> locateRegionInMeta attempt 8 of 10 failed; retrying after sleep of 128000
> because: Failed setting up proxy to
> us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 after attempts=1
> 2010-12-30 09:49:44,712 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
> us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 could not be reached
> after 1 tries, giving up.
>
> Can someone enlighten me how 0.90 improves upon reassignment of .META.
> region in case RS hosting .META. is down ?
>
> I am expecting significant speedup compared to that from 0.20.6
>
> Thanks
>