Posted to dev@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2010/12/30 20:24:13 UTC

reassignment of .META. region

In 0.20.6, we observed the following when RS hosting .META. was down:

2010-12-30 09:44:32,441 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
10.202.50.107:15007 remote=
us01-ciqps1-name01.carrieriq.com/10.202.50.100:2181]
2010-12-30 09:44:32,446 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2010-12-30 09:44:32,606 INFO
org.apache.hadoop.hbase.client.HConnectionManager$TableServers:
locateRegionInMeta attempt 0 of 10 failed; retrying after sleep of 8000
because: Call to us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 failed
on local exception: java.io.IOException: Connection reset by peer
2010-12-30 09:44:40,616 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 could not be reached
after 1 tries, giving up.
...
2010-12-30 09:47:36,676 INFO
org.apache.hadoop.hbase.client.HConnectionManager$TableServers:
locateRegionInMeta attempt 8 of 10 failed; retrying after sleep of 128000
because: Failed setting up proxy to
us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 after attempts=1
2010-12-30 09:49:44,712 INFO org.apache.hadoop.ipc.HbaseRPC: Server at
us01-ciqps1-grid01.carrieriq.com/10.202.50.101:60020 could not be reached
after 1 tries, giving up.
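
The two sleep values in the log (8000 ms at attempt 0, 128000 ms at attempt 8) differ by a factor of 16. The sketch below reproduces them under an assumption that is not stated in this thread: a fixed multiplier table applied to a base pause of 8000 ms. The table values and the names RETRY_BACKOFF, BASE_PAUSE_MS and sleep_ms are illustrative, chosen only because they match the log timestamps.

```python
# Assumed multiplier table (not taken from the thread); it reproduces the
# sleeps shown in the log when the base pause is 8000 ms.
RETRY_BACKOFF = [1, 1, 1, 2, 2, 4, 4, 8, 16, 32]
BASE_PAUSE_MS = 8000  # inferred from "attempt 0 ... retrying after sleep of 8000"


def sleep_ms(attempt):
    """Sleep before the next retry, for a zero-based attempt number."""
    return BASE_PAUSE_MS * RETRY_BACKOFF[attempt]


sleeps = [sleep_ms(a) for a in range(len(RETRY_BACKOFF))]

# Total time slept between the failure of attempt 0 and the failure of attempt 8.
waited_before_attempt_8 = sum(sleeps[:8])

print(sleeps[0], sleeps[8], waited_before_attempt_8)
```

Under these assumptions the client sleeps 184000 ms between attempt 0 and attempt 8, which lines up with the gap between the 09:44:32 and 09:47:36 log entries.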

Can someone enlighten me on how 0.90 improves reassignment of the .META.
region when the RS hosting .META. goes down?

I am expecting a significant speedup compared to 0.20.6.

Thanks

Re: reassignment of .META. region

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Two things influence how long it takes to recover from a full RS failure:

 - ZK session timeout
 - amount of data to split

Both are configurable, but with tradeoffs. A smaller timeout means
you're more prone to RS suicides because of lengthy GC pauses, and
smaller HLogs mean you have to force-flush memstores more often,
generating more IO load. The rest of the processing takes a few
milliseconds.
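
As a rough illustration of that tradeoff, recovery time can be modeled as the detection delay (at worst the full ZK session timeout) plus the time needed to split the dead server's logs. This is only a back-of-the-envelope sketch: the function name, the 32 MB/s split throughput and the log sizes are hypothetical placeholders, not measurements from any cluster.

```python
# Back-of-the-envelope model of RS failure recovery time.
# Every number below is a hypothetical placeholder, not a measurement.

MB = 1024 * 1024


def estimate_recovery_seconds(zk_session_timeout_s, hlog_bytes_to_split,
                              split_throughput_bytes_per_s):
    """Worst-case detection delay plus time to split the dead server's HLogs."""
    detection = zk_session_timeout_s  # failure is noticed once the session expires
    splitting = hlog_bytes_to_split / split_throughput_bytes_per_s
    return detection + splitting


# Baseline: 60 s session timeout, 2 GB of HLogs, 32 MB/s split throughput.
baseline = estimate_recovery_seconds(60, 2048 * MB, 32 * MB)

# Halving the timeout shortens detection, at the cost of more GC-induced expirations.
shorter_timeout = estimate_recovery_seconds(30, 2048 * MB, 32 * MB)

# Keeping fewer HLogs shortens the split, at the cost of more forced flushes.
fewer_logs = estimate_recovery_seconds(60, 512 * MB, 32 * MB)

print(baseline, shorter_timeout, fewer_logs)
```

The model ignores everything else in the recovery path, which matches the point that the remaining processing is negligible next to these two terms.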

So don't expect it to be any faster in 0.90. The next big improvement
will be distributed log splitting, but it's not there yet.

J-D
