Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2013/08/20 00:33:49 UTC

[jira] [Commented] (HBASE-9267) StochasticLoadBalancer goes over its processing time limit

    [ https://issues.apache.org/jira/browse/HBASE-9267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744378#comment-13744378 ] 

Jean-Daniel Cryans commented on HBASE-9267:
-------------------------------------------

Basically if I do this in the shell on a freshly started master:

bq. (0..100).each do balancer end

Each call takes longer than the previous one most of the time:

{quote}
hbase(main):002:0> (0..100).each do balancer end
true                                                                                                                                                                                                           
0 row(s) in 9.0110 seconds
true                                                                                                                                                                                                           
0 row(s) in 10.1910 seconds
true                                                                                                                                                                                                           
0 row(s) in 11.2180 seconds
true                                                                                                                                                                                                           
0 row(s) in 12.5670 seconds
true                                                                                                                                                                                                           
0 row(s) in 13.8070 seconds
true                                                                                                                                                                                                           
0 row(s) in 17.5170 seconds
true                                                                                                                                                                                                           
0 row(s) in 18.8010 seconds
true                                                                                                                                                                                                           
0 row(s) in 20.2750 seconds
true                                                                                                                                                                                                           
0 row(s) in 22.9260 seconds
true                                                                                                                                                                                                           
0 row(s) in 18.4760 seconds
true                                                                                                                                                                                                           
0 row(s) in 32.1890 seconds
ERROR: com.google.protobuf.ServiceException: java.net.SocketTimeoutException: Call to jdec2hbase0403-1.vpc.cloudera.com/172.25.3.222:60000 failed because java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.25.3.222:42034 remote=jdec2hbase0403-1.vpc.cloudera.com/172.25.3.222:60000]
{quote}

The last one took 59 seconds according to the master log.
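
To put a number on the trend instead of eyeballing the shell output, here is a minimal Ruby sketch. The helper names `time_calls` and `slower_than_previous` are made up for illustration and are not part of the HBase shell. In the shell the block passed to `time_calls` would be `{ balancer }`; the timings below are copied from the output above.

```ruby
# Hypothetical helper: runs the given block `count` times and returns the
# elapsed wall-clock seconds of each call. In the HBase shell the block
# would be `{ balancer }`; `balancer` is not defined in plain Ruby.
def time_calls(count)
  (1..count).map do
    started = Time.now
    yield
    Time.now - started
  end
end

# Hypothetical helper: counts how many calls were slower than the call
# immediately before them.
def slower_than_previous(durations)
  durations.each_cons(2).count { |previous, current| current > previous }
end

# Timings (in seconds) copied from the shell output in this comment.
observed = [9.011, 10.191, 11.218, 12.567, 13.807, 17.517,
            18.801, 20.275, 22.926, 18.476, 32.189]

slower = slower_than_previous(observed)
puts "#{slower} of #{observed.size - 1} runs were slower than the previous one"
```

On the timings above this reports 9 of 10, which matches "most of the time": only the 18.4760-second run was faster than its predecessor.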
                
> StochasticLoadBalancer goes over its processing time limit
> ----------------------------------------------------------
>
>                 Key: HBASE-9267
>                 URL: https://issues.apache.org/jira/browse/HBASE-9267
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.95.2
>            Reporter: Jean-Daniel Cryans
>            Assignee: Elliott Clark
>             Fix For: 0.98.0, 0.95.3
>
>
> I'm trying out 0.95.2. I left it running over the weekend (8 RS, average load between 12 and 3 regions) and right now the balancer runs for 12 mins:
> bq. 2013-08-19 21:54:45,534 DEBUG [jdec2hbase0403-1.vpc.cloudera.com,60000,1376689696384-BalancerChore] org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Could not find a better load balance plan.  Tried 0 different configurations in 777309ms, and did not find anything with a computed cost less than 36.32576937689094
> It seems to have crept up there slowly; yesterday it was doing:
> bq. 2013-08-18 20:53:17,232 DEBUG [jdec2hbase0403-1.vpc.cloudera.com,60000,1376689696384-BalancerChore] org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Could not find a better load balance plan.  Tried 0 different configurations in 257374ms, and did not find anything with a computed cost less than 36.3251082542424
> And originally it was doing 1 minute.
> In the jstack output I see 1000 of these, and jstack doesn't want to show me the whole thing:
> bq.  at java.util.SubList$1.nextIndex(AbstractList.java:713)
