Posted to dev@druid.apache.org by GitBox <gi...@apache.org> on 2018/07/09 16:25:27 UTC

[GitHub] gianm opened a new issue #5981: Coordinators spinning in balancer

URL: https://github.com/apache/incubator-druid/issues/5981
 
 
   When rolling out the 0.12.2 branch to our test clusters, we noticed symptoms suggesting that #5927 can cause coordinators to spin in the balancer. Thread dumps show them spending a lot of time in the stack trace below, and one coordinator has now gone hours without finishing a run:
   
   ```
   "Coordinator-Exec--0" #120 daemon prio=5 os_prio=0 tid=0x00007f06802d4000 nid=0x20f7 runnable [0x00007f066a7d0000]
      java.lang.Thread.State: RUNNABLE
   	at io.druid.server.coordinator.ReservoirSegmentSampler.getRandomBalancerSegmentHolder(ReservoirSegmentSampler.java:46)
   	at io.druid.server.coordinator.CostBalancerStrategy.pickSegmentToMove(CostBalancerStrategy.java:224)
   	at io.druid.server.coordinator.helper.DruidCoordinatorBalancer.balanceTier(DruidCoordinatorBalancer.java:128)
   	at io.druid.server.coordinator.helper.DruidCoordinatorBalancer.lambda$run$0(DruidCoordinatorBalancer.java:84)
   	at io.druid.server.coordinator.helper.DruidCoordinatorBalancer$$Lambda$52/955068914.accept(Unknown Source)
   	at java.util.HashMap.forEach(HashMap.java:1289)
   	at io.druid.server.coordinator.helper.DruidCoordinatorBalancer.run(DruidCoordinatorBalancer.java:83)
   	at io.druid.server.coordinator.DruidCoordinator$CoordinatorRunnable.run(DruidCoordinator.java:677)
   	at io.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:571)
   	at io.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:564)
   	at io.druid.java.util.common.concurrent.ScheduledExecutors$2.run(ScheduledExecutors.java:102)
   	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
   	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
   	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   	at java.lang.Thread.run(Thread.java:748)
   ```
   
   So for 0.12.2 we should either revert this patch, or try to achieve the same thing in some other way.
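   
   For context on why time would pile up in that top frame: the class name suggests reservoir sampling, where a single pick has to walk every candidate segment. The sketch below is a hypothetical illustration of that technique only; `ReservoirPickSketch`, `pickOne`, and the `visited` counter are made-up names, not Druid code, and it makes no claim about what #5927 actually changed. It only shows that if picks are requested repeatedly, total work grows as picks times segments.
   
   ```java
   import java.util.Arrays;
   import java.util.List;
   import java.util.Random;
   
   // Hypothetical illustration, not the Druid implementation.
   public class ReservoirPickSketch {
       // Counts segments visited across all picks (for illustration only).
       static long visited = 0;
   
       // Size-1 reservoir sampling: returns one segment chosen uniformly at
       // random across all servers, or null if there are no segments.
       // Every call walks every segment once.
       static String pickOne(List<List<String>> servers, Random random) {
           String chosen = null;
           int seen = 0;
           for (List<String> segments : servers) {
               for (String segment : segments) {
                   seen++;
                   visited++;
                   // Replace the current choice with probability 1/seen.
                   if (random.nextInt(seen) == 0) {
                       chosen = segment;
                   }
               }
           }
           return chosen;
       }
   
       public static void main(String[] args) {
           List<List<String>> servers = Arrays.asList(
               Arrays.asList("seg-a", "seg-b", "seg-c"),
               Arrays.asList("seg-d", "seg-e"));
           Random random = new Random(0);
           for (int i = 0; i < 4; i++) {
               System.out.println(pickOne(servers, random));
           }
           // 4 picks over 5 segments: 20 visits in total.
           System.out.println("visited = " + visited);
       }
   }
   ```
   
   With a handful of segments this is negligible, but the same picks-times-segments product on a cluster with a very large segment count could plausibly keep a coordinator thread runnable in this frame for a long time.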
   
   /cc @clintropolis
