Posted to dev@hbase.apache.org by la...@apache.org on 2015/12/05 08:14:58 UTC

master unhealthy issue in JitterScheduledThreadPoolExecutorImpl, or is it just me?

I see that locally all tests that start a mini cluster fail.
In the log I see 1000's of messages like these:
2015-12-04 22:55:48,215 ERROR [newbunny,41236,1449298547569_ChoreService_107] server.NIOServerCnxnFactory$1(44): Thread Thread[newbunny,41236,1449298547569_ChoreService_107,5,main] died
java.lang.IllegalArgumentException: bound must be greater than origin
        at java.util.concurrent.ThreadLocalRandom.nextLong(ThreadLocalRandom.java:430)
        at org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.getDelay(JitterScheduledThreadPoolExecutorImpl.java:84)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1083)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

In JitteredRunnableScheduledFuture.getDelay I see this.
      long baseDelay = wrapped.getDelay(unit);
      long spreadTime = (long) (baseDelay * spread);
      long delay = baseDelay + ThreadLocalRandom.current().nextLong(-spreadTime, spreadTime);

So this can fail when spreadTime is 0 (or negative), since nextLong(origin, bound) requires origin < bound. I suppose the fix is simply to not add the spread if spreadTime is <= 0. And indeed this fixes the problem for me.
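
For illustration, here is a minimal sketch of that guard. It is not the actual patch: the class JitterSketch and the static helper jitteredDelay are hypothetical names, and baseDelay and spread are passed in as plain parameters instead of coming from the wrapped future.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-alone sketch; not the HBase class itself.
final class JitterSketch {

    // Returns baseDelay with random jitter of up to +/- (baseDelay * spread).
    // The jitter is skipped when the computed spread is zero or negative,
    // because nextLong(origin, bound) throws IllegalArgumentException
    // unless origin < bound.
    static long jitteredDelay(long baseDelay, double spread) {
        long spreadTime = (long) (baseDelay * spread);
        if (spreadTime <= 0) {
            return baseDelay;
        }
        return baseDelay + ThreadLocalRandom.current().nextLong(-spreadTime, spreadTime);
    }

    public static void main(String[] args) {
        // A delay of zero (or an already expired, negative delay) no longer throws.
        System.out.println(jitteredDelay(0, 0.25));
        System.out.println(jitteredDelay(-100, 0.25));
        // A positive delay still gets jitter within [750, 1250).
        System.out.println(jitteredDelay(1000, 0.25));
    }
}
```

With this guard a task whose remaining delay is 0 or already negative just reports its base delay, which seems to be the case the worker threads hit in the trace above.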

Elliott, you just added that class, mind having a look? Or I'll just file a jira.

Thanks.
-- Lars

Re: master unhealthy issue in JitterScheduledThreadPoolExecutorImpl, or is it just me?

Posted by la...@apache.org.
Commented on HBASE-14922, which introduces this class, along with a proposed fix.
Thanks.
-- Lars
From: "larsh@apache.org" <la...@apache.org>
To: HBase Dev List <de...@hbase.apache.org>; Elliott Clark <ec...@apache.org>
Sent: Friday, December 4, 2015 11:14 PM
Subject: master unhealthy issue in JitterScheduledThreadPoolExecutorImpl, or is it just me?
   