Posted to mapreduce-user@hadoop.apache.org by jason lu <lj...@gmail.com> on 2015/05/29 05:13:54 UTC

ResourceManager hung: org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000

Hi,
    I met the same problem as: http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201303.mbox/%3C482C5F6F-6FEB-4552-99F5-07C8B54ACE20@apache.org%3E

Any idea about it?
It happens almost every 3 or 4 weeks in my cluster (about 150 nodes).
I checked the log: no warnings, no errors, no exceptions, but the ResourceManager hung; it did not crash.

I found this code, but I have no idea why it happens. Why does the event queue keep growing?

Thanks.

   private final class EventProcessor implements Runnable {
      @Override
      public void run() {

        SchedulerEvent event;

        while (!stopped && !Thread.currentThread().isInterrupted()) {
          try {
            event = eventQueue.take();
          } catch (InterruptedException e) {
            LOG.error("Returning, interrupted : " + e);
            return; // TODO: Kill RM.
          }

          try {
            scheduler.handle(event);
          } catch (Throwable t) {
            // An error occurred, but we are shutting down anyway.
            // If it was an InterruptedException, the very act of 
            // shutdown could have caused it and is probably harmless.
            if (stopped) {
              LOG.warn("Exception during shutdown: ", t);
              break;
            }
            LOG.fatal("Error in handling event type " + event.getType()
                + " to the scheduler", t);
            if (shouldExitOnError
                && !ShutdownHookManager.get().isShutdownInProgress()) {
              LOG.info("Exiting, bbye..");
              System.exit(-1);
            }
          }
        }
      }
    }

    @Override
    protected void serviceStop() throws Exception {
      this.stopped = true;
      this.eventProcessor.interrupt();
      try {
        this.eventProcessor.join();
      } catch (InterruptedException e) {
        throw new YarnRuntimeException(e);
      }
      super.serviceStop();
    }

    @Override
    public void handle(SchedulerEvent event) {
      try {
        int qSize = eventQueue.size();
        if (qSize !=0 && qSize %1000 == 0) {
          LOG.info("Size of scheduler event-queue is " + qSize);
        }
        int remCapacity = eventQueue.remainingCapacity();
        if (remCapacity < 1000) {
          LOG.info("Very low remaining capacity on scheduler event queue: "
              + remCapacity);
        }
        this.eventQueue.put(event);
      } catch (InterruptedException e) {
        throw new YarnRuntimeException(e);
      }
    }
  }

logs:

grep 'Size of event-queue' yarn-hadoop-resourcemanager-gdc-hm01-formal.i.nease.net.log
2015-05-29 00:54:46,985 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000
2015-05-29 00:55:28,850 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 2000
2015-05-29 00:56:10,204 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 3000
2015-05-29 00:56:51,995 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 4000
2015-05-29 00:57:33,981 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 5000
2015-05-29 00:58:15,324 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 6000
2015-05-29 00:58:57,111 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 7000
2015-05-29 00:59:38,593 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 8000
2015-05-29 01:00:20,215 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 9000
2015-05-29 01:01:00,559 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 10000
2015-05-29 01:01:39,614 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 11000
2015-05-29 01:02:21,364 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 12000
2015-05-29 01:03:03,233 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 13000
2015-05-29 01:03:44,701 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 14000
2015-05-29 01:04:26,494 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 15000
2015-05-29 01:05:08,180 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 16000
2015-05-29 01:05:50,331 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 17000
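
One rough way to read the log excerpt above is to turn it into a net growth rate. The sketch below is not part of Hadoop: the class and method names (QueueGrowthRate, eventsPerSecond) are made up for illustration, and it assumes every line starts with a "yyyy-MM-dd HH:mm:ss,SSS" timestamp and ends with the queue size, as the lines above do. It only measures net growth (arrivals minus whatever was drained), so the log alone cannot tell whether the consumer was stalled or merely slower than the producers.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

class QueueGrowthRate {
    // Timestamp layout assumed from the log lines quoted in this thread.
    private static final DateTimeFormatter TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // The first 23 characters hold the timestamp, e.g. "2015-05-29 00:54:46,985".
    static LocalDateTime timestampOf(String line) {
        return LocalDateTime.parse(line.trim().substring(0, 23), TS);
    }

    // The last whitespace-separated token is the reported queue size.
    static long sizeOf(String line) {
        String t = line.trim();
        return Long.parseLong(t.substring(t.lastIndexOf(' ') + 1));
    }

    // Net queue growth per second between the first and last line given.
    static double eventsPerSecond(List<String> lines) {
        String first = lines.get(0);
        String last = lines.get(lines.size() - 1);
        double seconds =
            Duration.between(timestampOf(first), timestampOf(last)).toMillis() / 1000.0;
        return (sizeOf(last) - sizeOf(first)) / seconds;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2015-05-29 00:54:46,985 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000",
            "2015-05-29 01:05:50,331 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 17000");
        System.out.printf("net growth: %.1f events/s%n", eventsPerSecond(lines));
    }
}
```

For the first and last lines above, that is 16000 events over about 663 seconds, roughly 24 events per second, and the nearly constant spacing between the intermediate lines suggests the rate was steady over the whole window.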



Re: ResourceManager hung: org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000

Posted by jason lu <lj...@gmail.com>.
I forgot to do that before restarting the process.

> On May 29, 2015, at 11:17, Rohith Sharma <ro...@gmail.com> wrote:
> 
> Hi
> 
> Can you take a thread dump and verify it?
> 
> jstack <pid> > RM.out
> OR
> kill -3 <pid>  (Note: the thread dump will be logged in the .out file)
> 
> Thanks & Regards
> Rohith Sharma K S
> 
>> On May 29, 2015, at 8:43 AM, jason lu <ljhn1829@gmail.com> wrote:
>> 
>> 
>> Hi,
>>     I met the same problem as: http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201303.mbox/%3C482C5F6F-6FEB-4552-99F5-07C8B54ACE20@apache.org%3E
>> 
>> Any idea about it?
>> It happens almost every 3 or 4 weeks in my cluster (about 150 nodes).
>> I checked the log: no warnings, no errors, no exceptions, but the ResourceManager hung; it did not crash.
>> 
>> I found this code, but I have no idea why it happens. Why does the event queue keep growing?
>> 
>> Thanks.
>> 
>>    private final class EventProcessor implements Runnable {
>>       @Override
>>       public void run() {
>> 
>>         SchedulerEvent event;
>> 
>>         while (!stopped && !Thread.currentThread().isInterrupted()) {
>>           try {
>>             event = eventQueue.take();
>>           } catch (InterruptedException e) {
>>             LOG.error("Returning, interrupted : " + e);
>>             return; // TODO: Kill RM.
>>           }
>> 
>>           try {
>>             scheduler.handle(event);
>>           } catch (Throwable t) {
>>             // An error occurred, but we are shutting down anyway.
>>             // If it was an InterruptedException, the very act of 
>>             // shutdown could have caused it and is probably harmless.
>>             if (stopped) {
>>               LOG.warn("Exception during shutdown: ", t);
>>               break;
>>             }
>>             LOG.fatal("Error in handling event type " + event.getType()
>>                 + " to the scheduler", t);
>>             if (shouldExitOnError
>>                 && !ShutdownHookManager.get().isShutdownInProgress()) {
>>               LOG.info("Exiting, bbye..");
>>               System.exit(-1);
>>             }
>>           }
>>         }
>>       }
>>     }
>> 
>>     @Override
>>     protected void serviceStop() throws Exception {
>>       this.stopped = true;
>>       this.eventProcessor.interrupt();
>>       try {
>>         this.eventProcessor.join();
>>       } catch (InterruptedException e) {
>>         throw new YarnRuntimeException(e);
>>       }
>>       super.serviceStop();
>>     }
>> 
>>     @Override
>>     public void handle(SchedulerEvent event) {
>>       try {
>>         int qSize = eventQueue.size();
>>         if (qSize !=0 && qSize %1000 == 0) {
>>           LOG.info("Size of scheduler event-queue is " + qSize);
>>         }
>>         int remCapacity = eventQueue.remainingCapacity();
>>         if (remCapacity < 1000) {
>>           LOG.info("Very low remaining capacity on scheduler event queue: "
>>               + remCapacity);
>>         }
>>         this.eventQueue.put(event);
>>       } catch (InterruptedException e) {
>>         throw new YarnRuntimeException(e);
>>       }
>>     }
>>   }
>> 
>> logs:
>> 
>> grep 'Size of event-queue' yarn-hadoop-resourcemanager-gdc-hm01-formal.i.nease.net.log
>> 2015-05-29 00:54:46,985 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000
>> 2015-05-29 00:55:28,850 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 2000
>> 2015-05-29 00:56:10,204 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 3000
>> 2015-05-29 00:56:51,995 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 4000
>> 2015-05-29 00:57:33,981 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 5000
>> 2015-05-29 00:58:15,324 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 6000
>> 2015-05-29 00:58:57,111 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 7000
>> 2015-05-29 00:59:38,593 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 8000
>> 2015-05-29 01:00:20,215 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 9000
>> 2015-05-29 01:01:00,559 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 10000
>> 2015-05-29 01:01:39,614 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 11000
>> 2015-05-29 01:02:21,364 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 12000
>> 2015-05-29 01:03:03,233 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 13000
>> 2015-05-29 01:03:44,701 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 14000
>> 2015-05-29 01:04:26,494 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 15000
>> 2015-05-29 01:05:08,180 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 16000
>> 2015-05-29 01:05:50,331 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 17000
>> 
>> 
> 


Re: ResourceManager hung: org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000

Posted by Rohith Sharma <ro...@gmail.com>.
Hi

Can you take a thread dump and verify it?

jstack <pid> > RM.out
OR
kill -3 <pid>  (Note: the thread dump will be logged in the .out file)
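
To illustrate what such a thread dump can reveal, here is a minimal, self-contained sketch. It is a simulation, not Hadoop code: StuckDispatcherDemo, handle and stuckInHandler are hypothetical names, and the handler that never returns stands in for whatever might block the real event-processing thread (a lock, slow I/O, and so on). It takes an in-process dump through the standard ThreadMXBean API instead of jstack, and checks whether the consumer thread has a frame inside its handler while the queue keeps filling.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

class StuckDispatcherDemo {
    // Unbounded queue, so producers never block and the backlog can grow freely.
    final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    // Never counted down: simulates a handler that blocks forever.
    private final CountDownLatch neverReleased = new CountDownLatch(1);

    StuckDispatcherDemo(String threadName) {
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    handle(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, threadName);
        consumer.setDaemon(true); // let the JVM exit even though the thread is stuck
        consumer.start();
    }

    // Hypothetical handler: the first event it receives blocks it for good.
    void handle(int event) throws InterruptedException {
        neverReleased.await();
    }

    // Takes an in-process thread dump and reports whether the named thread
    // currently has a stack frame inside handle().
    static boolean stuckInHandler(String threadName) {
        ThreadInfo[] dump = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : dump) {
            if (!info.getThreadName().equals(threadName)) {
                continue;
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                if (frame.getMethodName().equals("handle")
                        && frame.getClassName().equals(StuckDispatcherDemo.class.getName())) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        String name = "demo-event-processor";
        StuckDispatcherDemo demo = new StuckDispatcherDemo(name);
        for (int i = 0; i < 5000; i++) {
            demo.queue.put(i); // producers keep enqueueing regardless of the consumer
        }
        while (!stuckInHandler(name)) {
            Thread.sleep(10); // wait until the consumer has picked up its first event
        }
        // Everything except the one event being handled is still queued.
        System.out.println("consumer stuck in handle(), queue size: " + demo.queue.size());
    }
}
```

In a real jstack output the equivalent check would be to find the dispatcher thread by name and read which method sits at the top of its stack; several dumps taken a few seconds apart showing the same frames would point to a stall.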

Thanks & Regards
Rohith Sharma K S

> On May 29, 2015, at 8:43 AM, jason lu <lj...@gmail.com> wrote:
> 
> 
> Hi,
>     I met the same problem as: http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201303.mbox/%3C482C5F6F-6FEB-4552-99F5-07C8B54ACE20@apache.org%3E
> 
> Any idea about it?
> It happens almost every 3 or 4 weeks in my cluster (about 150 nodes).
> I checked the log: no warnings, no errors, no exceptions, but the ResourceManager hung; it did not crash.
> 
> I found this code, but I have no idea why it happens. Why does the event queue keep growing?
> 
> Thanks.
> 
>    private final class EventProcessor implements Runnable {
>       @Override
>       public void run() {
> 
>         SchedulerEvent event;
> 
>         while (!stopped && !Thread.currentThread().isInterrupted()) {
>           try {
>             event = eventQueue.take();
>           } catch (InterruptedException e) {
>             LOG.error("Returning, interrupted : " + e);
>             return; // TODO: Kill RM.
>           }
> 
>           try {
>             scheduler.handle(event);
>           } catch (Throwable t) {
>             // An error occurred, but we are shutting down anyway.
>             // If it was an InterruptedException, the very act of 
>             // shutdown could have caused it and is probably harmless.
>             if (stopped) {
>               LOG.warn("Exception during shutdown: ", t);
>               break;
>             }
>             LOG.fatal("Error in handling event type " + event.getType()
>                 + " to the scheduler", t);
>             if (shouldExitOnError
>                 && !ShutdownHookManager.get().isShutdownInProgress()) {
>               LOG.info("Exiting, bbye..");
>               System.exit(-1);
>             }
>           }
>         }
>       }
>     }
> 
>     @Override
>     protected void serviceStop() throws Exception {
>       this.stopped = true;
>       this.eventProcessor.interrupt();
>       try {
>         this.eventProcessor.join();
>       } catch (InterruptedException e) {
>         throw new YarnRuntimeException(e);
>       }
>       super.serviceStop();
>     }
> 
>     @Override
>     public void handle(SchedulerEvent event) {
>       try {
>         int qSize = eventQueue.size();
>         if (qSize !=0 && qSize %1000 == 0) {
>           LOG.info("Size of scheduler event-queue is " + qSize);
>         }
>         int remCapacity = eventQueue.remainingCapacity();
>         if (remCapacity < 1000) {
>           LOG.info("Very low remaining capacity on scheduler event queue: "
>               + remCapacity);
>         }
>         this.eventQueue.put(event);
>       } catch (InterruptedException e) {
>         throw new YarnRuntimeException(e);
>       }
>     }
>   }
> 
> logs:
> 
> grep 'Size of event-queue' yarn-hadoop-resourcemanager-gdc-hm01-formal.i.nease.net.log
> 2015-05-29 00:54:46,985 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000
> 2015-05-29 00:55:28,850 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 2000
> 2015-05-29 00:56:10,204 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 3000
> 2015-05-29 00:56:51,995 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 4000
> 2015-05-29 00:57:33,981 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 5000
> 2015-05-29 00:58:15,324 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 6000
> 2015-05-29 00:58:57,111 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 7000
> 2015-05-29 00:59:38,593 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 8000
> 2015-05-29 01:00:20,215 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 9000
> 2015-05-29 01:01:00,559 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 10000
> 2015-05-29 01:01:39,614 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 11000
> 2015-05-29 01:02:21,364 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 12000
> 2015-05-29 01:03:03,233 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 13000
> 2015-05-29 01:03:44,701 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 14000
> 2015-05-29 01:04:26,494 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 15000
> 2015-05-29 01:05:08,180 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 16000
> 2015-05-29 01:05:50,331 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 17000
> 
> 


Re: ResourceManager hung: org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000

Posted by Rohith Sharma <ro...@gmail.com>.
Hi

Can you take a thread dump and verify it?

jstack <pid> > RM.out
OR
kill -3 <pid>  (Note: the thread dump will be logged in the .out file)

Thanks & Regards
Rohith Sharma K S

> On May 29, 2015, at 8:43 AM, jason lu <lj...@gmail.com> wrote:
> 
> 
> Hi,
>     I met the same problem as: http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201303.mbox/%3C482C5F6F-6FEB-4552-99F5-07C8B54ACE20@apache.org%3E
> 
> Any idea about it?
> It happens almost every 3 or 4 weeks in my cluster (about 150 nodes).
> I checked the log: no warnings, no errors, no exceptions, but the ResourceManager hung; it did not crash.
> 
> I found this code, but I have no idea why it happens. Why does the event queue keep growing?
> 
> Thanks.
> 
>    private final class EventProcessor implements Runnable {
>       @Override
>       public void run() {
> 
>         SchedulerEvent event;
> 
>         while (!stopped && !Thread.currentThread().isInterrupted()) {
>           try {
>             event = eventQueue.take();
>           } catch (InterruptedException e) {
>             LOG.error("Returning, interrupted : " + e);
>             return; // TODO: Kill RM.
>           }
> 
>           try {
>             scheduler.handle(event);
>           } catch (Throwable t) {
>             // An error occurred, but we are shutting down anyway.
>             // If it was an InterruptedException, the very act of 
>             // shutdown could have caused it and is probably harmless.
>             if (stopped) {
>               LOG.warn("Exception during shutdown: ", t);
>               break;
>             }
>             LOG.fatal("Error in handling event type " + event.getType()
>                 + " to the scheduler", t);
>             if (shouldExitOnError
>                 && !ShutdownHookManager.get().isShutdownInProgress()) {
>               LOG.info("Exiting, bbye..");
>               System.exit(-1);
>             }
>           }
>         }
>       }
>     }
> 
>     @Override
>     protected void serviceStop() throws Exception {
>       this.stopped = true;
>       this.eventProcessor.interrupt();
>       try {
>         this.eventProcessor.join();
>       } catch (InterruptedException e) {
>         throw new YarnRuntimeException(e);
>       }
>       super.serviceStop();
>     }
> 
>     @Override
>     public void handle(SchedulerEvent event) {
>       try {
>         int qSize = eventQueue.size();
>         if (qSize !=0 && qSize %1000 == 0) {
>           LOG.info("Size of scheduler event-queue is " + qSize);
>         }
>         int remCapacity = eventQueue.remainingCapacity();
>         if (remCapacity < 1000) {
>           LOG.info("Very low remaining capacity on scheduler event queue: "
>               + remCapacity);
>         }
>         this.eventQueue.put(event);
>       } catch (InterruptedException e) {
>         throw new YarnRuntimeException(e);
>       }
>     }
>   }
> 
> logs:
> 
> grep 'Size of event-queue' yarn-hadoop-resourcemanager-gdc-hm01-formal.i.nease.net.log
> 2015-05-29 00:54:46,985 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000
> 2015-05-29 00:55:28,850 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 2000
> 2015-05-29 00:56:10,204 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 3000
> 2015-05-29 00:56:51,995 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 4000
> 2015-05-29 00:57:33,981 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 5000
> 2015-05-29 00:58:15,324 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 6000
> 2015-05-29 00:58:57,111 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 7000
> 2015-05-29 00:59:38,593 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 8000
> 2015-05-29 01:00:20,215 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 9000
> 2015-05-29 01:01:00,559 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 10000
> 2015-05-29 01:01:39,614 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 11000
> 2015-05-29 01:02:21,364 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 12000
> 2015-05-29 01:03:03,233 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 13000
> 2015-05-29 01:03:44,701 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 14000
> 2015-05-29 01:04:26,494 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 15000
> 2015-05-29 01:05:08,180 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 16000
> 2015-05-29 01:05:50,331 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 17000
> 
> 
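
The quoted grep output is enough to estimate how fast the queue is growing. The sketch below is only an illustration (the class and method names are made up, and it assumes the log line layout shown above): it takes the first and last log lines and computes the net growth, which for these timestamps works out to roughly 24 events per second. A steady rate like that would be consistent with a consumer that has stopped draining the queue, but only the thread dump can confirm it.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustration only: helper names are hypothetical, not part of Hadoop.
class QueueGrowthRate {
  // Timestamp layout of the ResourceManager log lines quoted in this thread.
  private static final DateTimeFormatter LOG_TIME =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

  // Drops a leading email quote marker ("> ") if the line still has one.
  static String stripQuote(String line) {
    return line.replaceFirst("^>\\s*", "");
  }

  // The first 23 characters of a log line are the timestamp.
  static LocalDateTime timeOf(String line) {
    return LocalDateTime.parse(stripQuote(line).substring(0, 23), LOG_TIME);
  }

  // The queue size is the last whitespace-separated token on the line.
  static long sizeOf(String line) {
    String trimmed = line.trim();
    return Long.parseLong(trimmed.substring(trimmed.lastIndexOf(' ') + 1));
  }

  // Net queue growth in events per second between two log lines.
  static double eventsPerSecond(String firstLine, String lastLine) {
    long grown = sizeOf(lastLine) - sizeOf(firstLine);
    double seconds =
        Duration.between(timeOf(firstLine), timeOf(lastLine)).toMillis() / 1000.0;
    return grown / seconds;
  }

  public static void main(String[] args) {
    String first = "2015-05-29 00:54:46,985 INFO "
        + "org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 1000";
    String last = "2015-05-29 01:05:50,331 INFO "
        + "org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 17000";
    System.out.printf("net growth: %.1f events/s%n", eventsPerSecond(first, last));
  }
}
```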

