Posted to user@hadoop.apache.org by Krishna Kishore Bonagiri <wr...@gmail.com> on 2014/08/10 17:29:16 UTC

100% CPU consumption by Resource Manager process

Hi,
  My YARN ResourceManager is consuming 100% CPU while running an
application that runs for about 10 hours and requests as many as 27000
containers. CPU consumption was very low at the start of the application
and gradually climbed to over 100%. Is this a known issue, or are we doing
something wrong?

Every thread dump shows the Event Processor thread running
LeafQueue::assignContainers(), specifically the for loop below from
LeafQueue.java; it seems to be looping through some priority list.

    // Try to assign containers to applications in order
    for (FiCaSchedulerApp application : activeApplications) {
...
        // Schedule in priority order
        for (Priority priority : application.getPriorities()) {
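
[Editorial aside, not part of the original message: a minimal, hypothetical
sketch of why this loop can come to dominate CPU. If each round of requests
uses a fresh, incremented priority (as described later in the thread), the
collection behind application.getPriorities() keeps growing, so each
scheduling event scans a longer list and the cumulative work grows
quadratically. The class and method names below are invented for
illustration; the TreeSet merely stands in for the scheduler's internal
TreeMap.]

```java
import java.util.TreeSet;

// Sketch: count how many inner-loop iterations the scheduler would do if
// an app adds one new priority per round and the scheduler walks the full
// priority set on every round.
public class PriorityScanCost {
    static long totalScans(int rounds) {
        TreeSet<Integer> priorities = new TreeSet<>(); // stand-in for the TreeMap key set
        long scanned = 0;
        for (int round = 0; round < rounds; round++) {
            priorities.add(round);         // a new, incremented priority each round
            for (int p : priorities) {     // what the inner for-loop walks
                scanned++;
            }
        }
        return scanned;                    // rounds * (rounds + 1) / 2
    }

    public static void main(String[] args) {
        // 27000 rounds -> roughly 27000^2 / 2 inner iterations
        System.out.println(totalScans(27000)); // prints 364513500
    }
}
```

Under this (assumed) request pattern, the scan is cheap early on and very
expensive after thousands of rounds, which would match the gradual climb to
100% CPU described above.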

3XMTHREADINFO      "ResourceManager Event Processor"
J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
native policy:UNKNOWN)
3XMTHREADINFO2            (native stack address range
from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
(0x4FE8)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
entry count: 1)
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
entry count: 2)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)

3XMTHREADINFO      "ResourceManager Event Processor"
J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
native policy:UNKNOWN)
3XMTHREADINFO2            (native stack address range
from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
3XMCPUTIME               CPU usage total: 42379.604203548 secs
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
(0xDFC0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
entry count: 1)
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
entry count: 2)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)

3XMTHREADINFO      "ResourceManager Event Processor"
J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
native policy:UNKNOWN)
3XMTHREADINFO2            (native stack address range
from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
3XMCPUTIME               CPU usage total: 42996.394528764 secs
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
(0x741B8)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at
java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
4XESTACKTRACE                at
java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
Code))
4XESTACKTRACE                at
java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
entry count: 1)
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
entry count: 2)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
Code))
5XESTACKTRACE                   (entered lock:
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
entry count: 1)
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
Code))
4XESTACKTRACE                at
org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)

Thanks,
Kishore

Re: 100% CPU consumption by Resource Manager process

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Thanks Wangda, I think I reduced this value when I was trying to reduce the
container allocation time.

-Kishore
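
[Editorial aside, not part of the original exchange: the property Wangda
mentions below lives in yarn-site.xml. A hypothetical snippet applying his
suggested value of 1000 ms might look like this.]

```xml
<!-- yarn-site.xml: raise the NodeManager heartbeat interval
     from 50 ms to the suggested 1000 ms -->
<property>
  <name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
  <value>1000</value>
</property>
```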


On Tue, Aug 19, 2014 at 7:39 AM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Krishna,
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
> I think this config is problematic: too small a heartbeat interval will
> cause the NMs to contact the RM too often. I would suggest setting this
> value larger, e.g. 1000.
>
> Thanks,
> Wangda
>
>
>
> On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Wangda,
>>   Thanks for the reply, here are the details, please see if you could
>> suggest anything.
>>
>> 1) Number of nodes and running app in the cluster
>> 2 nodes, and I am running my own application that keeps asking for
>> containers,
>> a) running something on the containers,
>> b) releasing the containers,
>> c) asking for more containers with an incremented priority value, and
>> repeating the same process
>>
>> 2) What's the version of your Hadoop?
>> apache hadoop-2.4.0
>>
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> No
>>
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>> 50
>>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wh...@gmail.com> wrote:
>>
>>> Hi Krishna,
>>> To get a better understanding of the problem, could you please share
>>> the following information:
>>> 1) Number of nodes and running app in the cluster
>>> 2) What's the version of your Hadoop?
>>> 3) Have you set
>>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>>> in your configuration?
>>>
>>> Thanks,
>>> Wangda Tan
>>>
>>>
>>>
>>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>>
>>>> Hi,
>>>> [original message quoted in full, including thread dumps; snipped.
>>>> See the first message in this thread.]
>>>> Thanks,
>>>> Kishore
>>>>
>>>
>>>
>>
>

Re: 100% CPU consumption by Resource Manager process

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Thanks Wangda, I think I have reduced this when I was trying to reduce the
container allocation time.

-Kishore


On Tue, Aug 19, 2014 at 7:39 AM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Krishna,
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
> I think this config is problematic, too small heartbeat-interval will
> cause NM contact RM too often. I would suggest you can set this value
> larger like 1000.
>
> Thanks,
> Wangda
>
>
>
> On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Wangda,
>>   Thanks for the reply, here are the details, please see if you could
>> suggest anything.
>>
>> 1) Number of nodes and running app in the cluster
>> 2 nodes, and I am running my own application that keeps asking for
>> containers,
>> a) running something on the containers,
>> b) releasing the containers,
>> c) ask for more containers with incremented priority value, and repeat
>> the same process
>>
>> 2) What's the version of your Hadoop?
>> apache hadoop-2.4.0
>>
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> No
>>
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>> 50
>>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wh...@gmail.com> wrote:
>>
>>> Hi Krishna,
>>> To get more understanding about the problem, could you please share
>>> following information:
>>> 1) Number of nodes and running app in the cluster
>>> 2) What's the version of your Hadoop?
>>> 3) Have you set
>>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>>> in your configuration?
>>>
>>> Thanks,
>>> Wangda Tan
>>>
>>>
>>>
>>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>>
>>>> Hi,
>>>>   My YARN resource manager is consuming 100% CPU when I am running an
>>>> application that is running for about 10 hours, requesting as many as 27000
>>>> containers. The CPU consumption was very low at the starting of my
>>>> application, and it gradually went high to over 100%. Is this a known issue
>>>> or are we doing something wrong?
>>>>
>>>> Every dump of the EVent Processor thread is running
>>>> LeafQueue::assignContainers() specifically the for loop below from
>>>> LeafQueue.java and seems to be looping through some priority list.
>>>>
>>>>     // Try to assign containers to applications in order
>>>>     for (FiCaSchedulerApp application : activeApplications) {
>>>> ...
>>>>         // Schedule in priority order
>>>>         for (Priority priority : application.getPriorities()) {
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
>>>> (0x4FE8)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>>>> (0xDFC0)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC
>>>> cycle=475576 (0x741B8)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> Thanks,
>>>> Kishore
>>>>
>>>
>>>
>>
>

Re: 100% CPU consumption by Resource Manager process

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Thanks Wangda, I think I reduced this value while trying to reduce the
container allocation time.

-Kishore
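For anyone hitting the same symptom: the pattern described in the original
post (each allocate/release cycle using a new, incremented priority) means
application.getPriorities() accumulates one entry per cycle, so the
scheduler's inner loop grows over time. A minimal, hypothetical Java sketch
of the effect (PriorityGrowth and computeIterations are illustration names,
not Hadoop code):

```java
import java.util.TreeSet;

// Hypothetical illustration, not ResourceManager code: models how the
// per-application priority set grows when every request cycle uses a
// fresh priority, making the scheduler's
// "for (Priority priority : application.getPriorities())" loop do
// O(cycles) work on every node heartbeat.
public class PriorityGrowth {

    // Total inner-loop iterations accumulated across all cycles.
    static long computeIterations(int cycles) {
        TreeSet<Integer> priorities = new TreeSet<>(); // stands in for getPriorities()
        long iterations = 0;
        for (int cycle = 1; cycle <= cycles; cycle++) {
            priorities.add(cycle);            // new priority each cycle
            for (int ignored : priorities) {  // scheduler walks all priorities
                iterations++;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        // 100 cycles cost 5050 iterations instead of 100: quadratic growth.
        System.out.println(computeIterations(100));
    }
}
```

Reusing a single priority (or a small fixed set) across cycles would keep
that loop constant-size instead of growing with the number of past requests.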


On Tue, Aug 19, 2014 at 7:39 AM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Krishna,
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
> I think this config is problematic; too small a heartbeat interval will
> cause the NM to contact the RM too often. I would suggest setting this
> value to something larger, like 1000.
>
> Thanks,
> Wangda
>
>
>
> On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Wangda,
>>   Thanks for the reply, here are the details, please see if you could
>> suggest anything.
>>
>> 1) Number of nodes and running app in the cluster
>> 2 nodes, and I am running my own application that keeps asking for
>> containers,
>> a) running something on the containers,
>> b) releasing the containers,
>> c) asking for more containers with an incremented priority value, and
>> repeating the same process
>>
>> 2) What's the version of your Hadoop?
>> apache hadoop-2.4.0
>>
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> No
>>
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>> 50
>>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wh...@gmail.com> wrote:
>>
>>> Hi Krishna,
>>> To get more understanding about the problem, could you please share the
>>> following information:
>>> 1) Number of nodes and running app in the cluster
>>> 2) What's the version of your Hadoop?
>>> 3) Have you set
>>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>>> in your configuration?
>>>
>>> Thanks,
>>> Wangda Tan
>>>
>>>
>>>
>>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>>
>>>> Hi,
>>>>   My YARN resource manager is consuming 100% CPU while running an
>>>> application that runs for about 10 hours and requests as many as 27000
>>>> containers. The CPU consumption was very low at the start of my
>>>> application, and it gradually climbed to over 100%. Is this a known
>>>> issue, or are we doing something wrong?
>>>>
>>>> Every dump of the Event Processor thread shows it running
>>>> LeafQueue::assignContainers(), specifically the for loop below from
>>>> LeafQueue.java, and it seems to be looping through some priority list.
>>>>
>>>>     // Try to assign containers to applications in order
>>>>     for (FiCaSchedulerApp application : activeApplications) {
>>>> ...
>>>>         // Schedule in priority order
>>>>         for (Priority priority : application.getPriorities()) {
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
>>>> (0x4FE8)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>>>> (0xDFC0)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC
>>>> cycle=475576 (0x741B8)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> Thanks,
>>>> Kishore
>>>>
>>>
>>>
>>
>

Re: 100% CPU consumption by Resource Manager process

Posted by Wangda Tan <wh...@gmail.com>.
Hi Krishna,

4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
your configuration?
50

I think this config is problematic; too small a heartbeat interval will
cause the NM to contact the RM too often. I would suggest setting this value
to something larger, like 1000.
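For reference, that property is set in yarn-site.xml; a sketch of the change
suggested above (property name as quoted in this thread, 1000 ms as the
suggested value):

```xml
<!-- yarn-site.xml: raise the NM->RM heartbeat interval so node heartbeats
     (and the scheduling pass each one triggers) happen less often. -->
<property>
  <name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
  <value>1000</value>
</property>
```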

Thanks,
Wangda



On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi Wangda,
>   Thanks for the reply, here are the details, please see if you could
> suggest anything.
>
> 1) Number of nodes and running app in the cluster
> 2 nodes, and I am running my own application that keeps asking for
> containers,
> a) running something on the containers,
> b) releasing the containers,
> c) asking for more containers with an incremented priority value, and
> repeating the same process
>
> 2) What's the version of your Hadoop?
> apache hadoop-2.4.0
>
> 3) Have you set
> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
> No
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
>
>
>
> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wh...@gmail.com> wrote:
>
>> Hi Krishna,
>> To get more understanding about the problem, could you please share the
>> following information:
>> 1) Number of nodes and running app in the cluster
>> 2) What's the version of your Hadoop?
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>>
>> Thanks,
>> Wangda Tan
>>
>>
>>
>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> Hi,
>>>   My YARN resource manager is consuming 100% CPU while running an
>>> application that runs for about 10 hours and requests as many as 27000
>>> containers. The CPU consumption was very low at the start of my
>>> application, and it gradually climbed to over 100%. Is this a known
>>> issue, or are we doing something wrong?
>>>
>>> Every dump of the Event Processor thread shows it running
>>> LeafQueue::assignContainers(), specifically the for loop below from
>>> LeafQueue.java, and it seems to be looping through some priority list.
>>>
>>>     // Try to assign containers to applications in order
>>>     for (FiCaSchedulerApp application : activeApplications) {
>>> ...
>>>         // Schedule in priority order
>>>         for (Priority priority : application.getPriorities()) {
>>>
>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>>> native policy:UNKNOWN)
>>> 3XMTHREADINFO2            (native stack address range
>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
>>> (0x4FE8)
>>> 3XMTHREADINFO3           Java callstack:
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>> entry count: 1)
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 2)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>
>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>>> native policy:UNKNOWN)
>>> 3XMTHREADINFO2            (native stack address range
>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>>> (0xDFC0)
>>> 3XMTHREADINFO3           Java callstack:
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>> entry count: 1)
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 2)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>
>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>>> native policy:UNKNOWN)
>>> 3XMTHREADINFO2            (native stack address range
>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
>>> (0x741B8)
>>> 3XMTHREADINFO3           Java callstack:
>>> 4XESTACKTRACE                at
>>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>>> 4XESTACKTRACE                at
>>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>> entry count: 1)
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 2)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>
>>> Thanks,
>>> Kishore
>>>
>>
>>
>

Re: 100% CPU consumption by Resource Manager process

Posted by Wangda Tan <wh...@gmail.com>.
Hi Krishna,

4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
your configuration?
50

I think this configuration is problematic: too small a heartbeat interval
causes the NMs to contact the RM too often. I would suggest setting this
value higher, e.g. 1000.
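For reference, this setting lives in yarn-site.xml. A minimal fragment (the value is in milliseconds; 1000 is the figure suggested above, and the right value depends on cluster size):

```xml
<!-- yarn-site.xml: raise the NM -> RM heartbeat interval from 50 ms -->
<property>
  <name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
  <value>1000</value>
</property>
```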

Thanks,
Wangda



On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi Wangda,
>   Thanks for the reply, here are the details, please see if you could
> suggest anything.
>
> 1) Number of nodes and running app in the cluster
> 2 nodes, and I am running my own application that keeps asking for
> containers, then:
> a) runs something on the containers,
> b) releases the containers,
> c) asks for more containers with an incremented priority value, and
> repeats the same process
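That request pattern can be modeled with a short, self-contained sketch (plain Java, not YARN API code; the class and method names are illustrative). It shows why the scheduler's per-application priority scan grows: each round adds a new distinct priority, and the stack traces above show assignContainers iterating that sorted set (the TreeMap frames) on every node heartbeat, so the scan cost climbs with every round if old priorities are never retired.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: models how an application's set of outstanding priorities
// grows when each request round uses a fresh, incremented priority.
class PrioritySketch {
    // A TreeMap mirrors the sorted structure behind the priority iteration
    // seen in the dumps (TreeMap$KeyIterator.next inside assignContainers).
    private final Map<Integer, Integer> outstanding = new TreeMap<>();

    // Each round asks for containers at a new, higher priority value.
    void requestAtNewPriority(int priority) {
        outstanding.merge(priority, 1, Integer::sum);
    }

    // The scheduler's inner loop walks every priority on every heartbeat,
    // so its cost grows linearly with the number of distinct priorities.
    int priorityScanLength() {
        return outstanding.size();
    }

    public static void main(String[] args) {
        PrioritySketch app = new PrioritySketch();
        for (int p = 1; p <= 27000; p++) {
            app.requestAtNewPriority(p); // incremented priority per round
        }
        // After 27000 rounds, every heartbeat scans 27000 priorities.
        System.out.println(app.priorityScanLength());
    }
}
```

With a 50 ms heartbeat interval, that scan runs twenty times per second per node, which matches the gradual climb to 100% CPU described above.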
>
> 2) What's the version of your Hadoop?
> apache hadoop-2.4.0
>
> 3) Have you set
> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
> No
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
>
>
>
> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wh...@gmail.com> wrote:
>
>> Hi Krishna,
>> To get more understanding about the problem, could you please share
>> following information:
>> 1) Number of nodes and running app in the cluster
>> 2) What's the version of your Hadoop?
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>>
>> Thanks,
>> Wangda Tan
>>
>>
>>
>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>> write2kishore@gmail.com> wrote:
>>
>>> [original message with thread dumps quoted in full above; snipped]
>>
>>
>

>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>>> native policy:UNKNOWN)
>>> 3XMTHREADINFO2            (native stack address range
>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>>> (0xDFC0)
>>> 3XMTHREADINFO3           Java callstack:
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>> entry count: 1)
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 2)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>
>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>>> native policy:UNKNOWN)
>>> 3XMTHREADINFO2            (native stack address range
>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
>>> (0x741B8)
>>> 3XMTHREADINFO3           Java callstack:
>>> 4XESTACKTRACE                at
>>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>>> 4XESTACKTRACE                at
>>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>> entry count: 1)
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 2)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>> Code))
>>> 5XESTACKTRACE                   (entered lock:
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>> entry count: 1)
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>> Code))
>>> 4XESTACKTRACE                at
>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>
>>> Thanks,
>>> Kishore
>>>
>>
>>
>

Re: 100% CPU consumption by Resource Manager process

Posted by Krishna Kishore Bonagiri <wr...@gmail.com>.
Hi Wangda,
  Thanks for the reply. Here are the details; please see if you can
suggest anything.

1) Number of nodes and running app in the cluster
2 nodes, and I am running my own application that keeps asking for
containers,
a) running something on the containers,
b) releasing the containers,
c) asking for more containers with an incremented priority value, and
repeating the same process
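The effect of that request pattern on the scheduler can be sketched with a small self-contained model (this is an illustration, not the actual application or YARN code): the CapacityScheduler keeps one entry per distinct Priority in a per-application TreeMap, and assignContainers() walks all of them on every node heartbeat, so using a fresh priority on each cycle leaves one entry per cycle to scan.

```java
import java.util.TreeSet;

// Models why incrementing the priority every request cycle makes the
// scheduler's per-application priority set grow without bound.
public class PriorityGrowth {
    // Simulate N request cycles; each cycle records the priority it used.
    static int distinctPriorities(int cycles, boolean incrementEachCycle) {
        TreeSet<Integer> priorities = new TreeSet<>();
        for (int i = 0; i < cycles; i++) {
            priorities.add(incrementEachCycle ? i : 0);
        }
        return priorities.size();
    }

    public static void main(String[] args) {
        // Incrementing: 27000 cycles leave 27000 priorities to scan
        // on every heartbeat.
        System.out.println(distinctPriorities(27000, true));
        // Reusing one priority keeps the scan at a single entry.
        System.out.println(distinctPriorities(27000, false));
    }
}
```

Reusing a fixed priority (or releasing requests at priorities that are no longer needed) keeps the per-heartbeat scan constant instead of linear in the number of completed cycles.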

2) What's the version of your Hadoop?
Apache Hadoop 2.4.0

3) Have you set
"yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
No

4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
your configuration?
50
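Combined with the 50 ms heartbeat interval, these numbers give a rough sense of the scheduler load; the arithmetic below is a back-of-the-envelope illustration, not a measurement of the RM:

```java
// Rough estimate of how many priority-loop iterations the scheduler
// performs per second, given the configuration reported above.
public class SchedulerLoadEstimate {
    static long priorityScansPerSecond(int nodes, int heartbeatMs,
                                       int distinctPriorities) {
        // Each node heartbeat triggers one scheduling pass; each pass walks
        // every distinct priority the application has accumulated.
        long heartbeatsPerSecond = nodes * (1000L / heartbeatMs);
        return heartbeatsPerSecond * distinctPriorities;
    }

    public static void main(String[] args) {
        // 2 nodes, 50 ms heartbeats, 27000 accumulated priorities:
        // 40 scheduling passes/sec * 27000 entries = 1,080,000 iterations/sec.
        System.out.println(priorityScansPerSecond(2, 50, 27000));
    }
}
```

This suggests why the CPU usage climbs gradually: the per-heartbeat cost grows with every completed request cycle, while the heartbeat rate stays fixed.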




On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wh...@gmail.com> wrote:

> Hi Krishna,
> To get more understanding about the problem, could you please share
> following information:
> 1) Number of nodes and running app in the cluster
> 2) What's the version of your Hadoop?
> 3) Have you set
> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
>
> Thanks,
> Wangda Tan
>
>
>
> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi,
>>   My YARN resource manager is consuming 100% CPU while I am running an
>> application that runs for about 10 hours and requests as many as 27000
>> containers. The CPU consumption was very low at the start of my
>> application and gradually rose to over 100%. Is this a known issue,
>> or are we doing something wrong?
>>
>> Every dump of the Event Processor thread shows it running
>> LeafQueue::assignContainers(), specifically the for loop below from
>> LeafQueue.java, and it seems to be looping through some priority list.
>>
>>     // Try to assign containers to applications in order
>>     for (FiCaSchedulerApp application : activeApplications) {
>> ...
>>         // Schedule in priority order
>>         for (Priority priority : application.getPriorities()) {
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
>> (0x4FE8)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>> (0xDFC0)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
>> (0x741B8)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>> 4XESTACKTRACE                at
>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> Thanks,
>> Kishore
>>
>
>


>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>> (0xDFC0)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
>> (0x741B8)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>> 4XESTACKTRACE                at
>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> Thanks,
>> Kishore
>>
>
>

Re: 100% CPU consumption by Resource Manager process

Posted by Wangda Tan <wh...@gmail.com>.
Hi Krishna,
To get more understanding about the problem, could you please share
following information:
1) Number of nodes and running app in the cluster
2) What's the version of your Hadoop?
3) Have you set
"yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
your configuration?

Thanks,
Wangda Tan



On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
write2kishore@gmail.com> wrote:

> Hi,
>   My YARN resource manager is consuming 100% CPU when I am running an
> application that is running for about 10 hours, requesting as many as 27000
> containers. The CPU consumption was very low at the start of my
> application, and it gradually climbed to over 100%. Is this a known issue
> or are we doing something wrong?
>
> Every dump shows the Event Processor thread running
> LeafQueue::assignContainers(), specifically the for loop below from
> LeafQueue.java, and it seems to be looping through some priority list.
>
>     // Try to assign containers to applications in order
>     for (FiCaSchedulerApp application : activeApplications) {
> ...
>         // Schedule in priority order
>         for (Priority priority : application.getPriorities()) {
>
> 3XMTHREADINFO      "ResourceManager Event Processor"
> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
> native policy:UNKNOWN)
> 3XMTHREADINFO2            (native stack address range
> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
> (0x4FE8)
> 3XMTHREADINFO3           Java callstack:
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
> entry count: 1)
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 2)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>
> 3XMTHREADINFO      "ResourceManager Event Processor"
> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
> native policy:UNKNOWN)
> 3XMTHREADINFO2            (native stack address range
> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
> (0xDFC0)
> 3XMTHREADINFO3           Java callstack:
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
> entry count: 1)
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 2)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>
> 3XMTHREADINFO      "ResourceManager Event Processor"
> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
> native policy:UNKNOWN)
> 3XMTHREADINFO2            (native stack address range
> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
> (0x741B8)
> 3XMTHREADINFO3           Java callstack:
> 4XESTACKTRACE                at
> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
> 4XESTACKTRACE                at
> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
> Code))
> 4XESTACKTRACE                at
> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
> entry count: 1)
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 2)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>
> Thanks,
> Kishore
>

> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>
> 3XMTHREADINFO      "ResourceManager Event Processor"
> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
> native policy:UNKNOWN)
> 3XMTHREADINFO2            (native stack address range
> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
> (0xDFC0)
> 3XMTHREADINFO3           Java callstack:
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
> entry count: 1)
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 2)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>
> 3XMTHREADINFO      "ResourceManager Event Processor"
> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
> native policy:UNKNOWN)
> 3XMTHREADINFO2            (native stack address range
> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
> (0x741B8)
> 3XMTHREADINFO3           Java callstack:
> 4XESTACKTRACE                at
> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
> 4XESTACKTRACE                at
> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
> Code))
> 4XESTACKTRACE                at
> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
> entry count: 1)
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 2)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
> Code))
> 5XESTACKTRACE                   (entered lock:
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
> entry count: 1)
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
> Code))
> 4XESTACKTRACE                at
> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>
> Thanks,
> Kishore
>
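
The nested loops quoted in the original message can be sketched as follows. This is a hypothetical model, not the actual YARN classes: it only counts loop iterations to show why a workload that accumulates many distinct priorities per application makes every scheduling pass (one per node heartbeat) progressively more expensive, which would match the gradual CPU climb Kishore describes.

```java
import java.util.*;

// Hypothetical sketch of the nested loops in LeafQueue.assignContainers:
// outer loop over active applications, inner loop over each application's
// priorities. Real code attempts a container assignment per iteration;
// here we only count iterations to expose the cost per scheduling pass.
public class SchedulerLoopSketch {
    public static int assignPass(Map<String, SortedSet<Integer>> activeApps) {
        int iterations = 0;
        // Try to assign containers to applications in order
        for (Map.Entry<String, SortedSet<Integer>> app : activeApps.entrySet()) {
            // Schedule in priority order
            for (int priority : app.getValue()) {
                iterations++; // real code would try an assignment here
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        Map<String, SortedSet<Integer>> apps = new LinkedHashMap<>();
        // If a single app has accumulated 27000 distinct priorities
        // (e.g. one per requested container), the inner loop walks all
        // 27000 entries on every node heartbeat, even when nothing can
        // be assigned.
        SortedSet<Integer> priorities = new TreeSet<>();
        for (int p = 0; p < 27000; p++) {
            priorities.add(p);
        }
        apps.put("app_0001", priorities);
        System.out.println(assignPass(apps)); // 27000
    }
}
```

Under this model, the work per heartbeat is O(apps × priorities per app), so keeping the number of distinct outstanding priorities small keeps each pass cheap.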
