You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@storm.apache.org by "Stuart (JIRA)" <ji...@apache.org> on 2017/12/13 12:18:00 UTC

[jira] [Created] (STORM-2853) Deactivated topologies cause high cpu utilization

Stuart created STORM-2853:
-----------------------------

             Summary: Deactivated topologies cause high cpu utilization
                 Key: STORM-2853
                 URL: https://issues.apache.org/jira/browse/STORM-2853
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-core
    Affects Versions: 1.1.0
            Reporter: Stuart
            Priority: Minor


The issue is there is high cpu usage for deactivated apache storm topologies.  I can reliably re-create the issue using the steps below but I haven't identified the exact cause or a solution yet.

The environment is a storm cluster on which 1 topology is running (The topology is extremely simple, I used the exclamation example).  It is INACTIVE.  Initially there is normal CPU usage.  However, when I kill all topology JVM processes on all supervisors and let Storm restart them again, I find that some time later (~15 minutes) the CPU usage per JVM process rises to nearly 100%.  I have tested an ACTIVE topology and this does not happen with it.  I have also tested more than one topology and observe the same results when they're in the INACTIVE state.

***Steps to re-create:***

 1. Run 1 topology on an Apache Storm cluster
 2. Deactivate it
 3. Kill **all** topology JVM processes on all supervisors (Storm will restart them)
 4. Observe the CPU usage on Supervisors rise to nearly 100% for all **INACTIVE** topology JVM processes.

***Environment***

Apache Storm 1.1.0 running on 3 VMs, one nimbus and 2 supervisors.

Cluster Summary:

 - Supervisors: 2 
 - Used Slots: 2 
 - Available Slots: 38 
 - Total Slots: 40
 - Executors: 50 
 - Tasks: 50

the topology has 2 workers and 50 executors/tasks (threads).


***Investigation so far:***

Apart from being able to reliably re-create the issue, I have identified, for the affected topology JVM process, the threads using the most CPU.  There are 102 threads total in the process, 97 blocked, 5 IN_NATIVE.  The threads using the most CPU are identical and there are 23 of them (all in BLOCKED state):

    Thread 28558: (state = BLOCKED)
     - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
     - java.util.concurrent.locks.LockSupport.parkNanos(long) @bci=11, line=338 (Compiled frame)
     - com.lmax.disruptor.MultiProducerSequencer.next(int) @bci=82, line=136 (Compiled frame)
     - com.lmax.disruptor.RingBuffer.next(int) @bci=5, line=260 (Interpreted frame)
     - org.apache.storm.utils.DisruptorQueue.publishDirect(java.util.ArrayList, boolean) @bci=18, line=517 (Interpreted frame)
     - org.apache.storm.utils.DisruptorQueue.access$1000(org.apache.storm.utils.DisruptorQueue, java.util.ArrayList, boolean) @bci=3, line=61 (Interpreted frame)
     - org.apache.storm.utils.DisruptorQueue$ThreadLocalBatcher.flush(boolean) @bci=50, line=280 (Interpreted frame)
     - org.apache.storm.utils.DisruptorQueue$Flusher.run() @bci=55, line=303 (Interpreted frame)
     - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
     - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
     - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Compiled frame)
     - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
     - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)


I identified this thread by using `jstack` to get a thread dump for the process:
 

    jstack -F <pid> > jstack<pid>.txt

and `top` to identify the threads within the process using the most CPU:

    top -H -p <pid> 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)