Posted to users@tomcat.apache.org by Kirill Kireyev <ki...@instagrok.com> on 2012/09/27 17:38:36 UTC

high CPU usage on tomcat 7

Hi!

I'm periodically getting unduly high (100%) CPU usage by the tomcat 
process on my server. This problem happens intermittently, several 
times a week. When the server goes into this high-CPU state it does not 
come back (and becomes unresponsive to new requests), and the only 
recourse is to restart the tomcat process.

I'm using Tomcat 7.0.30, with APR and the Apache web server, on an Ubuntu 
11.10 server with 32G of RAM and 8 CPUs.

I've taken several jstack stack traces when this occurs, and what I 
consistently see are the connector threads in the RUNNABLE state every 
time, i.e.:

ajp-apr-8009-Acceptor-0" daemon prio=10 tid=0x00000000010a1000 nid=0x539 
runnable [0x00007f9364f8e000]
    java.lang.Thread.State: RUNNABLE
         at org.apache.tomcat.jni.Socket.accept(Native Method)
         at 
org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
         at java.lang.Thread.run(Thread.java:722)

"http-apr-8443-Acceptor-0" daemon prio=10 tid=0x000000000109b800 
nid=0x535 runnable [0x00007f9365510000]
    java.lang.Thread.State: RUNNABLE
         at org.apache.tomcat.jni.Socket.accept(Native Method)
         at 
org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
         at java.lang.Thread.run(Thread.java:722)

"http-apr-8080-Acceptor-0" daemon prio=10 tid=0x00000000015ab000 
nid=0x531 runnable [0x00007f9365a92000]
    java.lang.Thread.State: RUNNABLE
         at org.apache.tomcat.jni.Socket.accept(Native Method)
         at 
org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
         at java.lang.Thread.run(Thread.java:722)

Other threads are in RUNNABLE too in different cases, but these are the 
ones that are always there when the high CPU occurs. That's why I'm 
starting to think it has something to do with Tomcat.

Can anyone shed some light on this? My current Connector configurations 
in server.xml are:
  <Connector port="8080" 
protocol="org.apache.coyote.http11.Http11AprProtocol"
                connectionTimeout="20000"
                maxThreads="500" minSpareThreads="10" maxSpareThreads="20"
                redirectPort="8443"
                pollTime="100000" />
...
<Connector port="8443" 
protocol="org.apache.coyote.http11.Http11AprProtocol"
             maxThreads="200" scheme="https" secure="true" 
SSLEnabled="true"
             SSLCACertificateFile="********"
             SSLCertificateKeyFile="********"
             SSLCertificateFile="*******"
             enableLookups="false" clientAuth="false" sslProtocol="TLS"
             pollTime="100000" />
...
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443"
            acceptCount="100" connectionTimeout="5000" 
keepAliveTimeout="20000"
            disableUploadTimeout="true" enableLookups="false"
            maxHttpHeaderSize="8192"
            maxSpareThreads="75" maxThreads="150"
            minSpareThreads="25"
            executor="default" />

Thanks a lot!
-Kirill
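
A note on method: top can show per-thread CPU (top -H) with thread IDs in
decimal, and jstack reports the same IDs in hex as "nid=0x...", so the two
snapshots can be joined mechanically. A minimal sketch, assuming a single
Tomcat JVM started via the standard Bootstrap class, JDK tools on the PATH,
and top's usual column layout (the file names and the pgrep pattern are
illustrative):

----------------map_hot_threads.sh (illustrative sketch)-----------
#!/bin/sh
# Locate the Tomcat JVM (adjust the pattern to your startup command)
PID=`pgrep -f org.apache.catalina.startup.Bootstrap | head -1`

# One batch snapshot of per-thread CPU, one snapshot of the Java stacks
top -b -H -n 1 -p $PID > top_threads.txt
jstack $PID > jstack.txt

# Take the five busiest thread IDs (column 1 = TID, column 9 = %CPU in
# most top builds) and look each one up in the dump by its hex "nid"
for TID in `awk '$1 ~ /^[0-9]+$/ {print $9, $1}' top_threads.txt | sort -rn | head -5 | awk '{print $2}'`
do
    HEX=`printf '0x%x' $TID`
    grep "nid=$HEX" jstack.txt
done
----------------------------------------------------

The Perl script later in this thread automates the same correlation, with
heap dumps added.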



Re: high CPU usage on tomcat 7

Posted by Shanti Suresh <sh...@umich.edu>.
Hi Kirill,

I was mistaken in thinking that the CPU issue clears itself.  Sorry.  It
may or may not be related to garbage-collection settings, then.

                       -Shanti

On Thu, Sep 27, 2012 at 2:17 PM, Shanti Suresh <sh...@umich.edu> wrote:

> Hi Kirill,
>
> Like Mark, Bill and Jeff said, those threads are normal request-processing
> threads.  I have included a script that might help with isolating high CPU
> issues with Tomcat.
>
> Also, I think it might be helpful to see how the Java heap is performing
> as well.
> Please bring up Jconsole and let it run over the week.  Inspect the graphs
> for Memory, CPU and threads.  Since you say that high CPU occurs
> intermittently several times during the week and clears itself, I wonder if
> it is somehow related with the garbage collection options you are using for
> the server.  Or it may be a code-related problem.
>
> Things to look at may include:
>
> (1) Are high CPU times related to Java heap reductions happening at the
> same time?  ==> GC possibly needs tuning
> (2) Are high CPU times related to increase in thread usage?  ==> possible
> livelock in looping code?
> (3) How many network connections come into the Tomcat server during
> high-CPU times?  Possibly overload-related?
>
> Here is the script.  I made a couple of small changes, e.g. changing
> the username, but I didn't test it after the change.  During high-CPU times,
> invoke the script a few times, say 30 seconds apart.  And then compare the
> thread-dumps.  I like to use TDA for thread-dump analysis of Tomcat
> thread-dumps.
>
> Mark, et al, please feel free to help me refine this script.  I would like
> to have a script to catch STUCK threads too :-)  Let me know if anyone has
> a script already.  Thanks.
>
> --------------high_cpu_diagnostics.pl:-----
> #!/usr/bin/perl
> #
>
> use Cwd;
>
> # Make a dated directory for capturing current diagnostics
> my ($sec,$min,$hour,$mday,$mon,$year,
>           $wday,$yday,$isdst) = localtime time;
> $year += 1900;
> $mon += 1;
> my $pwd = cwd();
> my $preview_diag_dir =
> "/tmp/Preview_Diag.$year-$mon-$mday-$hour:$min:$sec";
> print "$preview_diag_dir\n";
> mkdir $preview_diag_dir, 0755;
> chdir($preview_diag_dir) or die "Can't chdir into $preview_diag_dir $!\n";
>
> # Capture Preview thread dump
> my $process_pattern = "preview";
> my $preview_pid = `/usr/bin/pgrep -f $process_pattern`;
> chomp $preview_pid;    # strip the trailing newline that pgrep emits
> my $login = getpwuid($<) ;
> if (kill 0, $preview_pid){
>         #Possible to send a signal to the Preview Tomcat - either "webinf"
> or "root"
>         my $count = kill 3, $preview_pid;
> }else {
>         # Not possible to send a signal to the VCM - use "sudo"
>         system ("/usr/bin/sudo /bin/kill -3 $preview_pid");
> }
>
> # Capture Preview thread dump
> system ("/usr/bin/jmap
> -dump:format=b,file=$preview_diag_dir/preview_heapdump.hprof $preview_pid");
>
> # Gather the top threads; keep around for reference on what other threads
> are running
> @top_cmd = ("/usr/bin/top",  "-H", "-n1", "-b");
> system("@top_cmd 1> top_all_threads.log");
>
> # Get your tomcat user's threads, i.e. threads of user, "webinf",
> # sorted by the %CPU column (column 9 of top's output)
> system('/usr/bin/tail -n+6 top_all_threads.log | /bin/sort -r -n -k "9,9" | /bin/grep webinf 1> top_user_webinf_threads.log');
>
> # Get the thread dump
> my @output=`/usr/bin/jstack -l ${preview_pid}`;
> open (my $file, '>', 'preview_threaddump.txt') or die "Could not open
> file: $!";
> print $file @output;
> close $file;
>
> open LOG, "top_user_webinf_threads.log" or die $!;
> open (STDOUT, "| tee -ai top_cpu_preview_threads.log");
> print "PID\tCPU\tMem\tJStack Info\n";
> while ($l = <LOG>) {
>     chop $l;
>     $pid = $l;
>     $pid =~ s/webinf.*//g;
>     $pid =~ s/ *//g;
> ## The hex thread ID ("nid") is available in the Sun HotSpot stack trace
>     $hex_pid = sprintf("%#x", $pid);
>     @values = split(/\s+/, $l);
>     $pct = $values[8];
>     $mem = $values[9];
> # Debugger breakpoint:
> $DB::single = 1;
>
> # Find the Java thread that corresponds to the thread-id from the TOP
> output
>     for my $j (@output) {
>         chop $j;
>         ($j =~ m/nid=$hex_pid/)  &&  print $hex_pid . "\t" . $pct . "\t" .
> $mem . "\t" .  $j . "\n";
>     }
> }
>
> close (STDOUT);
>
> close LOG;
>
>
> ------------------end of script --------------------------
>
> Thanks.
>
>                       -Shanti
>
>
> On Thu, Sep 27, 2012 at 2:11 PM, Bill Miller <
> millebi.subscriptions@gmail.com> wrote:
>
>> I agree; we have reproducible instances where PermGen is not set to our
>> requirements on the Tomcat startup parameters and it will cause a "lockup"
>> every time. Do some JMX monitoring and you may discover a memory spike
>> that's killing Tomcat.
>>
>> Bill
>> -----Original Message-----
>> From: Jeff MAURY [mailto:jeffmaury@gmail.com]
>> Sent: September-27-2012 2:01 PM
>> To: Tomcat Users List
>> Subject: Re: high CPU usage on tomcat 7
>>
>> This is probably due to out of memory, I have the same problem on my
>> ubuntu
>> ci machine Did you monitor your tomcat with jmx ?
>>
>> Jeff
>>
>>
>>
>>
>
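
Bill's and Jeff's JMX suggestion requires the JMX remote properties to be
set before JConsole or VisualVM can attach from another machine.  A minimal
sketch of the stock HotSpot properties, typically placed in
$CATALINA_BASE/bin/setenv.sh (the port is an arbitrary example; leaving
authentication and SSL off is only reasonable on a trusted network or
through an SSH tunnel):

----------------setenv.sh (illustrative sketch)-----------
#!/bin/sh
# Appended to Tomcat's startup options by catalina.sh
CATALINA_OPTS="$CATALINA_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=9090 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"
export CATALINA_OPTS
----------------------------------------------------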

Re: high CPU usage on tomcat 7

Posted by Shanti Suresh <sh...@umich.edu>.
Hi Kirill,

Please feel free to obfuscate username, service name, filesystem directory
names, temp-dir names, etc. in the command-line output.  The most
interesting things will be the GC-related options, JVM heap-sizing
options, class-loading options, and other Java options.  But please
paste the whole command line from the "ps" command.

Thanks.

                            -Shanti
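
One way to capture that command line without truncation (the pgrep pattern
is an assumption; -ww keeps ps from clipping the often very long java
invocation, and on a HotSpot JDK jinfo reports the effective VM flags too):

--------capture_jvm_cmdline.sh (illustrative sketch)--------
#!/bin/sh
PID=`pgrep -f org.apache.catalina.startup.Bootstrap | head -1`
ps -fww -p $PID        # full, untruncated command line
jinfo -flags $PID      # effective VM flags as HotSpot sees them
----------------------------------------------------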

On Wed, Oct 3, 2012 at 11:10 AM, Shanti Suresh <sh...@umich.edu> wrote:

> Well,
>
> Here it is:
>
> ----------------start_jconsole.sh-----------
> #!/bin/sh
>
> export DISPLAY
>
>
> DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 'm#localhost:(\d+).0#
> && print $1'`
> echo $DISPLAY_NUMBER
>
> XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
> sudo su webinf -c "xauth add $XAUTH_ENTRY; /usr/bin/jconsole
> -J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java
> /lib/tools.jar"
>
> ----------------------------------------------------
>
>
>
> On Wed, Oct 3, 2012 at 11:07 AM, Shanti Suresh <sh...@umich.edu> wrote:
>
>> Hi Kirill,
>>
>> I am amazed at the size of the Java Heap of 26 GB.  I didn't expect that
>> big a size.  Before I suggest anything else, please take a look at the
>> filesystem space on "/tmp".  The script creates a heapdump every time it
>> runs as well.  So please make sure that "/tmp/" has sufficient storage.
>> Otherwise it may bring your service down.
>>
>> (1) Could you please paste your JVM command-line arguments here?
>> (2) Also, did you have a chance to observe the JConsole graphs?   I have
>> attached a sample JConsole graph here.
>> (3) What JDK version are you using?
>> (4) GC for a Heap of that size may need some tuning.
>>
>> You may start JConsole as follows on Linux/Unix.
>> a. Place the following script in your home directory on the Tomcat
>> app-server machine.
>> b. Make sure that you have enabled Tomcat to allow JMX access.
>> c. Then log-in from a remote laptop/desktop into the server (using ssh),
>> and invoke this script from your home-directory:
>> --------start_jconsole.sh------
>> DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 'm#localhost:(\d+).0#
>> && print $1'`
>> echo $DISPLAY_NUMBER
>> XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
>> /usr/bin/jconsole
>> -J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java/lib/tools.jar
>> -------------------------------
>>
>>
>> [image: Inline image 1]
>>
>> Thanks.
>>
>>                          -Shanti
>>
>>
>> On Tue, Oct 2, 2012 at 7:43 PM, Kirill Kireyev <ki...@instagrok.com>wrote:
>>
>>>  Hi Shanti! Thanks for your script! I ran it (with minor
>>> modifications), and it turns out that the garbage collection threads are
>>> indeed taking up the majority of CPU (see below). What would you recommend
>>> I do to tune/troubleshoot GC further? My server has 32G of RAM, and the
>>> JVM has a heap size of 26G.
>>>
>>> Thanks,
>>> Kirill
>>>
>>> PID    CPU    Mem    JStack Info
>>> 0x253d    80    90.7    "GC task thread#3 (ParallelGC)" prio=10
>>> tid=0x0000000001938800 nid=0x253d runnable
>>> 0x2540    76    90.7    "GC task thread#6 (ParallelGC)" prio=10
>>> tid=0x000000000193e000 nid=0x2540 runnable
>>> 0x253b    74    90.7    "GC task thread#1 (ParallelGC)" prio=10
>>> tid=0x0000000001934800 nid=0x253b runnable
>>> 0x253e    74    90.7    "GC task thread#4 (ParallelGC)" prio=10
>>> tid=0x000000000193a000 nid=0x253e runnable
>>> 0x2541    74    90.7    "GC task thread#7 (ParallelGC)" prio=10
>>> tid=0x000000000193f800 nid=0x2541 runnable
>>> 0x253c    66    90.7    "GC task thread#2 (ParallelGC)" prio=10
>>> tid=0x0000000001936800 nid=0x253c runnable
>>> 0x253a    47    90.7    "GC task thread#0 (ParallelGC)" prio=10
>>> tid=0x0000000001933000 nid=0x253a runnable
>>> 0x253f    39    90.7    "GC task thread#5 (ParallelGC)" prio=10
>>> tid=0x000000000193c000 nid=0x253f runnable
>>> 0x2542    19    90.7    "VM Thread" prio=10 tid=0x00000000019b6000
>>> nid=0x2542 runnable
>>> 0x2539    0    90.7    "main" prio=10 tid=0x0000000001925000 nid=0x2539
>>> runnable [0x00007fee9d444000]
>>> 0x2543    0    90.7    "Reference Handler" daemon prio=10
>>> tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000]
>>> 0x2544    0    90.7    "Finalizer" daemon prio=10 tid=0x00000000019c5800
>>> nid=0x2544 in Object.wait() [0x00007fee4ffd7000]
>>> 0x2545    0    90.7    "Signal Dispatcher" daemon prio=10
>>> tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000]
>>> 0x2546    0    90.7    "C2 CompilerThread0" daemon prio=10
>>> tid=0x00007fee48001000 nid=0x2546 waiting on condition [0x0000000000000000]
>>> 0x2547    0    90.7    "C2 CompilerThread1" daemon prio=10
>>> tid=0x00007fee48004000 nid=0x2547 waiting on condition [0x0000000000000000]
>>> 0x2548    0    90.7    "Service Thread" daemon prio=10
>>> tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000]
>>> 0x2549    0    90.7    "VM Periodic Task Thread" prio=10
>>> tid=0x0000000001a1f800 nid=0x2549 waiting on condition
>>> 0x254a    0    90.7    "GC Daemon" daemon prio=10 tid=0x00007fee404b3000
>>> nid=0x254a in Object.wait() [0x00007fee4e1bd000]
>>> 0x254d    0    90.7
>>> "ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=10
>>> tid=0x00007fee40389800 nid=0x254d runnable [0x00007fee4d820000]
>>> 0x254e    0    90.7    "http-apr-8080-Poller-0" daemon prio=10
>>> tid=0x00007fee40378000 nid=0x254e in Object.wait() [0x00007fee4d661000]
>>> 0x254f    0    90.7    "http-apr-8080-CometPoller-0" daemon prio=10
>>> tid=0x00007fee40395000 nid=0x254f in Object.wait() [0x00007fee4d4a2000]
>>> 0x2550    0    90.7    "http-apr-8080-Sendfile-0" daemon prio=10
>>> tid=0x00007fee40397000 nid=0x2550 in Object.wait() [0x00007fee4d2b0000]
>>> 0x2551    0    90.7    "http-apr-8080-Acceptor-0" daemon prio=10
>>> tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000]
>>> 0x2552    0    90.7    "http-apr-8080-AsyncTimeout" daemon prio=10
>>> tid=0x00007fee4072e000 nid=0x2552 sleeping[0x00007fee4d0ac000]
>>> 0x2553    0    90.7    "http-apr-8443-Poller-0" daemon prio=10
>>> tid=0x00007fee40730000 nid=0x2553 in Object.wait() [0x00007fee4cfab000]
>>> 0x2554    0    90.7    "http-apr-8443-CometPoller-0" daemon prio=10
>>> tid=0x00007fee40732000 nid=0x2554 in Object.wait() [0x00007fee4cdee000]
>>> 0x2555    0    90.7    "http-apr-8443-Acceptor-0" daemon prio=10
>>> tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000]
>>> 0x2556    0    90.7    "http-apr-8443-AsyncTimeout" daemon prio=10
>>> tid=0x00007fee4073b800 nid=0x2556 sleeping[0x00007fee4cbec000]
>>> 0x2557    0    90.7    "ajp-apr-8009-Poller-0" daemon prio=10
>>> tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000]
>>> 0x2558    0    90.7    "ajp-apr-8009-CometPoller-0" daemon prio=10
>>> tid=0x00007fee4073f800 nid=0x2558 in Object.wait() [0x00007fee4c872000]
>>> 0x2559    0    90.7    "ajp-apr-8009-Acceptor-0" daemon prio=10
>>> tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000]
>>> 0x255a    0    90.7    "ajp-apr-8009-AsyncTimeout" daemon prio=10
>>> tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000]
>>> 0x255f    0    90.7    "ajp-apr-8009-exec-5" daemon prio=10
>>> tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000]
>>> 0x2560    0    90.7    "ajp-apr-8009-exec-6" daemon prio=10
>>> tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000]
>>> 0x2561    0    90.7    "ajp-apr-8009-exec-7" daemon prio=10
>>> tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000]
>>> 0x2562    0    90.7    "ajp-apr-8009-exec-8" daemon prio=10
>>> tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000]
>>> 0x2563    0    90.7    "ajp-apr-8009-exec-9" daemon prio=10
>>> tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000]
>>> 0x2564    0    90.7    "ajp-apr-8009-exec-10" daemon prio=10
>>> tid=0x0000000001cfe800 nid=0x2564 waiting on condition [0x00007fee47dfa000]
>>> 0x2586    0    90.7    "ajp-apr-8009-exec-12" daemon prio=10
>>> tid=0x0000000001e85000 nid=0x2586 waiting on condition [0x00007fee47bf6000]
>>> 0x2587    0    90.7    "pool-1-thread-1" prio=10 tid=0x0000000001fe4000
>>> nid=0x2587 waiting on condition [0x00007fee47af3000]
>>> 0x2635    0    90.7    "ajp-apr-8009-exec-15" daemon prio=10
>>> tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000]
>>> 0x2637    0    90.7    "ajp-apr-8009-exec-17" daemon prio=10
>>> tid=0x0000000002392800 nid=0x2637 waiting on condition [0x00007fee38790000]
>>> 0x2638    0    90.7    "ajp-apr-8009-exec-18" daemon prio=10
>>> tid=0x0000000002395000 nid=0x2638 waiting on condition [0x00007fee3868f000]
>>> 0x263a    0    90.7    "ajp-apr-8009-exec-20" daemon prio=10
>>> tid=0x0000000002312800 nid=0x263a waiting on condition [0x00007fee3848b000]
>>> 0x263d    0    90.7    "ajp-apr-8009-exec-23" daemon prio=10
>>> tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000]
>>> 0x50b    0    90.7    "http-apr-8443-exec-1" daemon prio=10
>>> tid=0x0000000002209000 nid=0x50b waiting on condition [0x00007fee232b1000]
>>> 0x50c    0    90.7    "http-apr-8443-exec-2" daemon prio=10
>>> tid=0x000000000213c000 nid=0x50c waiting on condition [0x00007fee233b2000]
>>> 0x50d    0    90.7    "http-apr-8443-exec-3" daemon prio=10
>>> tid=0x000000000239e000 nid=0x50d waiting on condition [0x00007fee22baa000]
>>> 0x50e    0    90.7    "http-apr-8443-exec-4" daemon prio=10
>>> tid=0x0000000002aeb800 nid=0x50e waiting on condition [0x00007fee3939d000]
>>> 0x50f    0    90.7    "http-apr-8443-exec-5" daemon prio=10
>>> tid=0x0000000002ad3000 nid=0x50f waiting on condition [0x00007fee38f99000]
>>> 0x510    0    90.7    "http-apr-8443-exec-6" daemon prio=10
>>> tid=0x00000000025a1800 nid=0x510 waiting on condition [0x00007fee397a1000]
>>> 0x511    0    90.7    "http-apr-8443-exec-7" daemon prio=10
>>> tid=0x0000000002188800 nid=0x511 waiting on condition [0x00007fee45db8000]
>>> 0x512    0    90.7    "http-apr-8443-exec-8" daemon prio=10
>>> tid=0x00000000023b3800 nid=0x512 waiting on condition [0x00007fee44aa5000]
>>> 0x513    0    90.7    "http-apr-8443-exec-9" daemon prio=10
>>> tid=0x0000000002164800 nid=0x513 waiting on condition [0x00007fee3dae4000]
>>> 0x514    0    90.7    "http-apr-8443-exec-10" daemon prio=10
>>> tid=0x0000000002af6800 nid=0x514 waiting on condition [0x00007fee279f8000]
>>> 0x19a7    0    90.7    "http-apr-8080-exec-1" daemon prio=10
>>> tid=0x0000000001ac2800 nid=0x19a7 waiting on condition [0x00007fee3747b000]
>>> 0x19a8    0    90.7    "http-apr-8080-exec-2" daemon prio=10
>>> tid=0x000000000257d000 nid=0x19a8 waiting on condition [0x00007fee3767d000]
>>> 0x28b0    0    90.7    "http-apr-8080-exec-3" daemon prio=10
>>> tid=0x000000000274c000 nid=0x28b0 waiting on condition [0x00007fee3f8f7000]
>>> 0x28b1    0    90.7    "http-apr-8080-exec-4" daemon prio=10
>>> tid=0x00000000021c8800 nid=0x28b1 waiting on condition [0x00007fee3adb7000]
>>> 0x46e7    0    90.7    "Keep-Alive-Timer" daemon prio=10
>>> tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000]
>>> 0x29c5    0    90.7    "Keep-Alive-SocketCleaner" daemon prio=10
>>> tid=0x000000000242b000 nid=0x29c5 in Object.wait() [0x00007fee1f877000]
>>> 0x6ef8    0    90.7    "http-apr-8080-exec-5" daemon prio=10
>>> tid=0x0000000002a56000 nid=0x6ef8 waiting on condition [0x00007fee454af000]
>>> 0x6ef9    0    90.7    "http-apr-8080-exec-6" daemon prio=10
>>> tid=0x00000000022de800 nid=0x6ef9 waiting on condition [0x00007fee45ab5000]
>>> 0x3d38    0    90.7    "http-apr-8080-exec-7" daemon prio=10
>>> tid=0x0000000002233000 nid=0x3d38 waiting on condition [0x00007fee37980000]
>>> 0x3d39    0    90.7    "http-apr-8080-exec-8" daemon prio=10
>>> tid=0x00000000021f7800 nid=0x3d39 waiting on condition [0x00007fee38087000]
>>> 0x64a3    0    90.7    "ajp-apr-8009-exec-36" daemon prio=10
>>> tid=0x00000000022dd000 nid=0x64a3 waiting on condition [0x00007fee35158000]
>>> 0x64a6    0    90.7    "ajp-apr-8009-exec-39" daemon prio=10
>>> tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000]
>>> 0x64a7    0    90.7    "ajp-apr-8009-exec-40" daemon prio=10
>>> tid=0x00000000029c0800 nid=0x64a7 waiting on condition [0x00007fee24fce000]
>>> 0x64ab    0    90.7    "ajp-apr-8009-exec-44" daemon prio=10
>>> tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000]
>>> 0x64ac    0    90.7    "ajp-apr-8009-exec-45" daemon prio=10
>>> tid=0x0000000002555800 nid=0x64ac waiting on condition [0x00007fee256d5000]
>>> 0x64af    0    90.7    "ajp-apr-8009-exec-48" daemon prio=10
>>> tid=0x0000000001f8a800 nid=0x64af waiting on condition [0x00007fee24ccb000]
>>> 0x64b0    0    90.7    "ajp-apr-8009-exec-49" daemon prio=10
>>> tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000]
>>> 0x64b1    0    90.7    "ajp-apr-8009-exec-50" daemon prio=10
>>> tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000]
>>> 0x64b2    0    90.7    "ajp-apr-8009-exec-51" daemon prio=10
>>> tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000]
>>> 0x64b3    0    90.7    "ajp-apr-8009-exec-52" daemon prio=10
>>> tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000]
>>> 0xf1b    0    90.7    "http-apr-8080-exec-9" daemon prio=10
>>> tid=0x0000000001c9e800 nid=0xf1b waiting on condition [0x00007fee3929c000]
>>> 0xf1c    0    90.7    "http-apr-8080-exec-10" daemon prio=10
>>> tid=0x0000000001e4c000 nid=0xf1c waiting on condition [0x00007fee3d8e2000]
>>> 0x3eda    0    90.7    "ajp-apr-8009-exec-55" daemon prio=10
>>> tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000]
>>> 0x3edb    0    90.7    "ajp-apr-8009-exec-56" daemon prio=10
>>> tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000]
>>> 0x51c7    0    90.7    "ajp-apr-8009-exec-57" daemon prio=10
>>> tid=0x000000000211c800 nid=0x51c7 waiting on condition [0x00007fee37c83000]
>>>
>>>
>>>
>>> On 9/27/12 12:17 PM, Shanti Suresh wrote:
>>> [Shanti's Sep 27 message with the high_cpu_diagnostics.pl script, and
>>> the replies from Bill Miller and Jeff MAURY, all quoted in full earlier
>>> in this thread, trimmed]
>>
>>
>
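
With the ParallelGC worker threads dominating the CPU in the dump above,
the natural next step is a GC log, which records how often collections run,
how long each pause lasts, and how much memory is reclaimed.  A sketch
using the classic HotSpot (pre-JDK 9) logging flags, added alongside the
existing startup options (the log path is an example):

----------------gc logging options (illustrative sketch)-----------
#!/bin/sh
# In $CATALINA_BASE/bin/setenv.sh
CATALINA_OPTS="$CATALINA_OPTS \
  -verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -Xloggc:$CATALINA_BASE/logs/gc.log"
export CATALINA_OPTS
----------------------------------------------------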

Re: high CPU usage on tomcat 7

Posted by Shanti Suresh <sh...@umich.edu>.
Well,

Here it is:

----------------start_jconsole.sh-----------
#!/bin/sh

export DISPLAY

DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 'm#localhost:(\d+).0#
&& print $1'`
echo $DISPLAY_NUMBER

XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
sudo su webinf -c "xauth add $XAUTH_ENTRY; /usr/bin/jconsole
-J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java
/lib/tools.jar"

----------------------------------------------------
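
For context, the script assumes an X11-forwarded SSH session: the xauth
lines copy the forwarded display's cookie so the "webinf" user can open
JConsole on it.  Typical use (host and login are placeholders):

--------usage (illustrative)--------
ssh -X yourlogin@tomcat-host
./start_jconsole.sh
------------------------------------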


On Wed, Oct 3, 2012 at 11:07 AM, Shanti Suresh <sh...@umich.edu> wrote:

> [quoted copies of earlier messages in this thread trimmed]

Re: high CPU usage on tomcat 7

Posted by Jeff MAURY <je...@jeffmaury.com>.
Aren't your clients polling the server? That may cause the session not to
expire even if the user is not using the UI.

Jeff


On Sat, Oct 6, 2012 at 9:42 PM, Mark Thomas <ma...@apache.org> wrote:

>
>
> Kirill Kireyev <ki...@instagrok.com> wrote:
>
> >Thanks for all your thorough advice Shanti! (and everyone else)
> >
> >Here are my findings so far:
> >0) My servlet sessions store a large amount (~10s of MB) of data in RAM.
> >
> >This is by design, to optimize performance. I can also have ~3K active
> >sessions at any one time. Hence a large heap size.
> >1) When I (1) manually expire inactive sessions through Tomcat web
> >interface and (2) manually hit "Perform GC" through jvisualvm console
> >attached to the tomcat process, everything works great, and the memory
> >is successfully reclaimed, and the used heap size drops back to what
> >it
> >should be when the application initializes.
> >2) However, this doesn't seem to work automatically. More specifically:
> >     a) Sessions are not expiring without manually doing it (I can see
> >the number growing in the Tomcat web interface). Even though my
> >conf/web.xml says:
> >     <session-config>
> >         <session-timeout>20</session-timeout>
> >     </session-config>
> >    b) A full garbage collection is not being performed unless I do it
> >manually. I'm attaching the GC records from my logs/catalina.out.
> >
> >Any insights?
>
> Session expiration is performed by the background processing thread. What
> is that thread doing (a thread dump - well several over time - will tell
> you).
>
> Fix the session expiration issue and the GC issue will be solved. Note the
> JVM may well not perform a full GC unless it absolutely has to. In fact,
> the JVM may not perform any GC if it doesn't have to.
>
> Mark
>
>
>


-- 
Jeff MAURY


"Legacy code" often differs from its suggested alternative by actually
working and scaling.
 - Bjarne Stroustrup

http://www.jeffmaury.com
http://riadiscuss.jeffmaury.com
http://www.twitter.com/jeffmaury

Re: high CPU usage on tomcat 7

Posted by Mark Thomas <ma...@apache.org>.

Kirill Kireyev <ki...@instagrok.com> wrote:

>Thanks for all your thorough advice Shanti! (and everyone else)
>
>Here are my findings so far:
>0) My servlet sessions store a large amount (~10s of MB) of data in RAM.
>
>This is by design, to optimize performance. I can also have ~3K active 
>sessions at any one time. Hence a large heap size.
>1) When I (1) manually expire inactive sessions through Tomcat web 
>interface and (2) manually hit "Perform GC" through jvisualvm console 
>attached to the tomcat process, everything works great, and the memory 
>is successfully reclaimed, and the used heap size drops back to what
>it 
>should be when the application initializes.
>2) However, this doesn't seem to work automatically. More specifically:
>     a) Sessions are not expiring without manually doing it (I can see 
>the number growing in the Tomcat web interface). Even though my 
>conf/web.xml says:
>     <session-config>
>         <session-timeout>20</session-timeout>
>     </session-config>
>    b) A full garbage collection is not being performed unless I do it 
>manually. I'm attaching the GC records from my logs/catalina.out.
>
>Any insights?

Session expiration is performed by the background processing thread. What is that thread doing (a thread dump - well several over time - will tell you).

Fix the session expiration issue and the GC issue will be solved. Note the JVM may well not perform a full GC unless it absolutely has to. In fact, the JVM may not perform any GC if it doesn't have to.

Mark
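
Mark's suggestion (several dumps over time, focused on the background
processing thread) can be captured with a short loop; if the
ContainerBackgroundProcessor stack shows the same application frames in
every sample, that is where session expiration is stuck.  A sketch, with
the pgrep pattern, sample count, and interval as assumptions:

--------check_background_thread.sh (illustrative sketch)--------
#!/bin/sh
PID=`pgrep -f org.apache.catalina.startup.Bootstrap | head -1`
for i in 1 2 3 4 5
do
    # Keep the thread header plus the next 15 stack frames
    jstack $PID | grep -A 15 ContainerBackgroundProcessor > bg_thread_$i.txt
    sleep 30
done
----------------------------------------------------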



Re: high CPU usage on tomcat 7

Posted by Kirill Kireyev <ki...@instagrok.com>.
Thanks for all your thorough advice Shanti! (and everyone else)

Here are my findings so far:
0) My servlet sessions store a large amount (~10s of MB) of data in RAM. 
This is by design, to optimize performance. I can also have ~3K active 
sessions at any one time. Hence a large heap size.
1) When I (1) manually expire inactive sessions through Tomcat web 
interface and (2) manually hit "Perform GC" through jvisualvm console 
attached to the tomcat process, everything works great, and the memory 
is successfully reclaimed, and the used heap size drops back to what it 
should be when the application initializes.
2) However, this doesn't seem to work automatically. More specifically:
     a) Sessions are not expiring without manually doing it (I can see 
the number growing in the Tomcat web interface). Even though my 
conf/web.xml says:
     <session-config>
         <session-timeout>20</session-timeout>
     </session-config>
     b) A full garbage collection is not being performed unless I do it 
manually (see the jstat sketch below). I'm attaching the GC records from 
my logs/catalina.out.

Any insights?

Thanks a lot!
Kirill
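
One way to check point (b) without attaching a console is jstat, which
samples the collector counters in place.  In its -gcutil output, O is
old-generation occupancy (percent) and FGC is the cumulative full-GC
count; an O column that climbs while FGC never moves means the JVM has
simply not been forced to collect yet.  A sketch (5000 ms sampling
interval; the pgrep pattern is an assumption):

--------watch_gc.sh (illustrative sketch)--------
#!/bin/sh
PID=`pgrep -f org.apache.catalina.startup.Bootstrap | head -1`
jstat -gcutil $PID 5000
----------------------------------------------------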


On 10/5/12 7:40 AM, Shanti Suresh wrote:
> Hi Kirill,
>
> Do you happen to have a QA or a test environment by any chance?
>
> 25GB is very big for a JVM heap size.
>
> Some questions I have are:
> (1) What does the application do?   How many concurrent users do you have?
> (2) What was the heap set at before you increased it to 25GB?
> (3) Why did you think you needed 25GB?  Is this just an arbitrary size
> because your machine has 32GB RAM total and you wanted to allocate out the
> JVM size to maximum available?
> (4) What symptoms did you face before you increased the heap to 25 GB?
> (5) What size are the heapdumps that the earlier script generated?  It
> would be a challenge to troubleshoot a heapdump of 20 - 25 GB.
>
> Suggestions for diagnostic steps:
> (1) The first step, I think, is to look at the JConsole memory use graph
> after application restart and see if the memory is rising linearly
> throughout the application's use.  You seem to have a memory leak like
> Charles pointed out.
> (2) Restart QA.  Bring up JConsole.  Have only one user use the system.
> Study JConsole Overview panel - Memory, CPU, threads, #loaded classes.
> (3) Take a heapdump.
> (4) Have two users use QA.  Observe graphs.  Take heapdump.
> (5) Have three users use QA.  Observe graphs.  Take heapdump.
> (6) Open each heapdump using Eclipse Memory Analyzer.  MAT has useful
> features.  One of them is to generate a leak suspect report.  By just
> viewing the available graphs, you'll get an idea of what classes are
> increasing in size and not being reclaimed.
>
> Others may have other useful tips.
>
> Thanks.
>
>                             -Shanti
>
> On Thu, Oct 4, 2012 at 2:08 PM, Jeff MAURY <je...@gmail.com> wrote:
>
>> On Oct 4, 2012, at 14:38, "Caldarale, Charles R" <Ch...@unisys.com>
>> wrote:
>>>> From: Kirill Kireyev [mailto:kirill@instagrok.com]
>>>> Subject: Re: high CPU usage on tomcat 7
>>>> Perhaps what I need is to have the JVM do garbage collections more
>>>> frequently, so that they don't become huge CPU-hogging ordeals
>>>> when they do happen.
>>> That's not how it works.  The amount of time spent in a GC operation is
>> dependent almost entirely on the number of live objects in the heap, not
>> the amount of space being reclaimed.  Having a smaller heap might improve
>> your cache hit rate, but it will have little effect on the time used by GC.
>>> It is highly likely that your webapps have a memory leak.  You need to
>> use a profiler and find out what's eating up all that memory.  If you set
>> the heap size to a smaller value, you should be able to provoke the
>> situation earlier, but it certainly won't fix it.
>> +1
>>
>> Jeff
>>>   - Chuck
>>>
>>>
>>>
>>>


-- 
*Kirill Kireyev, PhD*
Founder/CTO instaGrok.com <http://www.instagrok.com>
kirill@instagrok.com
Twitter: @instaGrok <http://twitter.com/InstaGrok>
FB: facebook.com/instagrok <http://www.facebook.com/instagrok>
<http://www.instagrok.com>

Re: high CPU usage on tomcat 7

Posted by Shanti Suresh <sh...@umich.edu>.
Hi Kirill,

Do you happen to have a QA or a test environment by any chance?

25GB is very big for a JVM heap size.

Some questions I have are:
(1) What does the application do?   How many concurrent users do you have?
(2) What was the heap set at before you increased it to 25GB?
(3) Why did you think you needed 25GB?  Is this just an arbitrary size
because your machine has 32GB RAM total and you wanted to allocate the
maximum available to the JVM?
(4) What symptoms did you face before you increased the heap to 25 GB?
(5) What size are the heapdumps that the earlier script generated?  It
would be a challenge to troubleshoot a heapdump of 20 - 25 GB.

Suggestions for diagnostic steps:
(1) The first step, I think, is to look at the JConsole memory use graph
after application restart and see if the memory is rising linearly
throughout the application's use.  You seem to have a memory leak like
Charles pointed out.
(2) Restart QA.  Bring up JConsole.  Have only one user use the system.
Study JConsole Overview panel - Memory, CPU, threads, #loaded classes.
(3) Take a heapdump (see the jmap sketch after this list).
(4) Have two users use QA.  Observe graphs.  Take heapdump.
(5) Have three users use QA.  Observe graphs.  Take heapdump.
(6) Open each heapdump using Eclipse Memory Analyzer.  MAT has useful
features.  One of them is to generate a leak suspect report.  By just
viewing the available graphs, you'll get an idea of what classes are
increasing in size and not being reclaimed.
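
A minimal sketch of taking each heapdump from the command line (the pid
lookup and file path here are just illustrative; adjust to your setup):

--------take_heapdump.sh------
#!/bin/sh
# Assumes a single Tomcat instance; find its pid via the Bootstrap class
TOMCAT_PID=`pgrep -f org.apache.catalina.startup.Bootstrap`
# Write a binary heapdump in a format Eclipse MAT can open
/usr/bin/jmap -dump:format=b,file=/tmp/heap-`date +%s`.hprof $TOMCAT_PID
-------------------------------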

Others may have other useful tips.

Thanks.

                           -Shanti

On Thu, Oct 4, 2012 at 2:08 PM, Jeff MAURY <je...@gmail.com> wrote:

> On 4 Oct 2012 14:38, "Caldarale, Charles R" <Ch...@unisys.com> wrote:
> >
> > > From: Kirill Kireyev [mailto:kirill@instagrok.com]
> > > Subject: Re: high CPU usage on tomcat 7
> >
> > > Perhaps what I need is to have the JVM do garbage collections more
> > > frequently, so that they don't become huge CPU-hogging ordeals
> > > when they do happen.
> >
> > That's not how it works.  The amount of time spent in a GC operation is
> dependent almost entirely on the number of live objects in the heap, not
> the amount of space being reclaimed.  Having a smaller heap might improve
> your cache hit rate, but it will have little effect on the time used by GC.
> >
> > It is highly likely that your webapps have a memory leak.  You need to
> use a profiler and find out what's eating up all that memory.  If you set
> the heap size to a smaller value, you should be able to provoke the
> situation earlier, but it certainly won't fix it.
> +1
>
> Jeff
> >
> >  - Chuck
> >
> >
> > THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
> MATERIAL and is thus for use only by the intended recipient. If you
> received this in error, please contact the sender and delete the e-mail and
> its attachments from all computers.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> > For additional commands, e-mail: users-help@tomcat.apache.org
> >
>

RE: high CPU usage on tomcat 7

Posted by Jeff MAURY <je...@gmail.com>.
On 4 Oct 2012 14:38, "Caldarale, Charles R" <Ch...@unisys.com> wrote:
>
> > From: Kirill Kireyev [mailto:kirill@instagrok.com]
> > Subject: Re: high CPU usage on tomcat 7
>
> > Perhaps what I need is to have the JVM do garbage collections more
> > frequently, so that they don't become huge CPU-hogging ordeals
> > when they do happen.
>
> That's not how it works.  The amount of time spent in a GC operation is
dependent almost entirely on the number of live objects in the heap, not
the amount of space being reclaimed.  Having a smaller heap might improve
your cache hit rate, but it will have little effect on the time used by GC.
>
> It is highly likely that your webapps have a memory leak.  You need to
use a profiler and find out what's eating up all that memory.  If you set
the heap size to a smaller value, you should be able to provoke the
situation earlier, but it certainly won't fix it.
+1

Jeff
>
>  - Chuck
>
>
> THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY
MATERIAL and is thus for use only by the intended recipient. If you
received this in error, please contact the sender and delete the e-mail and
its attachments from all computers.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>

RE: high CPU usage on tomcat 7

Posted by "Caldarale, Charles R" <Ch...@unisys.com>.
> From: Kirill Kireyev [mailto:kirill@instagrok.com] 
> Subject: Re: high CPU usage on tomcat 7

> Perhaps what I need is to have the JVM do garbage collections more 
> frequently, so that they don't become huge CPU-hogging ordeals
> when they do happen.

That's not how it works.  The amount of time spent in a GC operation is dependent almost entirely on the number of live objects in the heap, not the amount of space being reclaimed.  Having a smaller heap might improve your cache hit rate, but it will have little effect on the time used by GC.

It is highly likely that your webapps have a memory leak.  You need to use a profiler and find out what's eating up all that memory.  If you set the heap size to a smaller value, you should be able to provoke the situation earlier, but it certainly won't fix it.
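
For example (values illustrative only; the idea is to cap the heap well
below 26GB and have the JVM record a dump automatically when the
OutOfMemoryError finally hits):

JAVA_OPTS="$JAVA_OPTS -Xmx4g -XX:+HeapDumpOnOutOfMemoryError \
    -XX:HeapDumpPath=/tmp"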

 - Chuck 


THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: high CPU usage on tomcat 7

Posted by Kirill Kireyev <ki...@instagrok.com>.
Perhaps what I need is to have the JVM do garbage collections more 
frequently, so that they don't become huge CPU-hogging ordeals when 
they do happen. Maybe setting a smaller heap size will accomplish this, 
or maybe there are GC options for the JVM that make GCs more frequent. I will 
read up on garbage collection tuning tomorrow:
http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html
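
In the meantime, GC logging should confirm whether the high-CPU periods
line up with full collections before I change anything. A sketch using
the standard HotSpot flags (the log path is just an example):

JAVA_OPTS="$JAVA_OPTS -verbose:gc -XX:+PrintGCDetails \
    -XX:+PrintGCTimeStamps -Xloggc:/opt/apache-tomcat-7.0.30/logs/gc.log"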

-Kirill

On 10/3/12 9:20 PM, Kirill Kireyev wrote:
> Hi Shanti!
>
> The heap size is because I start Tomcat with the -Xmx26000M option that I 
> added. Here's the full command line:
> /opt/jdk1.7.0_06//bin/java 
> -Djava.util.logging.config.file=/opt/apache-tomcat-7.0.30/conf/logging.properties 
> -Xmx26000M 
> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager 
> -Djava.endorsed.dirs=/opt/apache-tomcat-7.0.30/endorsed -classpath 
> /opt/apache-tomcat-7.0.30/bin/bootstrap.jar:/opt/apache-tomcat-7.0.30/bin/tomcat-juli.jar 
> -Dcatalina.base=/opt/apache-tomcat-7.0.30 
> -Dcatalina.home=/opt/apache-tomcat-7.0.30 
> -Djava.io.tmpdir=/opt/apache-tomcat-7.0.30/temp 
> org.apache.catalina.startup.Bootstrap start
>
> I will look into your other suggestions and reply shortly.
>
> Thanks,
> Kirill
>
> On 10/3/12 9:07 AM, Shanti Suresh wrote:
>> Hi Kirill,
>>
>> I am amazed at the size of the Java Heap of 26 GB.  I didn't expect 
>> that big a size.  Before I suggest anything else, please take a look 
>> at the filesystem space on "/tmp".  The script creates a heapdump 
>> every time it runs as well.  So please make sure that "/tmp/" has 
>> sufficient storage.  Otherwise it may bring your service down.
>>
>> (1) Could you please paste your JVM command-line arguments here?
>> (2) Also, did you have a chance to observe the JConsole graphs?   I 
>> have attached a sample JConsole graph here.
>> (3) What JDK version are you using?
>> (4) GC for a Heap of that size may need some tuning.
>>
>> You may start JConsole as follows on Linux/Unix.
>> a. Place the following script in your home-directory on the Tomcat 
>> app-server machine.
>> b. Make sure that you have enabled Tomcat to allow JMX access.
>> c. Then log in from a remote laptop/desktop into the server (using 
>> ssh), and invoke this script from your home-directory:
>> --------start_jconsole.sh------
>> DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 
>> 'm#localhost:(\d+).0# && print $1'`
>> echo $DISPLAY_NUMBER
>> XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
>> /usr/bin/jconsole 
>> -J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java/lib/tools.jar
>> -------------------------------
>>
>>
>> [image: sample JConsole graph]
>>
>> Thanks.
>>
>>                          -Shanti
>>
>>     On Tue, Oct 2, 2012 at 7:43 PM, Kirill Kireyev <kirill@instagrok.com> wrote:
>>
>>     Hi Shanti! Thanks for your script! I ran it (with minor
>>     modifications), and it turns out that the garbage collection
>>     threads are indeed taking up the majority of CPU (see below).
>>     What would you recommend I do to tune/troubleshoot GC further? My
>>     server has 32G of RAM, JVM has heap size of 26G
>>
>>     Thanks,
>>     Kirill
>>
>>     PID    CPU    Mem    JStack Info
>>     0x253d    80    90.7    "GC task thread#3 (ParallelGC)" prio=10
>>     tid=0x0000000001938800 nid=0x253d runnable
>>     0x2540    76    90.7    "GC task thread#6 (ParallelGC)" prio=10
>>     tid=0x000000000193e000 nid=0x2540 runnable
>>     0x253b    74    90.7    "GC task thread#1 (ParallelGC)" prio=10
>>     tid=0x0000000001934800 nid=0x253b runnable
>>     0x253e    74    90.7    "GC task thread#4 (ParallelGC)" prio=10
>>     tid=0x000000000193a000 nid=0x253e runnable
>>     0x2541    74    90.7    "GC task thread#7 (ParallelGC)" prio=10
>>     tid=0x000000000193f800 nid=0x2541 runnable
>>     0x253c    66    90.7    "GC task thread#2 (ParallelGC)" prio=10
>>     tid=0x0000000001936800 nid=0x253c runnable
>>     0x253a    47    90.7    "GC task thread#0 (ParallelGC)" prio=10
>>     tid=0x0000000001933000 nid=0x253a runnable
>>     0x253f    39    90.7    "GC task thread#5 (ParallelGC)" prio=10
>>     tid=0x000000000193c000 nid=0x253f runnable
>>     0x2542    19    90.7    "VM Thread" prio=10
>>     tid=0x00000000019b6000 nid=0x2542 runnable
>>     0x2539    0    90.7    "main" prio=10 tid=0x0000000001925000
>>     nid=0x2539 runnable [0x00007fee9d444000]
>>     0x2543    0    90.7    "Reference Handler" daemon prio=10
>>     tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000]
>>     0x2544    0    90.7    "Finalizer" daemon prio=10
>>     tid=0x00000000019c5800 nid=0x2544 in Object.wait()
>>     [0x00007fee4ffd7000]
>>     0x2545    0    90.7    "Signal Dispatcher" daemon prio=10
>>     tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000]
>>     0x2546    0    90.7    "C2 CompilerThread0" daemon prio=10
>>     tid=0x00007fee48001000 nid=0x2546 waiting on condition
>>     [0x0000000000000000]
>>     0x2547    0    90.7    "C2 CompilerThread1" daemon prio=10
>>     tid=0x00007fee48004000 nid=0x2547 waiting on condition
>>     [0x0000000000000000]
>>     0x2548    0    90.7    "Service Thread" daemon prio=10
>>     tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000]
>>     0x2549    0    90.7    "VM Periodic Task Thread" prio=10
>>     tid=0x0000000001a1f800 nid=0x2549 waiting on condition
>>     0x254a    0    90.7    "GC Daemon" daemon prio=10
>>     tid=0x00007fee404b3000 nid=0x254a in Object.wait()
>>     [0x00007fee4e1bd000]
>>     0x254d    0    90.7
>>     "ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon
>>     prio=10 tid=0x00007fee40389800 nid=0x254d runnable
>>     [0x00007fee4d820000]
>>     0x254e    0    90.7    "http-apr-8080-Poller-0" daemon prio=10
>>     tid=0x00007fee40378000 nid=0x254e in Object.wait()
>>     [0x00007fee4d661000]
>>     0x254f    0    90.7    "http-apr-8080-CometPoller-0" daemon
>>     prio=10 tid=0x00007fee40395000 nid=0x254f in Object.wait()
>>     [0x00007fee4d4a2000]
>>     0x2550    0    90.7    "http-apr-8080-Sendfile-0" daemon prio=10
>>     tid=0x00007fee40397000 nid=0x2550 in Object.wait()
>>     [0x00007fee4d2b0000]
>>     0x2551    0    90.7    "http-apr-8080-Acceptor-0" daemon prio=10
>>     tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000]
>>     0x2552    0    90.7    "http-apr-8080-AsyncTimeout" daemon
>>     prio=10 tid=0x00007fee4072e000 nid=0x2552
>>     sleeping[0x00007fee4d0ac000]
>>     0x2553    0    90.7    "http-apr-8443-Poller-0" daemon prio=10
>>     tid=0x00007fee40730000 nid=0x2553 in Object.wait()
>>     [0x00007fee4cfab000]
>>     0x2554    0    90.7    "http-apr-8443-CometPoller-0" daemon
>>     prio=10 tid=0x00007fee40732000 nid=0x2554 in Object.wait()
>>     [0x00007fee4cdee000]
>>     0x2555    0    90.7    "http-apr-8443-Acceptor-0" daemon prio=10
>>     tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000]
>>     0x2556    0    90.7    "http-apr-8443-AsyncTimeout" daemon
>>     prio=10 tid=0x00007fee4073b800 nid=0x2556
>>     sleeping[0x00007fee4cbec000]
>>     0x2557    0    90.7    "ajp-apr-8009-Poller-0" daemon prio=10
>>     tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000]
>>     0x2558    0    90.7    "ajp-apr-8009-CometPoller-0" daemon
>>     prio=10 tid=0x00007fee4073f800 nid=0x2558 in Object.wait()
>>     [0x00007fee4c872000]
>>     0x2559    0    90.7    "ajp-apr-8009-Acceptor-0" daemon prio=10
>>     tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000]
>>     0x255a    0    90.7    "ajp-apr-8009-AsyncTimeout" daemon prio=10
>>     tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000]
>>     0x255f    0    90.7    "ajp-apr-8009-exec-5" daemon prio=10
>>     tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000]
>>     0x2560    0    90.7    "ajp-apr-8009-exec-6" daemon prio=10
>>     tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000]
>>     0x2561    0    90.7    "ajp-apr-8009-exec-7" daemon prio=10
>>     tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000]
>>     0x2562    0    90.7    "ajp-apr-8009-exec-8" daemon prio=10
>>     tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000]
>>     0x2563    0    90.7    "ajp-apr-8009-exec-9" daemon prio=10
>>     tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000]
>>     0x2564    0    90.7    "ajp-apr-8009-exec-10" daemon prio=10
>>     tid=0x0000000001cfe800 nid=0x2564 waiting on condition
>>     [0x00007fee47dfa000]
>>     0x2586    0    90.7    "ajp-apr-8009-exec-12" daemon prio=10
>>     tid=0x0000000001e85000 nid=0x2586 waiting on condition
>>     [0x00007fee47bf6000]
>>     0x2587    0    90.7    "pool-1-thread-1" prio=10
>>     tid=0x0000000001fe4000 nid=0x2587 waiting on condition
>>     [0x00007fee47af3000]
>>     0x2635    0    90.7    "ajp-apr-8009-exec-15" daemon prio=10
>>     tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000]
>>     0x2637    0    90.7    "ajp-apr-8009-exec-17" daemon prio=10
>>     tid=0x0000000002392800 nid=0x2637 waiting on condition
>>     [0x00007fee38790000]
>>     0x2638    0    90.7    "ajp-apr-8009-exec-18" daemon prio=10
>>     tid=0x0000000002395000 nid=0x2638 waiting on condition
>>     [0x00007fee3868f000]
>>     0x263a    0    90.7    "ajp-apr-8009-exec-20" daemon prio=10
>>     tid=0x0000000002312800 nid=0x263a waiting on condition
>>     [0x00007fee3848b000]
>>     0x263d    0    90.7    "ajp-apr-8009-exec-23" daemon prio=10
>>     tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000]
>>     0x50b    0    90.7    "http-apr-8443-exec-1" daemon prio=10
>>     tid=0x0000000002209000 nid=0x50b waiting on condition
>>     [0x00007fee232b1000]
>>     0x50c    0    90.7    "http-apr-8443-exec-2" daemon prio=10
>>     tid=0x000000000213c000 nid=0x50c waiting on condition
>>     [0x00007fee233b2000]
>>     0x50d    0    90.7    "http-apr-8443-exec-3" daemon prio=10
>>     tid=0x000000000239e000 nid=0x50d waiting on condition
>>     [0x00007fee22baa000]
>>     0x50e    0    90.7    "http-apr-8443-exec-4" daemon prio=10
>>     tid=0x0000000002aeb800 nid=0x50e waiting on condition
>>     [0x00007fee3939d000]
>>     0x50f    0    90.7    "http-apr-8443-exec-5" daemon prio=10
>>     tid=0x0000000002ad3000 nid=0x50f waiting on condition
>>     [0x00007fee38f99000]
>>     0x510    0    90.7    "http-apr-8443-exec-6" daemon prio=10
>>     tid=0x00000000025a1800 nid=0x510 waiting on condition
>>     [0x00007fee397a1000]
>>     0x511    0    90.7    "http-apr-8443-exec-7" daemon prio=10
>>     tid=0x0000000002188800 nid=0x511 waiting on condition
>>     [0x00007fee45db8000]
>>     0x512    0    90.7    "http-apr-8443-exec-8" daemon prio=10
>>     tid=0x00000000023b3800 nid=0x512 waiting on condition
>>     [0x00007fee44aa5000]
>>     0x513    0    90.7    "http-apr-8443-exec-9" daemon prio=10
>>     tid=0x0000000002164800 nid=0x513 waiting on condition
>>     [0x00007fee3dae4000]
>>     0x514    0    90.7    "http-apr-8443-exec-10" daemon prio=10
>>     tid=0x0000000002af6800 nid=0x514 waiting on condition
>>     [0x00007fee279f8000]
>>     0x19a7    0    90.7    "http-apr-8080-exec-1" daemon prio=10
>>     tid=0x0000000001ac2800 nid=0x19a7 waiting on condition
>>     [0x00007fee3747b000]
>>     0x19a8    0    90.7    "http-apr-8080-exec-2" daemon prio=10
>>     tid=0x000000000257d000 nid=0x19a8 waiting on condition
>>     [0x00007fee3767d000]
>>     0x28b0    0    90.7    "http-apr-8080-exec-3" daemon prio=10
>>     tid=0x000000000274c000 nid=0x28b0 waiting on condition
>>     [0x00007fee3f8f7000]
>>     0x28b1    0    90.7    "http-apr-8080-exec-4" daemon prio=10
>>     tid=0x00000000021c8800 nid=0x28b1 waiting on condition
>>     [0x00007fee3adb7000]
>>     0x46e7    0    90.7    "Keep-Alive-Timer" daemon prio=10
>>     tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000]
>>     0x29c5    0    90.7    "Keep-Alive-SocketCleaner" daemon prio=10
>>     tid=0x000000000242b000 nid=0x29c5 in Object.wait()
>>     [0x00007fee1f877000]
>>     0x6ef8    0    90.7    "http-apr-8080-exec-5" daemon prio=10
>>     tid=0x0000000002a56000 nid=0x6ef8 waiting on condition
>>     [0x00007fee454af000]
>>     0x6ef9    0    90.7    "http-apr-8080-exec-6" daemon prio=10
>>     tid=0x00000000022de800 nid=0x6ef9 waiting on condition
>>     [0x00007fee45ab5000]
>>     0x3d38    0    90.7    "http-apr-8080-exec-7" daemon prio=10
>>     tid=0x0000000002233000 nid=0x3d38 waiting on condition
>>     [0x00007fee37980000]
>>     0x3d39    0    90.7    "http-apr-8080-exec-8" daemon prio=10
>>     tid=0x00000000021f7800 nid=0x3d39 waiting on condition
>>     [0x00007fee38087000]
>>     0x64a3    0    90.7    "ajp-apr-8009-exec-36" daemon prio=10
>>     tid=0x00000000022dd000 nid=0x64a3 waiting on condition
>>     [0x00007fee35158000]
>>     0x64a6    0    90.7    "ajp-apr-8009-exec-39" daemon prio=10
>>     tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000]
>>     0x64a7    0    90.7    "ajp-apr-8009-exec-40" daemon prio=10
>>     tid=0x00000000029c0800 nid=0x64a7 waiting on condition
>>     [0x00007fee24fce000]
>>     0x64ab    0    90.7    "ajp-apr-8009-exec-44" daemon prio=10
>>     tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000]
>>     0x64ac    0    90.7    "ajp-apr-8009-exec-45" daemon prio=10
>>     tid=0x0000000002555800 nid=0x64ac waiting on condition
>>     [0x00007fee256d5000]
>>     0x64af    0    90.7    "ajp-apr-8009-exec-48" daemon prio=10
>>     tid=0x0000000001f8a800 nid=0x64af waiting on condition
>>     [0x00007fee24ccb000]
>>     0x64b0    0    90.7    "ajp-apr-8009-exec-49" daemon prio=10
>>     tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000]
>>     0x64b1    0    90.7    "ajp-apr-8009-exec-50" daemon prio=10
>>     tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000]
>>     0x64b2    0    90.7    "ajp-apr-8009-exec-51" daemon prio=10
>>     tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000]
>>     0x64b3    0    90.7    "ajp-apr-8009-exec-52" daemon prio=10
>>     tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000]
>>     0xf1b    0    90.7    "http-apr-8080-exec-9" daemon prio=10
>>     tid=0x0000000001c9e800 nid=0xf1b waiting on condition
>>     [0x00007fee3929c000]
>>     0xf1c    0    90.7    "http-apr-8080-exec-10" daemon prio=10
>>     tid=0x0000000001e4c000 nid=0xf1c waiting on condition
>>     [0x00007fee3d8e2000]
>>     0x3eda    0    90.7    "ajp-apr-8009-exec-55" daemon prio=10
>>     tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000]
>>     0x3edb    0    90.7    "ajp-apr-8009-exec-56" daemon prio=10
>>     tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000]
>>     0x51c7    0    90.7    "ajp-apr-8009-exec-57" daemon prio=10
>>     tid=0x000000000211c800 nid=0x51c7 waiting on condition
>>     [0x00007fee37c83000]
>>
>>
>>
>>     On 9/27/12 12:17 PM, Shanti Suresh wrote:
>>>     Hi Kirill,
>>>
>>>     Like Mark, Bill and Jeff said, those threads are normal request-processing
>>>     threads.  I have included a script that might help with isolating high CPU
>>>     issues with Tomcat.
>>>
>>>     Also, I think it might be helpful to see how the Java heap is performing as
>>>     well.
>>>     Please bring up Jconsole and let it run over the week.  Inspect the graphs
>>>     for Memory, CPU and threads.  Since you say that high CPU occurs
>>>     intermittently several times during the week and clears itself, I wonder if
>>>     it is somehow related to the garbage collection options you are using for
>>>     the server.  Or it may be a code-related problem.
>>>
>>>     Things to look at may include:
>>>
>>>     (1) Are high CPU times related to Java heap reductions happening at the
>>>     same time?  ==> GC possibly needs tuning
>>>     (2) Are high CPU times related to increase in thread usage?  ==> possible
>>>     livelock in looping code?
>>>     (3) How many network connections come into the Tomcat server during
>>>     high-CPU times?    Possibly overload-related?
>>>
>>>     Here is the script.  I made a couple of small changes, e.g., changing
>>>     the username.  But didn't test it after the change.  During high-CPU times,
>>>     invoke the script a few times, say 30 seconds apart.  And then compare the
>>>     thread-dumps.  I like to use TDA for thread-dump analysis of Tomcat
>>>     thread-dumps.
>>>
>>>     Mark, et al, please feel free to help me refine this script.  I would like
>>>     to have a script to catch STUCK threads too :-)  Let me know if anyone has
>>>     a script already.  Thanks.
>>>
>>>     --------------high_cpu_diagnostics.pl:-----
>>>     #!/usr/bin/perl
>>>     #
>>>
>>>     use Cwd;
>>>
>>>     # Make a dated directory for capturing current diagnostics
>>>     my ($sec,$min,$hour,$mday,$mon,$year,
>>>                $wday,$yday,$isdst) = localtime time;
>>>     $year += 1900;
>>>     $mon += 1;
>>>     my $pwd = cwd();
>>>     my $preview_diag_dir = "/tmp/Preview_Diag.$year-$mon-$mday-$hour:$min:$sec";
>>>     print "$preview_diag_dir\n";
>>>     mkdir $preview_diag_dir, 0755;
>>>     chdir($preview_diag_dir) or die "Can't chdir into $preview_diag_dir $!\n";
>>>
>>>     # Capture Preview thread dump
>>>     my $process_pattern = "preview";
>>>     my $preview_pid = `/usr/bin/pgrep -f $process_pattern`;
>>>     my $login = getpwuid($<) ;
>>>     if (kill 0, $preview_pid){
>>>              # Possible to send a signal to the Preview Tomcat - either
>>>              # "webinf" or "root"
>>>              my $count = kill 3, $preview_pid;
>>>     }else {
>>>              # Not possible to send a signal to the VCM - use "sudo"
>>>              system ("/usr/bin/sudo /bin/kill -3 $preview_pid");
>>>     }
>>>
>>>     # Capture Preview heap dump
>>>     system ("/usr/bin/jmap
>>>     -dump:format=b,file=$preview_diag_dir/preview_heapdump.hprof $preview_pid");
>>>
>>>     # Gather the top threads; keep around for reference on what other threads
>>>     are running
>>>     @top_cmd = ("/usr/bin/top",  "-H", "-n1", "-b");
>>>     # (sorting/filtering of the top output is done in the pipeline below)
>>>     system("@top_cmd 1> top_all_threads.log");
>>>
>>>     # Get your tomcat user's threads, i.e. threads of user, "webinf"
>>>     system('/usr/bin/tail -n+6 top_all_threads.log | /bin/sort -r -n -k "9,9" |
>>>     /bin/grep webinf 1> top_user_webinf_threads.log');
>>>
>>>     # Get the thread dump
>>>     my @output=`/usr/bin/jstack -l ${preview_pid}`;
>>>     open (my $file, '>', 'preview_threaddump.txt') or die "Could not open file:
>>>     $!";
>>>     print $file @output;
>>>     close $file;
>>>
>>>     open LOG, "top_user_webinf_threads.log" or die $!;
>>>     open (STDOUT, "| tee -ai top_cpu_preview_threads.log");
>>>     print "PID\tCPU\tMem\tJStack Info\n";
>>>     while ($l = <LOG>) {
>>>          chop $l;
>>>          $pid = $l;
>>>          $pid =~ s/webinf.*//g;
>>>          $pid =~ s/ *//g;
>>>     ##  Hex PID is available in the Sun HotSpot Stack Trace
>>>          $hex_pid = sprintf("%#x", $pid);
>>>          @values = split(/\s+/, $l);
>>>          $pct = $values[8];
>>>          $mem = $values[9];
>>>     # Debugger breakpoint:
>>>     $DB::single = 1;
>>>
>>>     # Find the Java thread that corresponds to the thread-id from the TOP output
>>>          for my $j (@output) {
>>>              chop $j;
>>>              ($j =~ m/nid=$hex_pid/)  &&  print $hex_pid . "\t" . $pct . "\t" .
>>>     $mem . "\t" .  $j . "\n";
>>>          }
>>>     }
>>>
>>>     close (STDOUT);
>>>
>>>     close LOG;
>>>
>>>
>>>     ------------------end of script --------------------------
>>>
>>>     Thanks.
>>>
>>>                            -Shanti
>>>
>>>
>>>     On Thu, Sep 27, 2012 at 2:11 PM, Bill Miller
>>>     <millebi.subscriptions@gmail.com> wrote:
>>>
>>>>     I agree; we have reproducible instances where PermGen is not set to our
>>>>     requirements on the Tomcat startup parameters and it will cause a "lockup"
>>>>     every time. Do some JMX monitoring and you may discover a memory spike
>>>>     that's killing Tomcat.
>>>>
>>>>     Bill
>>>>     -----Original Message-----
>>>>     From: Jeff MAURY [mailto:jeffmaury@gmail.com]
>>>>     Sent: September-27-2012 2:01 PM
>>>>     To: Tomcat Users List
>>>>     Subject: Re: high CPU usage on tomcat 7
>>>>
>>>>     This is probably due to running out of memory; I have the same problem on
>>>>     my Ubuntu CI machine. Did you monitor your Tomcat with JMX?
>>>>
>>>>     Jeff
>>>>
>>>>
>>>>     ---------------------------------------------------------------------
>>>>     To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>>>>     For additional commands, e-mail: users-help@tomcat.apache.org
>>>>
>>>>
>>
>>
>>     -- 
>>     *Kirill Kireyev, PhD*
>>     Founder/CTO instaGrok.com <http://www.instagrok.com>
>>     kirill@instagrok.com
>>     Twitter: @instaGrok <http://twitter.com/InstaGrok>
>>     FB: facebook.com/instagrok <http://www.facebook.com/instagrok>
>>
>>
>
>
> -- 
> *Kirill Kireyev, PhD*
> Founder/CTO instaGrok.com <http://www.instagrok.com>
> kirill@instagrok.com
> Twitter: @instaGrok <http://twitter.com/InstaGrok>
> FB: facebook.com/instagrok <http://www.facebook.com/instagrok>


-- 
*Kirill Kireyev, PhD*
Founder/CTO instaGrok.com <http://www.instagrok.com>
kirill@instagrok.com
Twitter: @instaGrok <http://twitter.com/InstaGrok>
FB: facebook.com/instagrok <http://www.facebook.com/instagrok>

Re: high CPU usage on tomcat 7

Posted by Kirill Kireyev <ki...@instagrok.com>.
Hi Shanti!

The heap size is because I start Tomcat with the -Xmx26000M option that I 
added. Here's the full command line:
/opt/jdk1.7.0_06//bin/java 
-Djava.util.logging.config.file=/opt/apache-tomcat-7.0.30/conf/logging.properties 
-Xmx26000M 
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager 
-Djava.endorsed.dirs=/opt/apache-tomcat-7.0.30/endorsed -classpath 
/opt/apache-tomcat-7.0.30/bin/bootstrap.jar:/opt/apache-tomcat-7.0.30/bin/tomcat-juli.jar 
-Dcatalina.base=/opt/apache-tomcat-7.0.30 
-Dcatalina.home=/opt/apache-tomcat-7.0.30 
-Djava.io.tmpdir=/opt/apache-tomcat-7.0.30/temp 
org.apache.catalina.startup.Bootstrap start

I will look into your other suggestions and reply shortly.

Thanks,
Kirill

On 10/3/12 9:07 AM, Shanti Suresh wrote:
> Hi Kirill,
>
> I am amazed at the size of the Java Heap of 26 GB.  I didn't expect 
> that big a size.  Before I suggest anything else, please take a look 
> at the filesystem space on "/tmp".  The script creates a heapdump 
> every time it runs as well.  So please make sure that "/tmp/" has 
> sufficient storage.  Otherwise it may bring your service down.
>
> (1) Could you please paste your JVM command-line arguments here?
> (2) Also, did you have a chance to observe the JConsole graphs? I have 
> attached a sample JConsole graph here.
> (3) What JDK version are you using?
> (4) GC for a Heap of that size may need some tuning.
>
> You may start JConsole as follows on Linux/Unix.
> a. Place the following script in your home-directory on the Tomcat 
> app-server machine.
> b. Make sure that you have enabled Tomcat to allow JMX access.
> c. Then log in from a remote laptop/desktop into the server (using 
> ssh), and invoke this script from your home-directory:
> --------start_jconsole.sh------
> DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 
> 'm#localhost:(\d+).0# && print $1'`
> echo $DISPLAY_NUMBER
> XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
> /usr/bin/jconsole 
> -J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java/lib/tools.jar
> -------------------------------
>
>
> [image: sample JConsole graph]
>
> Thanks.
>
>                          -Shanti
>
> On Tue, Oct 2, 2012 at 7:43 PM, Kirill Kireyev <kirill@instagrok.com> wrote:
>
>     Hi Shanti! Thanks for your script! I ran it (with minor
>     modifications), and it turns out that the garbage collection
>     threads are indeed taking up the majority of CPU (see below). What
>     would you recommend I do to tune/troubleshoot GC further? My
>     server has 32G of RAM, JVM has heap size of 26G
>
>     Thanks,
>     Kirill
>
>     PID    CPU    Mem    JStack Info
>     0x253d    80    90.7    "GC task thread#3 (ParallelGC)" prio=10
>     tid=0x0000000001938800 nid=0x253d runnable
>     0x2540    76    90.7    "GC task thread#6 (ParallelGC)" prio=10
>     tid=0x000000000193e000 nid=0x2540 runnable
>     0x253b    74    90.7    "GC task thread#1 (ParallelGC)" prio=10
>     tid=0x0000000001934800 nid=0x253b runnable
>     0x253e    74    90.7    "GC task thread#4 (ParallelGC)" prio=10
>     tid=0x000000000193a000 nid=0x253e runnable
>     0x2541    74    90.7    "GC task thread#7 (ParallelGC)" prio=10
>     tid=0x000000000193f800 nid=0x2541 runnable
>     0x253c    66    90.7    "GC task thread#2 (ParallelGC)" prio=10
>     tid=0x0000000001936800 nid=0x253c runnable
>     0x253a    47    90.7    "GC task thread#0 (ParallelGC)" prio=10
>     tid=0x0000000001933000 nid=0x253a runnable
>     0x253f    39    90.7    "GC task thread#5 (ParallelGC)" prio=10
>     tid=0x000000000193c000 nid=0x253f runnable
>     0x2542    19    90.7    "VM Thread" prio=10 tid=0x00000000019b6000
>     nid=0x2542 runnable
>     0x2539    0    90.7    "main" prio=10 tid=0x0000000001925000
>     nid=0x2539 runnable [0x00007fee9d444000]
>     0x2543    0    90.7    "Reference Handler" daemon prio=10
>     tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000]
>     0x2544    0    90.7    "Finalizer" daemon prio=10
>     tid=0x00000000019c5800 nid=0x2544 in Object.wait()
>     [0x00007fee4ffd7000]
>     0x2545    0    90.7    "Signal Dispatcher" daemon prio=10
>     tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000]
>     0x2546    0    90.7    "C2 CompilerThread0" daemon prio=10
>     tid=0x00007fee48001000 nid=0x2546 waiting on condition
>     [0x0000000000000000]
>     0x2547    0    90.7    "C2 CompilerThread1" daemon prio=10
>     tid=0x00007fee48004000 nid=0x2547 waiting on condition
>     [0x0000000000000000]
>     0x2548    0    90.7    "Service Thread" daemon prio=10
>     tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000]
>     0x2549    0    90.7    "VM Periodic Task Thread" prio=10
>     tid=0x0000000001a1f800 nid=0x2549 waiting on condition
>     0x254a    0    90.7    "GC Daemon" daemon prio=10
>     tid=0x00007fee404b3000 nid=0x254a in Object.wait()
>     [0x00007fee4e1bd000]
>     0x254d    0    90.7
>     "ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon
>     prio=10 tid=0x00007fee40389800 nid=0x254d runnable
>     [0x00007fee4d820000]
>     0x254e    0    90.7    "http-apr-8080-Poller-0" daemon prio=10
>     tid=0x00007fee40378000 nid=0x254e in Object.wait()
>     [0x00007fee4d661000]
>     0x254f    0    90.7    "http-apr-8080-CometPoller-0" daemon
>     prio=10 tid=0x00007fee40395000 nid=0x254f in Object.wait()
>     [0x00007fee4d4a2000]
>     0x2550    0    90.7    "http-apr-8080-Sendfile-0" daemon prio=10
>     tid=0x00007fee40397000 nid=0x2550 in Object.wait()
>     [0x00007fee4d2b0000]
>     0x2551    0    90.7    "http-apr-8080-Acceptor-0" daemon prio=10
>     tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000]
>     0x2552    0    90.7    "http-apr-8080-AsyncTimeout" daemon prio=10
>     tid=0x00007fee4072e000 nid=0x2552 sleeping[0x00007fee4d0ac000]
>     0x2553    0    90.7    "http-apr-8443-Poller-0" daemon prio=10
>     tid=0x00007fee40730000 nid=0x2553 in Object.wait()
>     [0x00007fee4cfab000]
>     0x2554    0    90.7    "http-apr-8443-CometPoller-0" daemon
>     prio=10 tid=0x00007fee40732000 nid=0x2554 in Object.wait()
>     [0x00007fee4cdee000]
>     0x2555    0    90.7    "http-apr-8443-Acceptor-0" daemon prio=10
>     tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000]
>     0x2556    0    90.7    "http-apr-8443-AsyncTimeout" daemon prio=10
>     tid=0x00007fee4073b800 nid=0x2556 sleeping[0x00007fee4cbec000]
>     0x2557    0    90.7    "ajp-apr-8009-Poller-0" daemon prio=10
>     tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000]
>     0x2558    0    90.7    "ajp-apr-8009-CometPoller-0" daemon prio=10
>     tid=0x00007fee4073f800 nid=0x2558 in Object.wait()
>     [0x00007fee4c872000]
>     0x2559    0    90.7    "ajp-apr-8009-Acceptor-0" daemon prio=10
>     tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000]
>     0x255a    0    90.7    "ajp-apr-8009-AsyncTimeout" daemon prio=10
>     tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000]
>     0x255f    0    90.7    "ajp-apr-8009-exec-5" daemon prio=10
>     tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000]
>     0x2560    0    90.7    "ajp-apr-8009-exec-6" daemon prio=10
>     tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000]
>     0x2561    0    90.7    "ajp-apr-8009-exec-7" daemon prio=10
>     tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000]
>     0x2562    0    90.7    "ajp-apr-8009-exec-8" daemon prio=10
>     tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000]
>     0x2563    0    90.7    "ajp-apr-8009-exec-9" daemon prio=10
>     tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000]
>     0x2564    0    90.7    "ajp-apr-8009-exec-10" daemon prio=10
>     tid=0x0000000001cfe800 nid=0x2564 waiting on condition
>     [0x00007fee47dfa000]
>     0x2586    0    90.7    "ajp-apr-8009-exec-12" daemon prio=10
>     tid=0x0000000001e85000 nid=0x2586 waiting on condition
>     [0x00007fee47bf6000]
>     0x2587    0    90.7    "pool-1-thread-1" prio=10
>     tid=0x0000000001fe4000 nid=0x2587 waiting on condition
>     [0x00007fee47af3000]
>     0x2635    0    90.7    "ajp-apr-8009-exec-15" daemon prio=10
>     tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000]
>     0x2637    0    90.7    "ajp-apr-8009-exec-17" daemon prio=10
>     tid=0x0000000002392800 nid=0x2637 waiting on condition
>     [0x00007fee38790000]
>     0x2638    0    90.7    "ajp-apr-8009-exec-18" daemon prio=10
>     tid=0x0000000002395000 nid=0x2638 waiting on condition
>     [0x00007fee3868f000]
>     0x263a    0    90.7    "ajp-apr-8009-exec-20" daemon prio=10
>     tid=0x0000000002312800 nid=0x263a waiting on condition
>     [0x00007fee3848b000]
>     0x263d    0    90.7    "ajp-apr-8009-exec-23" daemon prio=10
>     tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000]
>     0x50b    0    90.7    "http-apr-8443-exec-1" daemon prio=10
>     tid=0x0000000002209000 nid=0x50b waiting on condition
>     [0x00007fee232b1000]
>     0x50c    0    90.7    "http-apr-8443-exec-2" daemon prio=10
>     tid=0x000000000213c000 nid=0x50c waiting on condition
>     [0x00007fee233b2000]
>     0x50d    0    90.7    "http-apr-8443-exec-3" daemon prio=10
>     tid=0x000000000239e000 nid=0x50d waiting on condition
>     [0x00007fee22baa000]
>     0x50e    0    90.7    "http-apr-8443-exec-4" daemon prio=10
>     tid=0x0000000002aeb800 nid=0x50e waiting on condition
>     [0x00007fee3939d000]
>     0x50f    0    90.7    "http-apr-8443-exec-5" daemon prio=10
>     tid=0x0000000002ad3000 nid=0x50f waiting on condition
>     [0x00007fee38f99000]
>     0x510    0    90.7    "http-apr-8443-exec-6" daemon prio=10
>     tid=0x00000000025a1800 nid=0x510 waiting on condition
>     [0x00007fee397a1000]
>     0x511    0    90.7    "http-apr-8443-exec-7" daemon prio=10
>     tid=0x0000000002188800 nid=0x511 waiting on condition
>     [0x00007fee45db8000]
>     0x512    0    90.7    "http-apr-8443-exec-8" daemon prio=10
>     tid=0x00000000023b3800 nid=0x512 waiting on condition
>     [0x00007fee44aa5000]
>     0x513    0    90.7    "http-apr-8443-exec-9" daemon prio=10
>     tid=0x0000000002164800 nid=0x513 waiting on condition
>     [0x00007fee3dae4000]
>     0x514    0    90.7    "http-apr-8443-exec-10" daemon prio=10
>     tid=0x0000000002af6800 nid=0x514 waiting on condition
>     [0x00007fee279f8000]
>     0x19a7    0    90.7    "http-apr-8080-exec-1" daemon prio=10
>     tid=0x0000000001ac2800 nid=0x19a7 waiting on condition
>     [0x00007fee3747b000]
>     0x19a8    0    90.7    "http-apr-8080-exec-2" daemon prio=10
>     tid=0x000000000257d000 nid=0x19a8 waiting on condition
>     [0x00007fee3767d000]
>     0x28b0    0    90.7    "http-apr-8080-exec-3" daemon prio=10
>     tid=0x000000000274c000 nid=0x28b0 waiting on condition
>     [0x00007fee3f8f7000]
>     0x28b1    0    90.7    "http-apr-8080-exec-4" daemon prio=10
>     tid=0x00000000021c8800 nid=0x28b1 waiting on condition
>     [0x00007fee3adb7000]
>     0x46e7    0    90.7    "Keep-Alive-Timer" daemon prio=10
>     tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000]
>     0x29c5    0    90.7    "Keep-Alive-SocketCleaner" daemon prio=10
>     tid=0x000000000242b000 nid=0x29c5 in Object.wait()
>     [0x00007fee1f877000]
>     0x6ef8    0    90.7    "http-apr-8080-exec-5" daemon prio=10
>     tid=0x0000000002a56000 nid=0x6ef8 waiting on condition
>     [0x00007fee454af000]
>     0x6ef9    0    90.7    "http-apr-8080-exec-6" daemon prio=10
>     tid=0x00000000022de800 nid=0x6ef9 waiting on condition
>     [0x00007fee45ab5000]
>     0x3d38    0    90.7    "http-apr-8080-exec-7" daemon prio=10
>     tid=0x0000000002233000 nid=0x3d38 waiting on condition
>     [0x00007fee37980000]
>     0x3d39    0    90.7    "http-apr-8080-exec-8" daemon prio=10
>     tid=0x00000000021f7800 nid=0x3d39 waiting on condition
>     [0x00007fee38087000]
>     0x64a3    0    90.7    "ajp-apr-8009-exec-36" daemon prio=10
>     tid=0x00000000022dd000 nid=0x64a3 waiting on condition
>     [0x00007fee35158000]
>     0x64a6    0    90.7    "ajp-apr-8009-exec-39" daemon prio=10
>     tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000]
>     0x64a7    0    90.7    "ajp-apr-8009-exec-40" daemon prio=10
>     tid=0x00000000029c0800 nid=0x64a7 waiting on condition
>     [0x00007fee24fce000]
>     0x64ab    0    90.7    "ajp-apr-8009-exec-44" daemon prio=10
>     tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000]
>     0x64ac    0    90.7    "ajp-apr-8009-exec-45" daemon prio=10
>     tid=0x0000000002555800 nid=0x64ac waiting on condition
>     [0x00007fee256d5000]
>     0x64af    0    90.7    "ajp-apr-8009-exec-48" daemon prio=10
>     tid=0x0000000001f8a800 nid=0x64af waiting on condition
>     [0x00007fee24ccb000]
>     0x64b0    0    90.7    "ajp-apr-8009-exec-49" daemon prio=10
>     tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000]
>     0x64b1    0    90.7    "ajp-apr-8009-exec-50" daemon prio=10
>     tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000]
>     0x64b2    0    90.7    "ajp-apr-8009-exec-51" daemon prio=10
>     tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000]
>     0x64b3    0    90.7    "ajp-apr-8009-exec-52" daemon prio=10
>     tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000]
>     0xf1b    0    90.7    "http-apr-8080-exec-9" daemon prio=10
>     tid=0x0000000001c9e800 nid=0xf1b waiting on condition
>     [0x00007fee3929c000]
>     0xf1c    0    90.7    "http-apr-8080-exec-10" daemon prio=10
>     tid=0x0000000001e4c000 nid=0xf1c waiting on condition
>     [0x00007fee3d8e2000]
>     0x3eda    0    90.7    "ajp-apr-8009-exec-55" daemon prio=10
>     tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000]
>     0x3edb    0    90.7    "ajp-apr-8009-exec-56" daemon prio=10
>     tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000]
>     0x51c7    0    90.7    "ajp-apr-8009-exec-57" daemon prio=10
>     tid=0x000000000211c800 nid=0x51c7 waiting on condition
>     [0x00007fee37c83000]
>
>
>
>     On 9/27/12 12:17 PM, Shanti Suresh wrote:
>>     Hi Kirill,
>>
>>     Like Mark, Bill and Jeff said, those threads are normal request-processing
>>     threads.  I have included a script that might help with isolating high CPU
>>     issues with Tomcat.
>>
>>     Also, I think it might be helpful to see how the Java heap is performing as
>>     well.
>>     Please bring up Jconsole and let it run over the week.  Inspect the graphs
>>     for Memory, CPU and threads.  Since you say that high CPU occurs
>>     intermittently several times during the week and clears itself, I wonder if
>>     it is somehow related to the garbage collection options you are using for
>>     the server.  Or it may be a code-related problem.
>>
>>     Things to look at may include:
>>
>>     (1) Are high CPU times related to Java heap reductions happening at the
>>     same time?  ==> GC possibly needs tuning
>>     (2) Are high CPU times related to increase in thread usage?  ==> possible
>>     livelock in looping code?
>>     (3) How many network connections come into the Tomcat server during
>>     high-CPU times?    Possibly overload-related?
>>
>>     Here is the script.  I made a couple of small changes, e.g., changing
>>     the username.  But didn't test it after the change.  During high-CPU times,
>>     invoke the script a few times, say 30 seconds apart.  And then compare the
>>     thread-dumps.  I like to use TDA for thread-dump analysis of Tomcat
>>     thread-dumps.
>>
>>     Mark, et al, please feel free to help me refine this script.  I would like
>>     to have a script to catch STUCK threads too :-)  Let me know if anyone has
>>     a script already.  Thanks.
>>
>>     --------------high_cpu_diagnostics.pl:-----
>>     #!/usr/bin/perl
>>     #
>>
>>     use Cwd;
>>
>>     # Make a dated directory for capturing current diagnostics
>>     my ($sec,$min,$hour,$mday,$mon,$year,
>>                $wday,$yday,$isdst) = localtime time;
>>     $year += 1900;
>>     $mon += 1;
>>     my $pwd = cwd();
>>     my $preview_diag_dir = "/tmp/Preview_Diag.$year-$mon-$mday-$hour:$min:$sec";
>>     print "$preview_diag_dir\n";
>>     mkdir $preview_diag_dir, 0755;
>>     chdir($preview_diag_dir) or die "Can't chdir into $preview_diag_dir $!\n";
>>
>>     # Capture Preview thread dump
>>     my $process_pattern = "preview";
>>     my $preview_pid = `/usr/bin/pgrep -f $process_pattern`;
>>     my $login = getpwuid($<) ;
>>     if (kill 0, $preview_pid){
>>              # Possible to send a signal to the Preview Tomcat - either
>>              # "webinf" or "root"
>>              my $count = kill 3, $preview_pid;
>>     }else {
>>              # Not possible to send a signal to the VCM - use "sudo"
>>              system ("/usr/bin/sudo /bin/kill -3 $preview_pid");
>>     }
>>
>>     # Capture Preview heap dump
>>     system ("/usr/bin/jmap
>>     -dump:format=b,file=$preview_diag_dir/preview_heapdump.hprof $preview_pid");
>>
>>     # Gather the top threads; keep around for reference on what other threads
>>     are running
>>     @top_cmd = ("/usr/bin/top",  "-H", "-n1", "-b");
>>     # (sorting/filtering of the top output is done in the pipeline below)
>>     system("@top_cmd 1> top_all_threads.log");
>>
>>     # Get your tomcat user's threads, i.e. threads of user, "webinf"
>>     system('/usr/bin/tail -n+6 top_all_threads.log | /bin/sort -r -n -k "9,9" |
>>     /bin/grep webinf 1> top_user_webinf_threads.log');
>>
>>     # Get the thread dump
>>     my @output=`/usr/bin/jstack -l ${preview_pid}`;
>>     open (my $file, '>', 'preview_threaddump.txt') or die "Could not open file:
>>     $!";
>>     print $file @output;
>>     close $file;
>>
>>     open LOG, "top_user_webinf_threads.log" or die $!;
>>     open (STDOUT, "| tee -ai top_cpu_preview_threads.log");
>>     print "PID\tCPU\tMem\tJStack Info\n";
>>     while ($l = <LOG>) {
>>          chop $l;
>>          $pid = $l;
>>          $pid =~ s/webinf.*//g;
>>          $pid =~ s/ *//g;
>>     ##  Hex PID is available in the Sun HotSpot Stack Trace
>>          $hex_pid = sprintf("%#x", $pid);
>>          @values = split(/\s+/, $l);
>>          $pct = $values[8];
>>          $mem = $values[9];
>>     # Debugger breakpoint:
>>     $DB::single = 1;
>>
>>     # Find the Java thread that corresponds to the thread-id from the TOP output
>>          for my $j (@output) {
>>              chop $j;
>>              ($j =~ m/nid=$hex_pid/)  &&  print $hex_pid . "\t" . $pct . "\t" .
>>     $mem . "\t" .  $j . "\n";
>>          }
>>     }
>>
>>     close (STDOUT);
>>
>>     close LOG;
>>
>>
>>     ------------------end of script --------------------------
>>
>>     Thanks.
>>
>>                            -Shanti
>>
>>
>>     On Thu, Sep 27, 2012 at 2:11 PM, Bill Miller
>>     <millebi.subscriptions@gmail.com> wrote:
>>
>>>     I agree; we have reproducible instances where PermGen is not set to our
>>>     requirements on the Tomcat startup parameters and it will cause a "lockup"
>>>     every time. Do some JMX monitoring and you may discover a memory spike
>>>     that's killing Tomcat.
>>>
>>>     Bill
>>>     -----Original Message-----
>>>     From: Jeff MAURY [mailto:jeffmaury@gmail.com]
>>>     Sent: September-27-2012 2:01 PM
>>>     To: Tomcat Users List
>>>     Subject: Re: high CPU usage on tomcat 7
>>>
>>>     This is probably due to running out of memory; I have the same problem on
>>>     my Ubuntu CI machine. Did you monitor your Tomcat with JMX?
>>>
>>>     Jeff
>>>
>>>
>>>     ---------------------------------------------------------------------
>>>     To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>>>     For additional commands, e-mail: users-help@tomcat.apache.org
>>>
>>>
>
>
>     -- 
>     *Kirill Kireyev, PhD*
>     Founder/CTO instaGrok.com <http://www.instagrok.com>
>     kirill@instagrok.com
>     Twitter: @instaGrok <http://twitter.com/InstaGrok>
>     FB: facebook.com/instagrok <http://www.facebook.com/instagrok>
>
>


-- 
*Kirill Kireyev, PhD*
Founder/CTO instaGrok.com <http://www.instagrok.com>
kirill@instagrok.com
Twitter: @instaGrok <http://twitter.com/InstaGrok>
FB: facebook.com/instagrok <http://www.facebook.com/instagrok>

Re: high CPU usage on tomcat 7

Posted by Shanti Suresh <sh...@umich.edu>.
Hi Kirill,

I am amazed at the size of the Java Heap of 26 GB.  I didn't expect that
big a size.  Before I suggest anything else, please take a look at the
filesystem space on "/tmp".  The script creates a heapdump everytime it
runs as well.  So please make sure that "/tmp/" has sufficient storage.
Otherwise it may bring your service down.

(1) Could you please paste your JVM command-line arguments here?
(2) Also, did you have a chance to observe the JConsole graphs?   I have
attached a sample JConsole graph here.
(3) What JDK version are you using?
(4) GC for a heap of that size may need some tuning; an illustrative
starting point follows below.
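
For instance (illustrative starting points only; verify against the GC
tuning guide and your own GC logs before relying on them):

# Pin the heap so the collector is not also growing/shrinking it
-Xms26000M -Xmx26000M
# Try the mostly-concurrent old-generation collector instead of ParallelGC
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC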

You may start JConsole as follows on Linux/Unix.
a. Place the following script in your home-directory on the Tomcat
app-server machine.
b. Make sure that you have enabled Tomcat to allow JMX access (a minimal
example follows the script below).
c. Then log in from a remote laptop/desktop into the server (using ssh),
and invoke this script from your home-directory:
--------start_jconsole.sh------
DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 'm#localhost:(\d+).0#
&& print $1'`
echo $DISPLAY_NUMBER
XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
/usr/bin/jconsole
-J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java/lib/tools.jar
-------------------------------
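
For step (b), a minimal example of opening up JMX (this variant is
unauthenticated, so keep it on a trusted network or tunnel it over ssh;
the port is arbitrary). Add to CATALINA_OPTS, e.g. in bin/setenv.sh:

--------setenv.sh (fragment)------
CATALINA_OPTS="$CATALINA_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=9010 \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Dcom.sun.management.jmxremote.authenticate=false"
-------------------------------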


[image: sample JConsole graph]

Thanks.

                         -Shanti

On Tue, Oct 2, 2012 at 7:43 PM, Kirill Kireyev <ki...@instagrok.com> wrote:

>  Hi Shanti! Thanks for your script! I ran it (with minor modifications),
> and it turns out that the garbage collection threads are indeed taking up
> the majority of CPU (see below). What would you recommend I do to
> tune/troubleshoot GC further? My server has 32G of RAM, JVM has heap size
> of 26G
>
> Thanks,
> Kirill
>
> PID    CPU    Mem    JStack Info
> 0x253d    80    90.7    "GC task thread#3 (ParallelGC)" prio=10
> tid=0x0000000001938800 nid=0x253d runnable
> 0x2540    76    90.7    "GC task thread#6 (ParallelGC)" prio=10
> tid=0x000000000193e000 nid=0x2540 runnable
> 0x253b    74    90.7    "GC task thread#1 (ParallelGC)" prio=10
> tid=0x0000000001934800 nid=0x253b runnable
> 0x253e    74    90.7    "GC task thread#4 (ParallelGC)" prio=10
> tid=0x000000000193a000 nid=0x253e runnable
> 0x2541    74    90.7    "GC task thread#7 (ParallelGC)" prio=10
> tid=0x000000000193f800 nid=0x2541 runnable
> 0x253c    66    90.7    "GC task thread#2 (ParallelGC)" prio=10
> tid=0x0000000001936800 nid=0x253c runnable
> 0x253a    47    90.7    "GC task thread#0 (ParallelGC)" prio=10
> tid=0x0000000001933000 nid=0x253a runnable
> 0x253f    39    90.7    "GC task thread#5 (ParallelGC)" prio=10
> tid=0x000000000193c000 nid=0x253f runnable
> 0x2542    19    90.7    "VM Thread" prio=10 tid=0x00000000019b6000
> nid=0x2542 runnable
> 0x2539    0    90.7    "main" prio=10 tid=0x0000000001925000 nid=0x2539
> runnable [0x00007fee9d444000]
> 0x2543    0    90.7    "Reference Handler" daemon prio=10
> tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000]
> 0x2544    0    90.7    "Finalizer" daemon prio=10 tid=0x00000000019c5800
> nid=0x2544 in Object.wait() [0x00007fee4ffd7000]
> 0x2545    0    90.7    "Signal Dispatcher" daemon prio=10
> tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000]
> 0x2546    0    90.7    "C2 CompilerThread0" daemon prio=10
> tid=0x00007fee48001000 nid=0x2546 waiting on condition [0x0000000000000000]
> 0x2547    0    90.7    "C2 CompilerThread1" daemon prio=10
> tid=0x00007fee48004000 nid=0x2547 waiting on condition [0x0000000000000000]
> 0x2548    0    90.7    "Service Thread" daemon prio=10
> tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000]
> 0x2549    0    90.7    "VM Periodic Task Thread" prio=10
> tid=0x0000000001a1f800 nid=0x2549 waiting on condition
> 0x254a    0    90.7    "GC Daemon" daemon prio=10 tid=0x00007fee404b3000
> nid=0x254a in Object.wait() [0x00007fee4e1bd000]
> 0x254d    0    90.7
> "ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=10
> tid=0x00007fee40389800 nid=0x254d runnable [0x00007fee4d820000]
> 0x254e    0    90.7    "http-apr-8080-Poller-0" daemon prio=10
> tid=0x00007fee40378000 nid=0x254e in Object.wait() [0x00007fee4d661000]
> 0x254f    0    90.7    "http-apr-8080-CometPoller-0" daemon prio=10
> tid=0x00007fee40395000 nid=0x254f in Object.wait() [0x00007fee4d4a2000]
> 0x2550    0    90.7    "http-apr-8080-Sendfile-0" daemon prio=10
> tid=0x00007fee40397000 nid=0x2550 in Object.wait() [0x00007fee4d2b0000]
> 0x2551    0    90.7    "http-apr-8080-Acceptor-0" daemon prio=10
> tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000]
> 0x2552    0    90.7    "http-apr-8080-AsyncTimeout" daemon prio=10
> tid=0x00007fee4072e000 nid=0x2552 sleeping[0x00007fee4d0ac000]
> 0x2553    0    90.7    "http-apr-8443-Poller-0" daemon prio=10
> tid=0x00007fee40730000 nid=0x2553 in Object.wait() [0x00007fee4cfab000]
> 0x2554    0    90.7    "http-apr-8443-CometPoller-0" daemon prio=10
> tid=0x00007fee40732000 nid=0x2554 in Object.wait() [0x00007fee4cdee000]
> 0x2555    0    90.7    "http-apr-8443-Acceptor-0" daemon prio=10
> tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000]
> 0x2556    0    90.7    "http-apr-8443-AsyncTimeout" daemon prio=10
> tid=0x00007fee4073b800 nid=0x2556 sleeping[0x00007fee4cbec000]
> 0x2557    0    90.7    "ajp-apr-8009-Poller-0" daemon prio=10
> tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000]
> 0x2558    0    90.7    "ajp-apr-8009-CometPoller-0" daemon prio=10
> tid=0x00007fee4073f800 nid=0x2558 in Object.wait() [0x00007fee4c872000]
> 0x2559    0    90.7    "ajp-apr-8009-Acceptor-0" daemon prio=10
> tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000]
> 0x255a    0    90.7    "ajp-apr-8009-AsyncTimeout" daemon prio=10
> tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000]
> 0x255f    0    90.7    "ajp-apr-8009-exec-5" daemon prio=10
> tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000]
> 0x2560    0    90.7    "ajp-apr-8009-exec-6" daemon prio=10
> tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000]
> 0x2561    0    90.7    "ajp-apr-8009-exec-7" daemon prio=10
> tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000]
> 0x2562    0    90.7    "ajp-apr-8009-exec-8" daemon prio=10
> tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000]
> 0x2563    0    90.7    "ajp-apr-8009-exec-9" daemon prio=10
> tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000]
> 0x2564    0    90.7    "ajp-apr-8009-exec-10" daemon prio=10
> tid=0x0000000001cfe800 nid=0x2564 waiting on condition [0x00007fee47dfa000]
> 0x2586    0    90.7    "ajp-apr-8009-exec-12" daemon prio=10
> tid=0x0000000001e85000 nid=0x2586 waiting on condition [0x00007fee47bf6000]
> 0x2587    0    90.7    "pool-1-thread-1" prio=10 tid=0x0000000001fe4000
> nid=0x2587 waiting on condition [0x00007fee47af3000]
> 0x2635    0    90.7    "ajp-apr-8009-exec-15" daemon prio=10
> tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000]
> 0x2637    0    90.7    "ajp-apr-8009-exec-17" daemon prio=10
> tid=0x0000000002392800 nid=0x2637 waiting on condition [0x00007fee38790000]
> 0x2638    0    90.7    "ajp-apr-8009-exec-18" daemon prio=10
> tid=0x0000000002395000 nid=0x2638 waiting on condition [0x00007fee3868f000]
> 0x263a    0    90.7    "ajp-apr-8009-exec-20" daemon prio=10
> tid=0x0000000002312800 nid=0x263a waiting on condition [0x00007fee3848b000]
> 0x263d    0    90.7    "ajp-apr-8009-exec-23" daemon prio=10
> tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000]
> 0x50b    0    90.7    "http-apr-8443-exec-1" daemon prio=10
> tid=0x0000000002209000 nid=0x50b waiting on condition [0x00007fee232b1000]
> 0x50c    0    90.7    "http-apr-8443-exec-2" daemon prio=10
> tid=0x000000000213c000 nid=0x50c waiting on condition [0x00007fee233b2000]
> 0x50d    0    90.7    "http-apr-8443-exec-3" daemon prio=10
> tid=0x000000000239e000 nid=0x50d waiting on condition [0x00007fee22baa000]
> 0x50e    0    90.7    "http-apr-8443-exec-4" daemon prio=10
> tid=0x0000000002aeb800 nid=0x50e waiting on condition [0x00007fee3939d000]
> 0x50f    0    90.7    "http-apr-8443-exec-5" daemon prio=10
> tid=0x0000000002ad3000 nid=0x50f waiting on condition [0x00007fee38f99000]
> 0x510    0    90.7    "http-apr-8443-exec-6" daemon prio=10
> tid=0x00000000025a1800 nid=0x510 waiting on condition [0x00007fee397a1000]
> 0x511    0    90.7    "http-apr-8443-exec-7" daemon prio=10
> tid=0x0000000002188800 nid=0x511 waiting on condition [0x00007fee45db8000]
> 0x512    0    90.7    "http-apr-8443-exec-8" daemon prio=10
> tid=0x00000000023b3800 nid=0x512 waiting on condition [0x00007fee44aa5000]
> 0x513    0    90.7    "http-apr-8443-exec-9" daemon prio=10
> tid=0x0000000002164800 nid=0x513 waiting on condition [0x00007fee3dae4000]
> 0x514    0    90.7    "http-apr-8443-exec-10" daemon prio=10
> tid=0x0000000002af6800 nid=0x514 waiting on condition [0x00007fee279f8000]
> 0x19a7    0    90.7    "http-apr-8080-exec-1" daemon prio=10
> tid=0x0000000001ac2800 nid=0x19a7 waiting on condition [0x00007fee3747b000]
> 0x19a8    0    90.7    "http-apr-8080-exec-2" daemon prio=10
> tid=0x000000000257d000 nid=0x19a8 waiting on condition [0x00007fee3767d000]
> 0x28b0    0    90.7    "http-apr-8080-exec-3" daemon prio=10
> tid=0x000000000274c000 nid=0x28b0 waiting on condition [0x00007fee3f8f7000]
> 0x28b1    0    90.7    "http-apr-8080-exec-4" daemon prio=10
> tid=0x00000000021c8800 nid=0x28b1 waiting on condition [0x00007fee3adb7000]
> 0x46e7    0    90.7    "Keep-Alive-Timer" daemon prio=10
> tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000]
> 0x29c5    0    90.7    "Keep-Alive-SocketCleaner" daemon prio=10
> tid=0x000000000242b000 nid=0x29c5 in Object.wait() [0x00007fee1f877000]
> 0x6ef8    0    90.7    "http-apr-8080-exec-5" daemon prio=10
> tid=0x0000000002a56000 nid=0x6ef8 waiting on condition [0x00007fee454af000]
> 0x6ef9    0    90.7    "http-apr-8080-exec-6" daemon prio=10
> tid=0x00000000022de800 nid=0x6ef9 waiting on condition [0x00007fee45ab5000]
> 0x3d38    0    90.7    "http-apr-8080-exec-7" daemon prio=10
> tid=0x0000000002233000 nid=0x3d38 waiting on condition [0x00007fee37980000]
> 0x3d39    0    90.7    "http-apr-8080-exec-8" daemon prio=10
> tid=0x00000000021f7800 nid=0x3d39 waiting on condition [0x00007fee38087000]
> 0x64a3    0    90.7    "ajp-apr-8009-exec-36" daemon prio=10
> tid=0x00000000022dd000 nid=0x64a3 waiting on condition [0x00007fee35158000]
> 0x64a6    0    90.7    "ajp-apr-8009-exec-39" daemon prio=10
> tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000]
> 0x64a7    0    90.7    "ajp-apr-8009-exec-40" daemon prio=10
> tid=0x00000000029c0800 nid=0x64a7 waiting on condition [0x00007fee24fce000]
> 0x64ab    0    90.7    "ajp-apr-8009-exec-44" daemon prio=10
> tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000]
> 0x64ac    0    90.7    "ajp-apr-8009-exec-45" daemon prio=10
> tid=0x0000000002555800 nid=0x64ac waiting on condition [0x00007fee256d5000]
> 0x64af    0    90.7    "ajp-apr-8009-exec-48" daemon prio=10
> tid=0x0000000001f8a800 nid=0x64af waiting on condition [0x00007fee24ccb000]
> 0x64b0    0    90.7    "ajp-apr-8009-exec-49" daemon prio=10
> tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000]
> 0x64b1    0    90.7    "ajp-apr-8009-exec-50" daemon prio=10
> tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000]
> 0x64b2    0    90.7    "ajp-apr-8009-exec-51" daemon prio=10
> tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000]
> 0x64b3    0    90.7    "ajp-apr-8009-exec-52" daemon prio=10
> tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000]
> 0xf1b    0    90.7    "http-apr-8080-exec-9" daemon prio=10
> tid=0x0000000001c9e800 nid=0xf1b waiting on condition [0x00007fee3929c000]
> 0xf1c    0    90.7    "http-apr-8080-exec-10" daemon prio=10
> tid=0x0000000001e4c000 nid=0xf1c waiting on condition [0x00007fee3d8e2000]
> 0x3eda    0    90.7    "ajp-apr-8009-exec-55" daemon prio=10
> tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000]
> 0x3edb    0    90.7    "ajp-apr-8009-exec-56" daemon prio=10
> tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000]
> 0x51c7    0    90.7    "ajp-apr-8009-exec-57" daemon prio=10
> tid=0x000000000211c800 nid=0x51c7 waiting on condition [0x00007fee37c83000]
>
>
>
> On 9/27/12 12:17 PM, Shanti Suresh wrote:
>

Re: high CPU usage on tomcat 7

Posted by Kirill Kireyev <ki...@instagrok.com>.
Hi Shanti! Thanks for your script! I ran it (with minor modifications),
and it turns out that the garbage collection threads are indeed taking
up the majority of CPU (see below). What would you recommend I do to
tune/troubleshoot GC further? My server has 32G of RAM, and the JVM has
a heap size of 26G.

Thanks,
Kirill

PID    CPU    Mem    JStack Info
0x253d    80    90.7    "GC task thread#3 (ParallelGC)" prio=10 
tid=0x0000000001938800 nid=0x253d runnable
0x2540    76    90.7    "GC task thread#6 (ParallelGC)" prio=10 
tid=0x000000000193e000 nid=0x2540 runnable
0x253b    74    90.7    "GC task thread#1 (ParallelGC)" prio=10 
tid=0x0000000001934800 nid=0x253b runnable
0x253e    74    90.7    "GC task thread#4 (ParallelGC)" prio=10 
tid=0x000000000193a000 nid=0x253e runnable
0x2541    74    90.7    "GC task thread#7 (ParallelGC)" prio=10 
tid=0x000000000193f800 nid=0x2541 runnable
0x253c    66    90.7    "GC task thread#2 (ParallelGC)" prio=10 
tid=0x0000000001936800 nid=0x253c runnable
0x253a    47    90.7    "GC task thread#0 (ParallelGC)" prio=10 
tid=0x0000000001933000 nid=0x253a runnable
0x253f    39    90.7    "GC task thread#5 (ParallelGC)" prio=10 
tid=0x000000000193c000 nid=0x253f runnable
0x2542    19    90.7    "VM Thread" prio=10 tid=0x00000000019b6000 
nid=0x2542 runnable
0x2539    0    90.7    "main" prio=10 tid=0x0000000001925000 nid=0x2539 
runnable [0x00007fee9d444000]
0x2543    0    90.7    "Reference Handler" daemon prio=10 
tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000]
0x2544    0    90.7    "Finalizer" daemon prio=10 tid=0x00000000019c5800 
nid=0x2544 in Object.wait() [0x00007fee4ffd7000]
0x2545    0    90.7    "Signal Dispatcher" daemon prio=10 
tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000]
0x2546    0    90.7    "C2 CompilerThread0" daemon prio=10 
tid=0x00007fee48001000 nid=0x2546 waiting on condition [0x0000000000000000]
0x2547    0    90.7    "C2 CompilerThread1" daemon prio=10 
tid=0x00007fee48004000 nid=0x2547 waiting on condition [0x0000000000000000]
0x2548    0    90.7    "Service Thread" daemon prio=10 
tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000]
0x2549    0    90.7    "VM Periodic Task Thread" prio=10 
tid=0x0000000001a1f800 nid=0x2549 waiting on condition
0x254a    0    90.7    "GC Daemon" daemon prio=10 tid=0x00007fee404b3000 
nid=0x254a in Object.wait() [0x00007fee4e1bd000]
0x254d    0    90.7 
"ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=10 
tid=0x00007fee40389800 nid=0x254d runnable [0x00007fee4d820000]
0x254e    0    90.7    "http-apr-8080-Poller-0" daemon prio=10 
tid=0x00007fee40378000 nid=0x254e in Object.wait() [0x00007fee4d661000]
0x254f    0    90.7    "http-apr-8080-CometPoller-0" daemon prio=10 
tid=0x00007fee40395000 nid=0x254f in Object.wait() [0x00007fee4d4a2000]
0x2550    0    90.7    "http-apr-8080-Sendfile-0" daemon prio=10 
tid=0x00007fee40397000 nid=0x2550 in Object.wait() [0x00007fee4d2b0000]
0x2551    0    90.7    "http-apr-8080-Acceptor-0" daemon prio=10 
tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000]
0x2552    0    90.7    "http-apr-8080-AsyncTimeout" daemon prio=10 
tid=0x00007fee4072e000 nid=0x2552 sleeping[0x00007fee4d0ac000]
0x2553    0    90.7    "http-apr-8443-Poller-0" daemon prio=10 
tid=0x00007fee40730000 nid=0x2553 in Object.wait() [0x00007fee4cfab000]
0x2554    0    90.7    "http-apr-8443-CometPoller-0" daemon prio=10 
tid=0x00007fee40732000 nid=0x2554 in Object.wait() [0x00007fee4cdee000]
0x2555    0    90.7    "http-apr-8443-Acceptor-0" daemon prio=10 
tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000]
0x2556    0    90.7    "http-apr-8443-AsyncTimeout" daemon prio=10 
tid=0x00007fee4073b800 nid=0x2556 sleeping[0x00007fee4cbec000]
0x2557    0    90.7    "ajp-apr-8009-Poller-0" daemon prio=10 
tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000]
0x2558    0    90.7    "ajp-apr-8009-CometPoller-0" daemon prio=10 
tid=0x00007fee4073f800 nid=0x2558 in Object.wait() [0x00007fee4c872000]
0x2559    0    90.7    "ajp-apr-8009-Acceptor-0" daemon prio=10 
tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000]
0x255a    0    90.7    "ajp-apr-8009-AsyncTimeout" daemon prio=10 
tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000]
0x255f    0    90.7    "ajp-apr-8009-exec-5" daemon prio=10 
tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000]
0x2560    0    90.7    "ajp-apr-8009-exec-6" daemon prio=10 
tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000]
0x2561    0    90.7    "ajp-apr-8009-exec-7" daemon prio=10 
tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000]
0x2562    0    90.7    "ajp-apr-8009-exec-8" daemon prio=10 
tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000]
0x2563    0    90.7    "ajp-apr-8009-exec-9" daemon prio=10 
tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000]
0x2564    0    90.7    "ajp-apr-8009-exec-10" daemon prio=10 
tid=0x0000000001cfe800 nid=0x2564 waiting on condition [0x00007fee47dfa000]
0x2586    0    90.7    "ajp-apr-8009-exec-12" daemon prio=10 
tid=0x0000000001e85000 nid=0x2586 waiting on condition [0x00007fee47bf6000]
0x2587    0    90.7    "pool-1-thread-1" prio=10 tid=0x0000000001fe4000 
nid=0x2587 waiting on condition [0x00007fee47af3000]
0x2635    0    90.7    "ajp-apr-8009-exec-15" daemon prio=10 
tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000]
0x2637    0    90.7    "ajp-apr-8009-exec-17" daemon prio=10 
tid=0x0000000002392800 nid=0x2637 waiting on condition [0x00007fee38790000]
0x2638    0    90.7    "ajp-apr-8009-exec-18" daemon prio=10 
tid=0x0000000002395000 nid=0x2638 waiting on condition [0x00007fee3868f000]
0x263a    0    90.7    "ajp-apr-8009-exec-20" daemon prio=10 
tid=0x0000000002312800 nid=0x263a waiting on condition [0x00007fee3848b000]
0x263d    0    90.7    "ajp-apr-8009-exec-23" daemon prio=10 
tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000]
0x50b    0    90.7    "http-apr-8443-exec-1" daemon prio=10 
tid=0x0000000002209000 nid=0x50b waiting on condition [0x00007fee232b1000]
0x50c    0    90.7    "http-apr-8443-exec-2" daemon prio=10 
tid=0x000000000213c000 nid=0x50c waiting on condition [0x00007fee233b2000]
0x50d    0    90.7    "http-apr-8443-exec-3" daemon prio=10 
tid=0x000000000239e000 nid=0x50d waiting on condition [0x00007fee22baa000]
0x50e    0    90.7    "http-apr-8443-exec-4" daemon prio=10 
tid=0x0000000002aeb800 nid=0x50e waiting on condition [0x00007fee3939d000]
0x50f    0    90.7    "http-apr-8443-exec-5" daemon prio=10 
tid=0x0000000002ad3000 nid=0x50f waiting on condition [0x00007fee38f99000]
0x510    0    90.7    "http-apr-8443-exec-6" daemon prio=10 
tid=0x00000000025a1800 nid=0x510 waiting on condition [0x00007fee397a1000]
0x511    0    90.7    "http-apr-8443-exec-7" daemon prio=10 
tid=0x0000000002188800 nid=0x511 waiting on condition [0x00007fee45db8000]
0x512    0    90.7    "http-apr-8443-exec-8" daemon prio=10 
tid=0x00000000023b3800 nid=0x512 waiting on condition [0x00007fee44aa5000]
0x513    0    90.7    "http-apr-8443-exec-9" daemon prio=10 
tid=0x0000000002164800 nid=0x513 waiting on condition [0x00007fee3dae4000]
0x514    0    90.7    "http-apr-8443-exec-10" daemon prio=10 
tid=0x0000000002af6800 nid=0x514 waiting on condition [0x00007fee279f8000]
0x19a7    0    90.7    "http-apr-8080-exec-1" daemon prio=10 
tid=0x0000000001ac2800 nid=0x19a7 waiting on condition [0x00007fee3747b000]
0x19a8    0    90.7    "http-apr-8080-exec-2" daemon prio=10 
tid=0x000000000257d000 nid=0x19a8 waiting on condition [0x00007fee3767d000]
0x28b0    0    90.7    "http-apr-8080-exec-3" daemon prio=10 
tid=0x000000000274c000 nid=0x28b0 waiting on condition [0x00007fee3f8f7000]
0x28b1    0    90.7    "http-apr-8080-exec-4" daemon prio=10 
tid=0x00000000021c8800 nid=0x28b1 waiting on condition [0x00007fee3adb7000]
0x46e7    0    90.7    "Keep-Alive-Timer" daemon prio=10 
tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000]
0x29c5    0    90.7    "Keep-Alive-SocketCleaner" daemon prio=10 
tid=0x000000000242b000 nid=0x29c5 in Object.wait() [0x00007fee1f877000]
0x6ef8    0    90.7    "http-apr-8080-exec-5" daemon prio=10 
tid=0x0000000002a56000 nid=0x6ef8 waiting on condition [0x00007fee454af000]
0x6ef9    0    90.7    "http-apr-8080-exec-6" daemon prio=10 
tid=0x00000000022de800 nid=0x6ef9 waiting on condition [0x00007fee45ab5000]
0x3d38    0    90.7    "http-apr-8080-exec-7" daemon prio=10 
tid=0x0000000002233000 nid=0x3d38 waiting on condition [0x00007fee37980000]
0x3d39    0    90.7    "http-apr-8080-exec-8" daemon prio=10 
tid=0x00000000021f7800 nid=0x3d39 waiting on condition [0x00007fee38087000]
0x64a3    0    90.7    "ajp-apr-8009-exec-36" daemon prio=10 
tid=0x00000000022dd000 nid=0x64a3 waiting on condition [0x00007fee35158000]
0x64a6    0    90.7    "ajp-apr-8009-exec-39" daemon prio=10 
tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000]
0x64a7    0    90.7    "ajp-apr-8009-exec-40" daemon prio=10 
tid=0x00000000029c0800 nid=0x64a7 waiting on condition [0x00007fee24fce000]
0x64ab    0    90.7    "ajp-apr-8009-exec-44" daemon prio=10 
tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000]
0x64ac    0    90.7    "ajp-apr-8009-exec-45" daemon prio=10 
tid=0x0000000002555800 nid=0x64ac waiting on condition [0x00007fee256d5000]
0x64af    0    90.7    "ajp-apr-8009-exec-48" daemon prio=10 
tid=0x0000000001f8a800 nid=0x64af waiting on condition [0x00007fee24ccb000]
0x64b0    0    90.7    "ajp-apr-8009-exec-49" daemon prio=10 
tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000]
0x64b1    0    90.7    "ajp-apr-8009-exec-50" daemon prio=10 
tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000]
0x64b2    0    90.7    "ajp-apr-8009-exec-51" daemon prio=10 
tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000]
0x64b3    0    90.7    "ajp-apr-8009-exec-52" daemon prio=10 
tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000]
0xf1b    0    90.7    "http-apr-8080-exec-9" daemon prio=10 
tid=0x0000000001c9e800 nid=0xf1b waiting on condition [0x00007fee3929c000]
0xf1c    0    90.7    "http-apr-8080-exec-10" daemon prio=10 
tid=0x0000000001e4c000 nid=0xf1c waiting on condition [0x00007fee3d8e2000]
0x3eda    0    90.7    "ajp-apr-8009-exec-55" daemon prio=10 
tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000]
0x3edb    0    90.7    "ajp-apr-8009-exec-56" daemon prio=10 
tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000]
0x51c7    0    90.7    "ajp-apr-8009-exec-57" daemon prio=10 
tid=0x000000000211c800 nid=0x51c7 waiting on condition [0x00007fee37c83000]
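
Since the ParallelGC worker threads dominate here, a sensible first step is
to make the GC behaviour visible before tuning anything. A minimal sketch of
GC-logging options (the bin/setenv.sh location and the log path are
assumptions, not part of the setup described above):

# bin/setenv.sh -- sketch only; the log path is an assumption
CATALINA_OPTS="$CATALINA_OPTS \
  -verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps \
  -Xloggc:/var/log/tomcat/gc.log"

If the log shows back-to-back full collections of the 26G heap during the
high-CPU episodes, heap sizing or collector choice is the thing to tune.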


On 9/27/12 12:17 PM, Shanti Suresh wrote:


-- 
*Kirill Kireyev, PhD*
Founder/CTO instaGrok.com <http://www.instagrok.com>
kirill@instagrok.com
Twitter: @instaGrok <http://twitter.com/InstaGrok>
FB: facebook.com/instagrok <http://www.facebook.com/instagrok>
<http://www.instagrok.com>

Re: high CPU usage on tomcat 7

Posted by "mailinglist@j-b-s.de" <ma...@j-b-s.de>.
Well, if you have 4 cores and all cores are looping, Tomcat definitely will not respond any more...

Sent from my iPad

On 30.09.2012 at 12:42, Jeff MAURY <je...@jeffmaury.com> wrote:

> I don't think a CPU loop will make Tomcat stop responding to requests;
> it will make it very slow to respond.
> But a shortage of memory is hard to recover from.
> 
> Jeff
> 
> On Friday, 28 September 2012, mailinglist@j-b-s.de wrote:
> 
>> Maybe an infinite loop? We observed something similar due to a bug in the
>> Java regex implementation: certain user input triggered this regex looping
>> behaviour. As this locked one core, to the user it simply looked like the
>> server was not responding. Guess what happened? Right: they pressed
>> refresh, and the next core was gone :-)
>> So a RUNNABLE state might be evil, too.
>> Switch on GC logging; maybe it's just related to a full GC. In case this
>> happens again, take a thread dump and search for similarities. Take
>> multiple dumps at a 5-second interval. Try to find long-running threads
>> (in our case we noticed the regex bug).
>> 
>> Jens
>> 
>> Sent from my iPhone
>> 
>> On 27.09.2012, at 22:05, Kirill Kireyev <ki...@instagrok.com> wrote:
>> 
>>> Thanks for all the advice everyone! There is a possibility that the high
>>> CPU usage is caused by an app thread - I am looking into that possibility.
>>> I will let you know when I find out more.
>>> 
>>> Thanks,
>>> Kirill
>>> 
>>> On 9/27/12 12:17 PM, Shanti Suresh wrote:
>> 
> 
> 
> -- 
> Jeff MAURY
> 
> 
> "Legacy code" often differs from its suggested alternative by actually
> working and scaling.
> - Bjarne Stroustrup
> 
> http://www.jeffmaury.com
> http://riadiscuss.jeffmaury.com
> http://www.twitter.com/jeffmaury

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: high CPU usage on tomcat 7

Posted by Jeff MAURY <je...@jeffmaury.com>.
I don't think a CPU loop will make Tomcat stop responding to requests;
it will make it very slow to respond.
But a shortage of memory is hard to recover from.

Jeff

On Friday, 28 September 2012, mailinglist@j-b-s.de wrote:

> Maybe an infinite loop? We observed something similar due to a bug in the
> Java regex implementation: certain user input triggered this regex looping
> behaviour. As this locked one core, to the user it simply looked like the
> server was not responding. Guess what happened? Right: they pressed
> refresh, and the next core was gone :-)
> So a RUNNABLE state might be evil, too.
> Switch on GC logging; maybe it's just related to a full GC. In case this
> happens again, take a thread dump and search for similarities. Take
> multiple dumps at a 5-second interval. Try to find long-running threads
> (in our case we noticed the regex bug).
>
> Jens
>
> Sent from my iPhone
>
> On 27.09.2012, at 22:05, Kirill Kireyev <ki...@instagrok.com> wrote:
>
> > Thanks for all the advice everyone! There is a possibility that the high
> > CPU usage is caused by an app thread - I am looking into that possibility.
> > I will let you know when I find out more.
> >
> > Thanks,
> > Kirill
> >
> > On 9/27/12 12:17 PM, Shanti Suresh wrote:
>


-- 
Jeff MAURY


"Legacy code" often differs from its suggested alternative by actually
working and scaling.
 - Bjarne Stroustrup

http://www.jeffmaury.com
http://riadiscuss.jeffmaury.com
http://www.twitter.com/jeffmaury

Re: high CPU usage on tomcat 7

Posted by "mailinglist@j-b-s.de" <ma...@j-b-s.de>.
Maybe an infinite loop? We observed something similar due to a bug in the Java regex implementation: certain user input triggered this regex looping behaviour. As this locked one core, to the user it simply looked like the server was not responding. Guess what happened? Right: they pressed refresh, and the next core was gone :-)
So a RUNNABLE state might be evil, too.
Switch on GC logging; maybe it's just related to a full GC. In case this happens again, take a thread dump and search for similarities. Take multiple dumps at a 5-second interval. Try to find long-running threads (in our case we noticed the regex bug).
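
A minimal shell sketch of that capture loop (the pgrep pattern assumes a
stock Tomcat launched via the standard Bootstrap class; adjust as needed):

# capture five thread dumps, five seconds apart
PID=$(pgrep -f org.apache.catalina.startup.Bootstrap)
for i in 1 2 3 4 5; do
    jstack -l "$PID" > "/tmp/threaddump.$(date +%H%M%S).txt"
    sleep 5
done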

Jens

Sent from my iPhone

On 27.09.2012, at 22:05, Kirill Kireyev <ki...@instagrok.com> wrote:

> Thanks for all the advice everyone! There is a possibility that the high CPU usage is caused by an app thread - I am looking into that possibility. I will let you know when I find out more.
> 
> Thanks,
> Kirill
> 
> On 9/27/12 12:17 PM, Shanti Suresh wrote:

Re: high CPU usage on tomcat 7

Posted by Kirill Kireyev <ki...@instagrok.com>.
Thanks for all the advice everyone! There is a possibility that the high
CPU usage is caused by an app thread - I am looking into that possibility.
I will let you know when I find out more.

Thanks,
Kirill

On 9/27/12 12:17 PM, Shanti Suresh wrote:


-- 
*Kirill Kireyev, PhD*
Founder/CTO instaGrok.com <http://www.instagrok.com>
kirill@instagrok.com
Twitter: @instaGrok <http://twitter.com/InstaGrok>
FB: facebook.com/instagrok <http://www.facebook.com/instagrok>
<http://www.instagrok.com>

Re: high CPU usage on tomcat 7

Posted by Shanti Suresh <sh...@umich.edu>.
Hi Kirill,

Like Mark, Bill and Jeff said, those threads are normal request-processing
threads.  I have included a script that might help with isolating high CPU
issues with Tomcat.

Also, I think it might be helpful to see how the Java heap is performing.
Please bring up JConsole and let it run over the week.  Inspect the graphs
for memory, CPU and threads.  Since you say that high CPU occurs
intermittently several times during the week and clears itself, I wonder if
it is somehow related to the garbage collection options you are using for
the server.  Or it may be a code-related problem.
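
For reference, remote JConsole access usually requires JMX options at Tomcat
startup; a minimal sketch (the port, and disabling auth/SSL, are assumptions
suitable only on a trusted network):

# bin/setenv.sh -- sketch only
CATALINA_OPTS="$CATALINA_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=9010 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false"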

Things to look at may include:

(1) Are high CPU times related to Java heap reductions happening at the
same time?  ==> GC possibly needs tuning
(2) Are high CPU times related to an increase in thread usage?  ==> possible
livelock in looping code
(3) How many network connections come into the Tomcat server during
high-CPU times?  ==> possibly overload-related

Here is the script.  I made a couple of small changes, e.g., changing
the username, but didn't test it after the change.  During high-CPU times,
invoke the script a few times, say 30 seconds apart, and then compare the
thread dumps.  I like to use TDA for thread-dump analysis of Tomcat
thread dumps.

Mark, et al, please feel free to help me refine this script.  I would like
to have a script to catch STUCK threads too :-)  Let me know if anyone has
a script already.  Thanks.

--------------high_cpu_diagnostics.pl:-----
#!/usr/bin/perl
#

use Cwd;

# Make a dated directory for capturing current diagnostics
my ($sec,$min,$hour,$mday,$mon,$year,
          $wday,$yday,$isdst) = localtime time;
$year += 1900;
$mon += 1;
my $pwd = cwd();
my $preview_diag_dir = "/tmp/Preview_Diag.$year-$mon-$mday-$hour:$min:$sec";
print "$preview_diag_dir\n";
mkdir $preview_diag_dir, 0755;
chdir($preview_diag_dir) or die "Can't chdir into $preview_diag_dir $!\n";

# Capture Preview thread dump
my $process_pattern = "preview";
my $preview_pid = `/usr/bin/pgrep -f $process_pattern`;
chomp $preview_pid;    # strip the trailing newline so kill/jmap/jstack get a clean PID
my $login = getpwuid($<) ;
if (kill 0, $preview_pid){
        # Possible to send a signal to the Preview Tomcat - either "webinf" or "root"
        my $count = kill 3, $preview_pid;
}else {
        # Not possible to send a signal as this user - use "sudo"
        system ("/usr/bin/sudo /bin/kill -3 $preview_pid");
}

# Capture Preview heap dump (jmap writes a heap dump, not a thread dump)
system ("/usr/bin/jmap -dump:format=b,file=$preview_diag_dir/preview_heapdump.hprof $preview_pid");

# Gather the top threads; keep around for reference on what other threads are running
@top_cmd = ("/usr/bin/top",  "-H", "-n1", "-b");
system("@top_cmd 1> top_all_threads.log");

# Get your tomcat user's threads, i.e. threads of user, "webinf"
# (grep must filter the pipeline; naming the file again would discard the sort)
system('/usr/bin/tail -n+6 top_all_threads.log | /bin/sort -r -n -k "9,9" | /bin/grep webinf 1> top_user_webinf_threads.log');

# Get the thread dump
my @output=`/usr/bin/jstack -l ${preview_pid}`;
open (my $file, '>', 'preview_threaddump.txt') or die "Could not open file: $!";
print $file @output;
close $file;

open LOG, "top_user_webinf_threads.log" or die $!;
open (STDOUT, "| tee -ai top_cpu_preview_threads.log");
print "PID\tCPU\tMem\tJStack Info\n";
while ($l = <LOG>) {
    chop $l;
    $pid = $l;
    $pid =~ s/webinf.*//g;
    $pid =~ s/ *//g;
##  Hex PID is available in the Sun HotSpot stack trace
    $hex_pid = sprintf("%#x", $pid);
    @values = split(/\s+/, $l);
    $pct = $values[8];
    $mem = $values[9];
# Debugger breakpoint:
$DB::single = 1;

# Find the Java thread that corresponds to the thread-id from the TOP output
    for my $j (@output) {
        chop $j;
        ($j =~ m/nid=$hex_pid/)  &&  print $hex_pid . "\t" . $pct . "\t" .
$mem . "\t" .  $j . "\n";
    }
}

close (STDOUT);

close LOG;


------------------end of script --------------------------
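
A usage sketch (the file name and the use of sudo are assumptions; the
script must run as a user allowed to signal the Tomcat process):

chmod +x high_cpu_diagnostics.pl
sudo ./high_cpu_diagnostics.pl    # repeat a few times, ~30 seconds apart
ls -d /tmp/Preview_Diag.*         # one dated directory per invocation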

Thanks.

                      -Shanti


On Thu, Sep 27, 2012 at 2:11 PM, Bill Miller <
millebi.subscriptions@gmail.com> wrote:

> I agree; we have reproducible instances where PermGen is not sized to our
> requirements in the Tomcat startup parameters, and it causes a "lockup"
> every time. Do some JMX monitoring and you may discover a memory spike
> that's killing Tomcat.
>
> Bill
> -----Original Message-----
> From: Jeff MAURY [mailto:jeffmaury@gmail.com]
> Sent: September-27-2012 2:01 PM
> To: Tomcat Users List
> Subject: Re: high CPU usage on tomcat 7
>
> This is probably due to out of memory; I have the same problem on my Ubuntu
> CI machine. Did you monitor your Tomcat with JMX?
>
> Jeff
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

RE: high CPU usage on tomcat 7

Posted by Bill Miller <mi...@gmail.com>.
I agree; we have reproducible instances where PermGen is not sized to our
requirements in the Tomcat startup parameters, and it causes a "lockup"
every time. Do some JMX monitoring and you may discover a memory spike
that's killing Tomcat.
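
For reference, PermGen is sized with JVM flags at startup; a minimal sketch
(the setenv.sh location and the sizes are illustrative assumptions, not a
recommendation):

# bin/setenv.sh -- sketch only; sizes are illustrative
CATALINA_OPTS="$CATALINA_OPTS \
  -XX:PermSize=128m \
  -XX:MaxPermSize=256m \
  -XX:+HeapDumpOnOutOfMemoryError"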

Bill
-----Original Message-----
From: Jeff MAURY [mailto:jeffmaury@gmail.com] 
Sent: September-27-2012 2:01 PM
To: Tomcat Users List
Subject: Re: high CPU usage on tomcat 7

This is probably due to out of memory; I have the same problem on my Ubuntu
CI machine. Did you monitor your Tomcat with JMX?

Jeff
On 27 Sept 2012 at 17:39, "Kirill Kireyev" <ki...@instagrok.com> wrote:



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: high CPU usage on tomcat 7

Posted by Jeff MAURY <je...@gmail.com>.
This is probably due to out of memory; I have the same problem on my Ubuntu
CI machine.
Did you monitor your Tomcat with JMX?

Jeff
On 27 Sept 2012 at 17:39, "Kirill Kireyev" <ki...@instagrok.com> wrote:

> Hi!
>
> I'm periodically getting unduly high (100%) CPU usage by the tomcat
> process on my server. This problems happens intermittently, several times a
> week. When the server goes into this high CPU it does not come back (and
> becomes unresponsive to new requests), and the only recourse is to restart
> the tomcat process.
>
> I'm using Tomcat 7.0.30, with APR and Apache web server, on a Ubuntu 11.10
> server with 32g of RAM / 8 CPUs.
>
> I've done several jstack stack traces when this occurs, and what I
> consistently see, are the connector threads in the RUNNABLE state every
> time, i.e.:
>
> ajp-apr-8009-Acceptor-0" daemon prio=10 tid=0x00000000010a1000 nid=0x539
> runnable [0x00007f9364f8e000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.tomcat.jni.Socket.accept(Native Method)
>         at org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
>         at java.lang.Thread.run(Thread.java:722)
>
> "http-apr-8443-Acceptor-0" daemon prio=10 tid=0x000000000109b800 nid=0x535
> runnable [0x00007f9365510000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.tomcat.jni.Socket.accept(Native Method)
>         at org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
>         at java.lang.Thread.run(Thread.java:722)
>
> "http-apr-8080-Acceptor-0" daemon prio=10 tid=0x00000000015ab000 nid=0x531
> runnable [0x00007f9365a92000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.tomcat.jni.Socket.accept(Native Method)
>         at org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
>         at java.lang.Thread.run(Thread.java:722)
>
> Other threads are in RUNNABLE too in different cases, but these are the ones
> that are always there when the high CPU occurs. That's why I'm starting to
> think it has something to do with Tomcat.
>
> Can anyone shed some light on this? My current Connector configurations in
> server.xml are:
>  <Connector port="8080" protocol="org.apache.coyote.**
> http11.Http11AprProtocol"
>                connectionTimeout="20000"
>                maxThreads="500" minSpareThreads="10" maxSpareThreads="20"
>                redirectPort="8443"
>                pollTime="100000" />
> ...
> <Connector port="8443" protocol="org.apache.coyote.**
> http11.Http11AprProtocol"
>             maxThreads="200" scheme="https" secure="true" SSLEnabled="true"
>             SSLCACertificateFile="********"
>             SSLCertificateKeyFile="********"
>             SSLCertificateFile="*******"
>             enableLookups="false" clientAuth="false" sslProtocol="TLS"
>             pollTime="100000" />
> ...
> <Connector port="8009" protocol="AJP/1.3" redirectPort="8443"
>            acceptCount="100" connectionTimeout="5000"
> keepAliveTimeout="20000"
>            disableUploadTimeout="true" enableLookups="false"
>            maxHttpHeaderSize="8192"
>            maxSpareThreads="75" maxThreads="150"
>            minSpareThreads="25"
>            executor="default" />
>
> Thanks a lot!
> -Kirill
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>

Re: high CPU usage on tomcat 7

Posted by Mark Thomas <ma...@apache.org>.

Kirill Kireyev <ki...@instagrok.com> wrote:

>Hi!
>
>I'm periodically getting unduly high (100%) CPU usage by the tomcat 
>process on my server. This problems happens intermittently, several 
>times a week. When the server goes into this high CPU it does not come 
>back (and becomes unresponsive to new requests), and the only recourse 
>is to restart the tomcat process.
>
>I'm using Tomcat 7.0.30, with APR and Apache web server, on a Ubuntu 
>11.10 server with 32g of RAM / 8 CPUs.
>
>I've done several jstack stack traces when this occurs, and what I 
>consistently see, are the connector threads in the RUNNABLE state every
>
>time, i.e.:
>
>ajp-apr-8009-Acceptor-0" daemon prio=10 tid=0x00000000010a1000
>nid=0x539 
>runnable [0x00007f9364f8e000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.tomcat.jni.Socket.accept(Native Method)
>         at 
>org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
>         at java.lang.Thread.run(Thread.java:722)
>
>"http-apr-8443-Acceptor-0" daemon prio=10 tid=0x000000000109b800 
>nid=0x535 runnable [0x00007f9365510000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.tomcat.jni.Socket.accept(Native Method)
>         at 
>org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
>         at java.lang.Thread.run(Thread.java:722)
>
>"http-apr-8080-Acceptor-0" daemon prio=10 tid=0x00000000015ab000 
>nid=0x531 runnable [0x00007f9365a92000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.tomcat.jni.Socket.accept(Native Method)
>         at 
>org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1013)
>         at java.lang.Thread.run(Thread.java:722)
>
>Other threads are in RUNNABLE too in different cases, but these are the
>ones that are always there when the high CPU occurs. That's why I'm
>starting to think it has something to do with Tomcat.

Those threads look OK to me. As acceptor threads, that is what I would expect.

>Can anyone shed some light on this?

With the information you have provided? Very unlikely.

What you need to do is use ps to look at CPU usage per thread (not per process) and then match the offending thread ID to the thread ID in the thread dump.
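
For example, something along these lines on Linux should work; the PID (12345)
and the hot thread's TID (4321) below are placeholders, and note that jstack
reports native thread IDs as hexadecimal "nid" values:

  # Per-thread CPU usage, sorted hottest first (12345 = Tomcat's PID):
  ps -eLo pid,tid,pcpu,state,comm --sort=-pcpu | head -20
  # (or interactively: top -H -p 12345)

  # Convert the hot thread's decimal TID to hex to match jstack's nid field:
  printf 'nid=0x%x\n' 4321        # prints nid=0x10e1

  # Take a thread dump and pull out the matching thread:
  jstack 12345 > threads.txt
  grep -A 12 'nid=0x10e1' threads.txt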

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org