Posted to dev@eagle.apache.org by "Libin, Sun (JIRA)" <ji...@apache.org> on 2015/12/30 05:06:49 UTC

[jira] [Comment Edited] (EAGLE-97) Enable GC Log monitoring for important service like hadoop namenode

    [ https://issues.apache.org/jira/browse/EAGLE-97?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074608#comment-15074608 ] 

Libin, Sun edited comment on EAGLE-97 at 12/30/15 4:05 AM:
-----------------------------------------------------------

In GC log monitoring, the key information is the stop-the-world event.

The following are sample log lines from ParNew GC, CMS GC, and G1 GC:

{code:title=ParNew GC sample log}
2014-06-04T22:21:19.158-0700: 9.952: [GC 9.952: [ParNew: 2365777K->5223K(2831168K), 0.0155080 secs] 2365777K->5223K(100348736K), 0.0156030 secs] [Times: user=0.08 sys=0.05, real=0.02 secs]
{code}
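
As a rough illustration of pulling the pause and heap numbers out of the ParNew line above, here is a minimal Python sketch. The regular expression is an assumption fitted to this one sample line, the function name `parse_parnew` is hypothetical, and treating the after-collection value as the "used" figure is also an assumption. The key names follow the stream definition in this comment.

```python
import re

# Assumed layout, fitted to the single ParNew sample line above.
# Other JVM versions or logging flags may print a different layout.
PARNEW_RE = re.compile(
    r"\[ParNew: (?P<young_before>\d+)K->(?P<young_after>\d+)K"
    r"\((?P<young_total>\d+)K\), [\d.]+ secs\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K"
    r"\((?P<heap_total>\d+)K\), (?P<pause>[\d.]+) secs\]"
)


def parse_parnew(line):
    """Return a dict of stream fields for a ParNew line, or None if no match."""
    m = PARNEW_RE.search(line)
    if m is None:
        return None
    return {
        "eventType": "ParNew",  # hypothetical label for this event type
        "pausedGCTimeSec": float(m.group("pause")),
        "youngAreaGCed": True,
        # Assumption: "used" means occupancy after the collection.
        "youngUsedHeapK": int(m.group("young_after")),
        "youngTotalHeapK": int(m.group("young_total")),
        "totalHeapUsageAvailable": True,
        "usedTotalHeapK": int(m.group("heap_after")),
        "totalHeapK": int(m.group("heap_total")),
        "logLine": line,
    }
```

Lines that do not match return `None`, so a caller can try the next parser instead of failing.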

{code:title=CMS GC sample log}
2014-06-04T22:47:31.218-0700: 1582.012: [GC [1 CMS-initial-mark: 78942227K(97517568K)] 79264643K(100348736K), 0.2334170 secs] [Times: user=0.23 sys=0.00, real=0.24 secs]

2014-06-04T22:49:50.603-0700: 1721.397: [GC[YG occupancy: 2777944 K (2831168 K)]1721.398: [Rescan (parallel) , 0.1706730 secs]1721.568: [weak refs processing, 0.0156130 secs] [1 CMS-remark: 83730081K(97517568K)] 86508026K(100348736K), 0.1868130 secs] [Times: user=3.04 sys=0.01, real=0.18 secs]
{code}
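
The two CMS lines above are the initial-mark and remark phases, which are the stop-the-world phases of a CMS cycle. A minimal Python sketch for extracting the tenured and total heap figures from them could look like the following. The pattern is an assumption fitted to these two samples only, and `parse_cms_pause` is a hypothetical name.

```python
import re

# Assumed layout, fitted to the two CMS sample lines above.
CMS_PAUSE_RE = re.compile(
    r"\[1 (?P<phase>CMS-initial-mark|CMS-remark): "
    r"(?P<tenured_used>\d+)K\((?P<tenured_total>\d+)K\)\] "
    r"(?P<heap_used>\d+)K\((?P<heap_total>\d+)K\), (?P<pause>[\d.]+) secs\]"
)


def parse_cms_pause(line):
    """Return a dict of stream fields for a CMS pause line, or None if no match."""
    m = CMS_PAUSE_RE.search(line)
    if m is None:
        return None
    return {
        "eventType": m.group("phase"),  # "CMS-initial-mark" or "CMS-remark"
        "pausedGCTimeSec": float(m.group("pause")),
        "tenuredUsedHeapK": int(m.group("tenured_used")),
        "tenuredTotalHeapK": int(m.group("tenured_total")),
        "totalHeapUsageAvailable": True,
        "usedTotalHeapK": int(m.group("heap_used")),
        "totalHeapK": int(m.group("heap_total")),
        "logLine": line,
    }
```

The sketch leaves `tenuredAreaGCed` unset, because these lines report tenured occupancy but whether that counts as the area being "GCed" is a design decision for the stream.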

{code:title=G1 GC sample log}
0.522: [GC pause (young), 0.15877971 secs]
1.730: [GC pause (mixed), 0.32714353 secs]
[Eden: 12M(12M)->0B(13M) Survivors: 0B->2048K Heap: 14M(64M)->9739K(64M)]
{code}
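
The G1 sample mixes units (B, K, M), so a parser has to normalize sizes before filling the `...HeapK` fields. The Python sketch below shows one way to do that, assuming 1024-based units and the layout of the sample lines above. The names `parse_g1_pause`, `parse_g1_heap`, and `to_k` are hypothetical.

```python
import re

# Assumed layouts, fitted to the G1 sample lines above.
G1_PAUSE_RE = re.compile(
    r"\[GC pause \((?P<kind>\w+)\), (?P<pause>[\d.]+) secs\]"
)
G1_HEAP_RE = re.compile(
    r"Heap: (?P<ub>[\d.]+)(?P<ubu>[BKMG])\((?P<tb>[\d.]+)(?P<tbu>[BKMG])\)"
    r"->(?P<ua>[\d.]+)(?P<uau>[BKMG])\((?P<ta>[\d.]+)(?P<tau>[BKMG])\)"
)

# Assumption: units are powers of 1024.
K_PER_UNIT = {"B": 1 / 1024, "K": 1, "M": 1024, "G": 1024 * 1024}


def to_k(value, unit):
    """Convert a size such as ("14", "M") to whole kilobytes."""
    return int(float(value) * K_PER_UNIT[unit])


def parse_g1_pause(line):
    """Return the pause kind and duration for a G1 pause line, or None."""
    m = G1_PAUSE_RE.search(line)
    if m is None:
        return None
    return {
        "eventType": m.group("kind"),  # "young" or "mixed" in the samples
        "pausedGCTimeSec": float(m.group("pause")),
    }


def parse_g1_heap(line):
    """Return heap usage in K before and after the collection, or None."""
    m = G1_HEAP_RE.search(line)
    if m is None:
        return None
    return {
        "usedBeforeK": to_k(m.group("ub"), m.group("ubu")),
        "totalBeforeK": to_k(m.group("tb"), m.group("tbu")),
        "usedAfterK": to_k(m.group("ua"), m.group("uau")),
        "totalAfterK": to_k(m.group("ta"), m.group("tau")),
    }
```

Because the pause line and the heap summary sit on separate lines in the G1 output, a real parser would need to carry state between lines to join them into one event; that step is left out here.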

From the log, we can extract the key information into the following stream definition for the alert engine/metric generator to process:

{code:title=gc log stream definition}
timestamp                long
eventType                string
pausedGCTimeSec          double
youngAreaGCed            boolean
youngUsedHeapK           long
youngTotalHeapK          long
tenuredAreaGCed          boolean
tenuredUsedHeapK         long
tenuredTotalHeapK        long
permAreaGCed             boolean
permUsedHeapK            long
permTotalHeapK           long
totalHeapUsageAvailable  boolean
usedTotalHeapK           long
totalHeapK               long
logLine                  string
{code}
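
To make the stream definition concrete, here is a minimal Python sketch of one event record plus a pause check of the kind an alert rule might apply. Everything beyond the field names is an assumption: the class name `GcLogEvent`, the default values, reading `timestamp` as epoch milliseconds, and the helper `exceeds_pause_threshold` with its threshold are all hypothetical. The example values come from the ParNew sample line.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(stamp):
    """Convert a GC log date stamp such as 2014-06-04T22:21:19.158-0700."""
    dt = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")
    # Integer timedelta division avoids float rounding of the milliseconds.
    return (dt - EPOCH) // timedelta(milliseconds=1)


@dataclass
class GcLogEvent:
    # Field names and order follow the stream definition above.
    timestamp: int  # assumption: epoch milliseconds
    eventType: str
    pausedGCTimeSec: float
    youngAreaGCed: bool = False
    youngUsedHeapK: int = 0
    youngTotalHeapK: int = 0
    tenuredAreaGCed: bool = False
    tenuredUsedHeapK: int = 0
    tenuredTotalHeapK: int = 0
    permAreaGCed: bool = False
    permUsedHeapK: int = 0
    permTotalHeapK: int = 0
    totalHeapUsageAvailable: bool = False
    usedTotalHeapK: int = 0
    totalHeapK: int = 0
    logLine: str = ""


def exceeds_pause_threshold(event, threshold_sec):
    """Hypothetical alert condition: the stop-the-world pause was too long."""
    return event.pausedGCTimeSec > threshold_sec


# Event filled in by hand from the ParNew sample line.
event = GcLogEvent(
    timestamp=to_epoch_millis("2014-06-04T22:21:19.158-0700"),
    eventType="ParNew",  # hypothetical label
    pausedGCTimeSec=0.0156030,
    youngAreaGCed=True,
    youngUsedHeapK=5223,
    youngTotalHeapK=2831168,
    totalHeapUsageAvailable=True,
    usedTotalHeapK=5223,
    totalHeapK=100348736,
)
```

The boolean `...AreaGCed` and `totalHeapUsageAvailable` flags let a consumer tell "not reported on this line" apart from a genuine zero, which is why the numeric fields can safely default to 0 in this sketch.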


> Enable GC Log monitoring for important service like hadoop namenode
> -------------------------------------------------------------------
>
>                 Key: EAGLE-97
>                 URL: https://issues.apache.org/jira/browse/EAGLE-97
>             Project: Eagle
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Libin, Sun
>            Assignee: Libin, Sun
>
> Garbage collection (GC) monitoring refers to the process of figuring out how the JVM is running GC. 
> When a GC happens, the JVM stops the application in order to execute it: every thread except those needed for the GC stops its tasks. The interrupted tasks resume only after the GC has completed. This stop interval is known as "stop-the-world".
> For a service like the namenode, GC affects performance, especially full GC. We should avoid full GC, and if a full GC does happen, we should detect it ASAP and send out an alert.


