You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@rocketmq.apache.org by "zhaoziyan (JIRA)" <ji...@apache.org> on 2017/12/19 11:55:00 UTC

[jira] [Updated] (ROCKETMQ-337) rocketmq 3.4.6 version Full GC

     [ https://issues.apache.org/jira/browse/ROCKETMQ-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhaoziyan updated ROCKETMQ-337:
-------------------------------
    Description: 
-server -Xms4g -Xmx4g -Xmn2g -XX:PermSize=128m -XX:MaxPermSize=320m -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:SurvivorRatio=8 -XX:+DisableExplicitGC -verbose:gc -Xloggc:/home/work/rmq_bk_gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps
Online MQ service have suffered Full GC . QPS(read 5000,write 1000)
Our MQ Version is 3.4.6 what the picture mean ?? 
{color:red}From *http://jm.taobao.org/2017/03/23/20170323/{color}
*


{color:red}线程池
Java应用里面总会有各式各样的线程池,运用线程池最需要考虑的两个因素便是:
线程池的个数,避免设置过多或过少的线程池数,过少会导致CPU资源利用率不够吞吐量低,过多的线程池会带来更多的同步原语、上下文切换、调度等方面的性能损失。
线程池的划分,需要根据具体的业务或者模块做详细的规划,线程池往往也起到了资源隔离的作用,RocketMQ中曾有一个重要模块和一个非重要模块共享一个线程池,在去年双十一的压测中,非重要模块因压力大占据了大部分的线程池资源,导致重要模块的业务发生饥饿,最终导致了无法恢复的密集FGC。{color}
2017-12-17T10:40:33.321+0800: 23498635.249: [GC [1 CMS-initial-mark: 2097150K(2097152K)] 3971690K(3984640K), 2.1143720 secs] [Times: user=2.12 sys=0.00, real=2.11 secs] 
2017-12-17T10:40:35.436+0800: 23498637.364: [CMS-concurrent-mark-start]
2017-12-17T10:40:35.676+0800: 23498637.604: [GC 23498637.604: [ParNew: 1887488K->1887488K(1887488K), 0.0000390 secs]23498637.605: [CMS2017-12-17T10:40:35.972+0800: 23498637.899: [CMS-concurrent-mark: 0.527/0.535 secs] [Times: user=4.62 sys=0.00, real=0.54 secs] 
2017-12-17T10:40:46.046+0800: 23498647.973: [GC [1 CMS-initial-mark: 2097150K(2097152K)] 3857647K(3984640K), 2.1016750 secs] [Times: user=2.10 sys=0.00, real=2.10 secs] 
2017-12-17T10:40:48.148+0800: 23498650.075: [CMS-concurrent-mark-start]
2017-12-17T10:40:48.739+0800: 23498650.667: [CMS-concurrent-mark: 0.591/0.591 secs] [Times: user=7.33 sys=0.00, real=0.59 secs] 
2017-12-17T10:40:48.739+0800: 23498650.667: [CMS-concurrent-preclean-start]
2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-preclean: 1.522/1.522 secs] [Times: user=10.23 sys=0.55, real=1.52 secs] 
2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-abortable-preclean-start]
2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2017-12-17T10:40:50.267+0800: 23498652.195: [GC[YG occupancy: 1869460 K (1887488 K)]23498652.195: [Rescan (parallel) , 0.8927400 secs]23498653.088: [weak refs processing, 0.0000200 secs]23498653.088: [class unloading, 0.0060500 secs]23498653.094: [scrub symbol & string tables, 0.0034370 secs] [1 CMS-remark: 2097150K(2097152K)] 3966610K(3984640K), 0.9055860 secs] [Times: user=19.33 sys=0.00, real=0.91 secs] 
2017-12-17T10:40:51.173+0800: 23498653.101: [CMS-concurrent-sweep-start]
2017-12-17T10:40:51.546+0800: 23498653.473: [Full GC 23498653.474: [CMS2017-12-17T10:40:51.829+0800: 23498653.757: [CMS-concurrent-sweep: 0.647/0.656 secs] [Times: user=2.85 sys=0.00, real=0.65 secs] 
2017-12-17T10:40:58.571+0800: 23498660.498: [GC [1 CMS-initial-mark: 2097151K(2097152K)] 3858845K(3984640K), 2.2749480 secs] [Times: user=2.27 sys=0.00, real=2.28 secs] 
2017-12-17T10:41:00.846+0800: 23498662.774: [CMS-concurrent-mark-start]
2017-12-17T10:41:01.384+0800: 23498663.312: [CMS-concurrent-mark: 0.538/0.538 secs] [Times: user=6.65 sys=0.00, real=0.53 secs] 
2017-12-17T10:41:01.384+0800: 23498663.312: [CMS-concurrent-preclean-start]
2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-preclean: 0.880/0.880 secs] [Times: user=6.27 sys=0.29, real=0.89 secs] 
2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-abortable-preclean-start]
2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2017-12-17T10:41:02.269+0800: 23498664.197: [GC[YG occupancy: 1843312 K (1887488 K)]23498664.197: [Rescan (parallel) , 1.2826410 secs]23498665.479: [weak refs processing, 0.0000200 secs]23498665.479: [class unloading, 0.0061050 secs]23498665.486: [scrub symbol & string tables, 0.0033500 secs] [1 CMS-remark: 2097151K(2097152K)] 3940464K(3984640K), 1.2953810 secs] [Times: user=27.66 sys=0.00, real=1.29 secs] 
2017-12-17T10:41:03.565+0800: 23498665.492: [CMS-concurrent-sweep-start]
2017-12-17T10:41:04.034+0800: 23498665.962: [CMS-concurrent-sweep: 0.470/0.470 secs] [Times: user=3.33 sys=0.00, real=0.46 secs]
!企业微信截图_1513682096359.png|thumbnail!
!企业微信截图_15136820099900.png|thumbnail!
*{color:red}*What thread pool cause the Full GC. please tell me .So I can fix the Problem **{color}

  was:
-server -Xms4g -Xmx4g -Xmn2g -XX:PermSize=128m -XX:MaxPermSize=320m -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:SurvivorRatio=8 -XX:+DisableExplicitGC -verbose:gc -Xloggc:/home/work/rmq_bk_gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps
Online MQ service have suffered Full GC . QPS(read 5000,write 1000)
Our MQ Version is 3.4.6 what the picture mean ?? 
{color:red}From *http://jm.taobao.org/2017/03/23/20170323/{color}
*


{color:red}线程池
Java应用里面总会有各式各样的线程池,运用线程池最需要考虑的两个因素便是:
线程池的个数,避免设置过多或过少的线程池数,过少会导致CPU资源利用率不够吞吐量低,过多的线程池会带来更多的同步原语、上下文切换、调度等方面的性能损失。
线程池的划分,需要根据具体的业务或者模块做详细的规划,线程池往往也起到了资源隔离的作用,RocketMQ中曾有一个重要模块和一个非重要模块共享一个线程池,在去年双十一的压测中,非重要模块因压力大占据了大部分的线程池资源,导致重要模块的业务发生饥饿,最终导致了无法恢复的密集FGC。{color}
2017-12-17T10:40:33.321+0800: 23498635.249: [GC [1 CMS-initial-mark: 2097150K(2097152K)] 3971690K(3984640K), 2.1143720 secs] [Times: user=2.12 sys=0.00, real=2.11 secs] 
2017-12-17T10:40:35.436+0800: 23498637.364: [CMS-concurrent-mark-start]
2017-12-17T10:40:35.676+0800: 23498637.604: [GC 23498637.604: [ParNew: 1887488K->1887488K(1887488K), 0.0000390 secs]23498637.605: [CMS2017-12-17T10:40:35.972+0800: 23498637.899: [CMS-concurrent-mark: 0.527/0.535 secs] [Times: user=4.62 sys=0.00, real=0.54 secs] 
2017-12-17T10:40:46.046+0800: 23498647.973: [GC [1 CMS-initial-mark: 2097150K(2097152K)] 3857647K(3984640K), 2.1016750 secs] [Times: user=2.10 sys=0.00, real=2.10 secs] 
2017-12-17T10:40:48.148+0800: 23498650.075: [CMS-concurrent-mark-start]
2017-12-17T10:40:48.739+0800: 23498650.667: [CMS-concurrent-mark: 0.591/0.591 secs] [Times: user=7.33 sys=0.00, real=0.59 secs] 
2017-12-17T10:40:48.739+0800: 23498650.667: [CMS-concurrent-preclean-start]
2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-preclean: 1.522/1.522 secs] [Times: user=10.23 sys=0.55, real=1.52 secs] 
2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-abortable-preclean-start]
2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2017-12-17T10:40:50.267+0800: 23498652.195: [GC[YG occupancy: 1869460 K (1887488 K)]23498652.195: [Rescan (parallel) , 0.8927400 secs]23498653.088: [weak refs processing, 0.0000200 secs]23498653.088: [class unloading, 0.0060500 secs]23498653.094: [scrub symbol & string tables, 0.0034370 secs] [1 CMS-remark: 2097150K(2097152K)] 3966610K(3984640K), 0.9055860 secs] [Times: user=19.33 sys=0.00, real=0.91 secs] 
2017-12-17T10:40:51.173+0800: 23498653.101: [CMS-concurrent-sweep-start]
2017-12-17T10:40:51.546+0800: 23498653.473: [Full GC 23498653.474: [CMS2017-12-17T10:40:51.829+0800: 23498653.757: [CMS-concurrent-sweep: 0.647/0.656 secs] [Times: user=2.85 sys=0.00, real=0.65 secs] 
2017-12-17T10:40:58.571+0800: 23498660.498: [GC [1 CMS-initial-mark: 2097151K(2097152K)] 3858845K(3984640K), 2.2749480 secs] [Times: user=2.27 sys=0.00, real=2.28 secs] 
2017-12-17T10:41:00.846+0800: 23498662.774: [CMS-concurrent-mark-start]
2017-12-17T10:41:01.384+0800: 23498663.312: [CMS-concurrent-mark: 0.538/0.538 secs] [Times: user=6.65 sys=0.00, real=0.53 secs] 
2017-12-17T10:41:01.384+0800: 23498663.312: [CMS-concurrent-preclean-start]
2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-preclean: 0.880/0.880 secs] [Times: user=6.27 sys=0.29, real=0.89 secs] 
2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-abortable-preclean-start]
2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
* 2017-12-17T10:41:02.269+0800: 23498664.197: [GC[YG occupancy: 1843312 K (1887488 K)]23498664.197: [Rescan (parallel) , 1.2826410 secs]23498665.479: [weak refs processing, 0.0000200 secs]23498665.479: [class unloading, 0.0061050 secs]23498665.486: [scrub symbol & string tables, 0.0033500 secs] [1 CMS-remark: 2097151K(2097152K)] 3940464K(3984640K), 1.2953810 secs] [Times: user=27.66 sys=0.00, real=1.29 secs] 
2017-12-17T10:41:03.565+0800: 23498665.492: [CMS-concurrent-sweep-start]
2017-12-17T10:41:04.034+0800: 23498665.962: [CMS-concurrent-sweep: 0.470/0.470 secs] [Times: user=3.33 sys=0.00, real=0.46 secs]
!企业微信截图_1513682096359.png|thumbnail!
!企业微信截图_15136820099900.png|thumbnail!
*{color:red}*What thread pool cause the Full GC. please tell me .So I can fix the Problem **{color}


> rocketmq 3.4.6 version Full GC 
> -------------------------------
>
>                 Key: ROCKETMQ-337
>                 URL: https://issues.apache.org/jira/browse/ROCKETMQ-337
>             Project: Apache RocketMQ
>          Issue Type: Wish
>          Components: rocketmq-broker
>         Environment: linux jdk 1.6 CMS GC
>            Reporter: zhaoziyan
>            Assignee: yukon
>         Attachments: 企业微信截图_15136820099900.png, 企业微信截图_1513682096359.png
>
>
> -server -Xms4g -Xmx4g -Xmn2g -XX:PermSize=128m -XX:MaxPermSize=320m -XX:+UseConcMarkSweepGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:SurvivorRatio=8 -XX:+DisableExplicitGC -verbose:gc -Xloggc:/home/work/rmq_bk_gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> Online MQ service have suffered Full GC . QPS(read 5000,write 1000)
> Our MQ Version is 3.4.6 what the picture mean ?? 
> {color:red}From *http://jm.taobao.org/2017/03/23/20170323/{color}
> *
> {color:red}线程池
> Java应用里面总会有各式各样的线程池,运用线程池最需要考虑的两个因素便是:
> 线程池的个数,避免设置过多或过少的线程池数,过少会导致CPU资源利用率不够吞吐量低,过多的线程池会带来更多的同步原语、上下文切换、调度等方面的性能损失。
> 线程池的划分,需要根据具体的业务或者模块做详细的规划,线程池往往也起到了资源隔离的作用,RocketMQ中曾有一个重要模块和一个非重要模块共享一个线程池,在去年双十一的压测中,非重要模块因压力大占据了大部分的线程池资源,导致重要模块的业务发生饥饿,最终导致了无法恢复的密集FGC。{color}
> 2017-12-17T10:40:33.321+0800: 23498635.249: [GC [1 CMS-initial-mark: 2097150K(2097152K)] 3971690K(3984640K), 2.1143720 secs] [Times: user=2.12 sys=0.00, real=2.11 secs] 
> 2017-12-17T10:40:35.436+0800: 23498637.364: [CMS-concurrent-mark-start]
> 2017-12-17T10:40:35.676+0800: 23498637.604: [GC 23498637.604: [ParNew: 1887488K->1887488K(1887488K), 0.0000390 secs]23498637.605: [CMS2017-12-17T10:40:35.972+0800: 23498637.899: [CMS-concurrent-mark: 0.527/0.535 secs] [Times: user=4.62 sys=0.00, real=0.54 secs] 
> 2017-12-17T10:40:46.046+0800: 23498647.973: [GC [1 CMS-initial-mark: 2097150K(2097152K)] 3857647K(3984640K), 2.1016750 secs] [Times: user=2.10 sys=0.00, real=2.10 secs] 
> 2017-12-17T10:40:48.148+0800: 23498650.075: [CMS-concurrent-mark-start]
> 2017-12-17T10:40:48.739+0800: 23498650.667: [CMS-concurrent-mark: 0.591/0.591 secs] [Times: user=7.33 sys=0.00, real=0.59 secs] 
> 2017-12-17T10:40:48.739+0800: 23498650.667: [CMS-concurrent-preclean-start]
> 2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-preclean: 1.522/1.522 secs] [Times: user=10.23 sys=0.55, real=1.52 secs] 
> 2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-abortable-preclean-start]
> 2017-12-17T10:40:50.261+0800: 23498652.189: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
> 2017-12-17T10:40:50.267+0800: 23498652.195: [GC[YG occupancy: 1869460 K (1887488 K)]23498652.195: [Rescan (parallel) , 0.8927400 secs]23498653.088: [weak refs processing, 0.0000200 secs]23498653.088: [class unloading, 0.0060500 secs]23498653.094: [scrub symbol & string tables, 0.0034370 secs] [1 CMS-remark: 2097150K(2097152K)] 3966610K(3984640K), 0.9055860 secs] [Times: user=19.33 sys=0.00, real=0.91 secs] 
> 2017-12-17T10:40:51.173+0800: 23498653.101: [CMS-concurrent-sweep-start]
> 2017-12-17T10:40:51.546+0800: 23498653.473: [Full GC 23498653.474: [CMS2017-12-17T10:40:51.829+0800: 23498653.757: [CMS-concurrent-sweep: 0.647/0.656 secs] [Times: user=2.85 sys=0.00, real=0.65 secs] 
> 2017-12-17T10:40:58.571+0800: 23498660.498: [GC [1 CMS-initial-mark: 2097151K(2097152K)] 3858845K(3984640K), 2.2749480 secs] [Times: user=2.27 sys=0.00, real=2.28 secs] 
> 2017-12-17T10:41:00.846+0800: 23498662.774: [CMS-concurrent-mark-start]
> 2017-12-17T10:41:01.384+0800: 23498663.312: [CMS-concurrent-mark: 0.538/0.538 secs] [Times: user=6.65 sys=0.00, real=0.53 secs] 
> 2017-12-17T10:41:01.384+0800: 23498663.312: [CMS-concurrent-preclean-start]
> 2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-preclean: 0.880/0.880 secs] [Times: user=6.27 sys=0.29, real=0.89 secs] 
> 2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-abortable-preclean-start]
> 2017-12-17T10:41:02.265+0800: 23498664.192: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
> 2017-12-17T10:41:02.269+0800: 23498664.197: [GC[YG occupancy: 1843312 K (1887488 K)]23498664.197: [Rescan (parallel) , 1.2826410 secs]23498665.479: [weak refs processing, 0.0000200 secs]23498665.479: [class unloading, 0.0061050 secs]23498665.486: [scrub symbol & string tables, 0.0033500 secs] [1 CMS-remark: 2097151K(2097152K)] 3940464K(3984640K), 1.2953810 secs] [Times: user=27.66 sys=0.00, real=1.29 secs] 
> 2017-12-17T10:41:03.565+0800: 23498665.492: [CMS-concurrent-sweep-start]
> 2017-12-17T10:41:04.034+0800: 23498665.962: [CMS-concurrent-sweep: 0.470/0.470 secs] [Times: user=3.33 sys=0.00, real=0.46 secs]
> !企业微信截图_1513682096359.png|thumbnail!
> !企业微信截图_15136820099900.png|thumbnail!
> *{color:red}*What thread pool cause the Full GC. please tell me .So I can fix the Problem **{color}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)