Posted to solr-user@lucene.apache.org by "Petersen, Robert" <ro...@buy.com> on 2012/12/19 01:53:32 UTC

occasional GC crashes

Hi solr user group,

Sorry if this isn't directly a Solr question.  Once in a blue moon the JVM crashes inside the GC on a server in our Solr 3.6.1 slave farm.  It only happens on a couple of the twelve slaves we have deployed, and only very rarely on those.  It doesn't seem to affect Solr directly: the logs show Solr still serving requests after the time of the exception, but our external monitoring tool reports the Solr service as down, so our operations department restarts Solr on that box and alerts me.  The Solr logs show nothing unusual; the exception only shows up in catalina.out.  Does this happen to anyone else?  Here is the basic error, and I have attached the crash dump file as well.   Our total uptime on these boxes is over a year now, BTW.
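
To cross-check what the monitor sees against what Solr itself reports, a quick ping from the box can help.  This is only a minimal sketch, assuming the stock ping handler from the example solrconfig.xml is enabled and that Tomcat listens on port 8080 (host, port, and context path are placeholders):

# Ask Solr directly whether it considers itself up (placeholder host/port).
curl -s "http://localhost:8080/solr/admin/ping?wt=json"
# Check that the servlet container still answers for the webapp at all.
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:8080/solr/admin/"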

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00002b5379346612, pid=13724, tid=1082353984
#
# JRE version: 6.0_25-b06
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode linux-amd64 )
# Problematic frame:
# V  [libjvm.so+0x3c4612]  Par_ConcMarkingClosure::trim_queue(unsigned long)+0x82
#
# An error report file with more information is saved as:
# /var/LucidWorks/lucidworks/hs_err_pid13724.log
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

VM Arguments:
jvm_args: -Djava.util.logging.config.file=/var/LucidWorks/lucidworks/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Xmx32768m -Xms32768m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=6060 -Djava.endorsed.dirs=/var/LucidWorks/lucidworks/tomcat/endorsed -Dcatalina.base=/var/LucidWorks/lucidworks/tomcat -Dcatalina.home=/var/LucidWorks/lucidworks/tomcat -Djava.io.tmpdir=/var/LucidWorks/lucidworks/tomcat/temp
java_command: org.apache.catalina.startup.Bootstrap -server -Dsolr.solr.home=lucidworks/solr start
Launcher Type: SUN_STANDARD

Stack: [0x0000000000000000,0x0000000000000000],  sp=0x0000000040835eb0,  free space=1056983k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x3c4612]  Par_ConcMarkingClosure::trim_queue(unsigned long)+0x82
V  [libjvm.so+0x3c481a]  CMSConcMarkingTask::do_work_steal(int)+0xfa
V  [libjvm.so+0x3c3dcf]  CMSConcMarkingTask::work(int)+0xef
V  [libjvm.so+0x8783dc]  YieldingFlexibleGangWorker::loop()+0xbc
V  [libjvm.so+0x8755b4]  GangWorker::run()+0x24
V  [libjvm.so+0x71096f]  java_start(Thread*)+0x13f

Heap
par new generation   total 345024K, used 180672K [0x00002aaaae120000, 0x00002aaac5780000, 0x00002aaac5780000)
  eden space 306688K,  53% used [0x00002aaaae120000, 0x00002aaab8243c28, 0x00002aaac0ca0000)
  from space 38336K,  40% used [0x00002aaac3210000, 0x00002aaac415c3f8, 0x00002aaac5780000)
  to   space 38336K,   0% used [0x00002aaac0ca0000, 0x00002aaac0ca0000, 0x00002aaac3210000)
concurrent mark-sweep generation total 33171072K, used 12144213K [0x00002aaac5780000, 0x00002ab2ae120000, 0x00002ab2ae120000)
concurrent-mark-sweep perm gen total 83968K, used 50650K [0x00002ab2ae120000, 0x00002ab2b3320000, 0x00002ab2b3320000)

Code Cache  [0x00002aaaab054000, 0x00002aaaab9a4000, 0x00002aaaae054000)
total_blobs=2800 nmethods=2273 adapters=480 free_code_cache=40752512 largest_free_block=15808
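
For what it's worth, having GC logging on and a fixed path for the HotSpot error file makes this kind of crash easier to chase afterwards.  A minimal sketch of extra JVM arguments for a Java 6 HotSpot VM follows; the log locations are placeholders, and whether to also drop -XX:+CMSIncrementalMode (incremental CMS was aimed at one- and two-CPU machines) is a separate call worth testing:

# Hypothetical additions to the existing jvm_args; paths are placeholders.
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-Xloggc:/var/LucidWorks/lucidworks/logs/gc.log
-XX:ErrorFile=/var/LucidWorks/lucidworks/hs_err_%p.log
-XX:+HeapDumpOnOutOfMemoryError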



Thanks,

Robert (Robi) Petersen
Senior Software Engineer
Search Department


RE: occasional GC crashes

Posted by "Petersen, Robert" <ro...@buy.com>.
Thanks Otis, will do.


RE: occasional GC crashes

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Robi,

Oh, that's a thing of the past; go for the latest Java 7 if they let you!

Otis
--
Performance Monitoring - http://sematext.com/spm

RE: occasional GC crashes

Posted by "Petersen, Robert" <ro...@buy.com>.
Hi Otis,

I thought Java 7 had a bug that Oracle wasn't addressing, which made it unsuitable for Solr.  Has that been fixed now?
http://searchhub.org/2011/07/28/dont-use-java-7-for-anything/

I did see this, but it doesn't really mention the bug:  http://opensearchnews.com/2012/04/announcing-java7-support-with-apache-solr-and-lucene/

Thanks
Robi


Re: occasional GC crashes

Posted by Otis Gospodnetic <ot...@gmail.com>.
Robert,

Step 1 is to get the latest Java 7 or, if you have to remain on 6, use the latest 6.

Otis
--
SOLR Performance Monitoring - http://sematext.com/spm
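
One quick sanity check before and after the upgrade is to confirm which JVM Tomcat actually starts with, against the 6.0_25-b06 shown in the crash header.  A small sketch, assuming Tomcat resolves its JVM from JAVA_HOME:

# JVM on the PATH vs. the one Tomcat is configured to use (JAVA_HOME is assumed).
java -version
"$JAVA_HOME/bin/java" -version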