Posted to solr-user@lucene.apache.org by YouPeng Yang <yy...@gmail.com> on 2014/09/12 15:36:03 UTC

Fatal full GC

Hi

 We built our SolrCloud cluster on Solr 4.6.0 and JDK 1.7.0_60; the
cluster holds 360GB * 3 of data (one core with two replicas).
  The cluster has become unstable: occasionally a long full GC occurs,
and the pause lasts long enough that SolrCloud considers the node down.
  Normally a full GC happens when the old generation reaches 70%, and
that is fine. In the bad case, however, the old-generation occupancy
climbs well above 70%, up to 99%; then the long full GC happens and the
node is considered down.
   We set the JVM parameters following
https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning; the only difference
is that we changed -Xms48009m -Xmx48009m to -Xms49152M -Xmx81920M.
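
For concreteness, here is a sketch of the resulting startup line. Only
the -Xms/-Xmx values are confirmed by this thread; the CMS flags shown
are typical of the settings that wiki page describes (an initiating
occupancy of 70 would also match the observation above that
old-generation collection normally starts at 70%), but they are
assumptions, not a verbatim copy of the page:

    # Hypothetical Solr 4.x start line; only the heap flags come from
    # this thread, the CMS flags are assumed.
    java -Xms49152M -Xmx81920M \
         -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:CMSInitiatingOccupancyFraction=70 \
         -XX:+UseCMSInitiatingOccupancyOnly \
         -jar start.jar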
  The appendix [1] is the output of jstat from when the bad full GC
happened. I had marked the important part in red, hoping that would be
helpful.
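
The column set in that appendix (S0 S1 E O P YGC YGCT FGC FGCT GCT)
matches what "jstat -gcutil" prints on JDK 7, so the samples were
presumably collected with something like the line below; the pid and
the sample interval are placeholders, since neither is given in the
thread:

    # -gcutil reports each space as a percentage of its capacity:
    #   S0/S1 = survivor spaces, E = eden, O = old generation,
    #   P = permanent generation, YGC/YGCT = young GC count/time,
    #   FGC/FGCT = full GC count/time, GCT = total GC time (seconds).
    jstat -gcutil <solr-pid> 5s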
   By the way, I noticed that the Eden space of the young generation
stays at 100% the whole time the bad condition lasts, which I think is
an important indication.
  SolrCloud will be a very important part of the infrastructure
supporting our applications.
  Could you please give me any suggestions? Do I need to change the JDK
version?


Any suggestions will be appreciated.

Best Regards


[1]------------------------------------------------------------------------------------------------------
   S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 ..omitted..
 33.27  84.37 100.00  70.14  59.94  28070 4396.141    14    3.724 4399.864
100.00   0.00  30.96  70.38  59.94  28077 4397.877    14    3.724 4401.601
  0.00  72.06   0.00  70.66  59.94  28083 4399.554    14    3.724 4403.277
 59.50  49.30 100.00  70.88  59.94  28091 4401.101    14    3.724 4404.825
 76.98 100.00 100.00  71.07  59.94  28098 4402.707    14    3.724 4406.431
100.00  84.59 100.00  71.41  59.94  28105 4404.526    14    3.724 4408.250
100.00  89.60 100.00  71.77  59.94  28111 4406.216    14    3.724 4409.939
100.00 100.00  99.92  72.16  59.94  28116 4407.609    14    3.724 4411.333
100.00 100.00 100.00  72.68  59.94  28120 4409.041    14    3.724 4412.764
100.00 100.00 100.00  73.02  59.94  28126 4410.666    14    3.724 4414.390
 92.06 100.00 100.00  73.37  59.94  28132 4412.389    14    3.724 4416.113
 68.89 100.00 100.00  73.74  59.94  28138 4414.004    14    3.724 4417.728
100.00 100.00 100.00  73.99  59.94  28144 4415.555    14    3.724 4419.278
100.00  56.44 100.00  74.31  59.94  28151 4417.311    14    3.724 4421.034
 65.78  25.37 100.00  74.57  59.94  28159 4419.051    14    3.724 4422.774
 62.41  43.09 100.00  74.76  59.94  28167 4420.740    14    3.724 4424.464
 36.14  15.59 100.00  74.97  59.94  28175 4422.353    14    3.724 4426.077
 91.86  37.75 100.00  75.09  59.94  28183 4423.976    14    3.724 4427.700
 87.88 100.00 100.00  75.30  59.94  28190 4425.713    14    3.724 4429.437
 88.91 100.00 100.00  75.63  59.94  28196 4427.293    14    3.724 4431.017
100.00 100.00 100.00  76.01  59.94  28202 4428.816    14    3.724 4432.539
  0.00 100.00  97.08  76.28  59.94  28208 4430.504    14    3.724 4434.228
 63.42  45.06 100.00  76.57  59.94  28215 4432.018    14    3.724 4435.742
 52.26  35.19 100.00  76.73  59.94  28223 4433.644    14    3.724 4437.367
100.00   0.00  75.24  76.88  59.94  28230 4435.231    14    3.724 4438.955
100.00 100.00 100.00  77.27  59.94  28235 4436.334    14    3.724 4440.057
 87.09 100.00 100.00  77.63  59.94  28242 4438.118    14    3.724 4441.842
 92.06 100.00 100.00  95.77  59.94  28248 4439.763    14    3.724 4443.487
  0.00 100.00  37.93  78.65  59.94  28253 4441.483    14    3.724 4445.207
 68.38  81.73 100.00  79.04  59.94  28260 4442.971    14    3.724 4446.695
100.00 100.00 100.00  79.24  59.94  28267 4444.706    14    3.724 4448.429
 95.40   0.00   0.00  79.56  59.94  28274 4446.608    14    3.724 4450.332
 53.60   0.00 100.00  79.82  59.94  28283 4448.213    14    3.724 4451.937
100.00  89.81 100.00  80.01  59.94  28291 4449.759    14    3.724 4453.483
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 88.21 100.00 100.00  80.38  59.94  28298 4451.466    14    3.724 4455.190
 ..omitted..

Re: Fatal full GC

Posted by YouPeng Yang <yy...@gmail.com>.
Hi

  Thank you very much. We have made the change to lower the heap size,
and we are watching the effect of this change; we will let you know the
result.
  It is really helpful.

Best Regards

2014-09-12 23:00 GMT+08:00 Walter Underwood <wu...@wunderwood.org>:

> I agree about the 80GB heap as a possible problem.
> [...]

Re: Fatal full GC

Posted by Walter Underwood <wu...@wunderwood.org>.
I agree about the 80GB heap as a possible problem.

A GC is essentially a linear scan of memory. More memory means a longer scan.

We run with an 8GB heap; I'd try that. Test it by replaying logs from production against a test instance. You can use JMeter and the Apache access log sampler.

https://jmeter.apache.org/usermanual/jmeter_accesslog_sampler_step_by_step.pdf
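
Once a test plan using the access log sampler exists, the replay itself
can run headless. A minimal sketch; the .jmx and .jtl file names are
placeholders:

    # -n = non-GUI mode, -t = test plan, -l = results log
    jmeter -n -t solr-replay.jmx -l replay-results.jtl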

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/


On Sep 12, 2014, at 7:10 AM, Shawn Heisey <so...@elyograg.org> wrote:

> [...]


Re: Fatal full GC

Posted by Shawn Heisey <so...@elyograg.org>.
On 9/12/2014 7:36 AM, YouPeng Yang wrote:
> [...]

My GC parameter page is getting around. :)

Do you really need an 80GB heap?  I realize that your index is 360GB ...
but even if your current setup demands a heap that large, you may be
able to adjust your configuration so that Solr uses a lot less heap
memory.

The red font you mentioned did not make it through, so I cannot tell
what lines you highlighted.

I pulled your jstat output into a spreadsheet and calculated the length
of each GC.  The longest GC in there took 1.903 seconds.  It's the one
that had a GCT of 4450.332.  For an 80GB heap, you couldn't hope for
anything better.  Based on what I see here, I don't think GC is your
problem.  If I read the other numbers on that 1.903 second GC line
correctly (not sure that I am), it dropped your Eden size from 100% to
0% ... suggesting that you really don't need an 80GB heap.
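
That per-interval calculation is easy to reproduce without a
spreadsheet: GCT is the tenth column of -gcutil output, so the GC time
accrued between consecutive samples is just that column's delta. A
sketch, assuming the header and "..omitted.." lines are stripped and
the samples are saved to a file (jstat.log is a placeholder name):

    # Print the GCT delta for each pair of consecutive samples.
    awk 'NR > 1 { printf "%.3f\n", $10 - prev } { prev = $10 }' jstat.log

Run over the appendix above, the largest delta is indeed 1.903
(4450.332 - 4448.429).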

How much RAM does this machine have?  For ideal performance, you'll need
your index size plus your heap size, which for you right now is 440 GB. 
Normally you don't need the ideal memory size ... but you do need a
*significant* portion of it.  I don't think I'd try running this index
with less than 256GB of RAM, and that's assuming a much lower heap size
than 80GB.
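
Since the question turns on how much of the 360GB index the OS page
cache can hold, a quick way to see what a node actually has (assuming
Linux here):

    # Memory totals in GB; the "cached" figure is the page cache that
    # keeps hot portions of the index in RAM.
    free -g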

Here's some general info about performance problems and possible solutions:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn



Re: Fatal full GC

Posted by rulinma <ru...@gmail.com>.
mark


