Posted to solr-user@lucene.apache.org by Nick Vasilyev <ni...@gmail.com> on 2016/04/28 17:29:33 UTC

Solr 5.2.1 on Java 8 GC

Hello,

We recently upgraded to Solr 5.2.1 with jre1.8.0_74 and are seeing long GC
pauses when running jobs that do some hairy faceting. The same jobs worked
fine with our previous 4.6 Solr.

The JVM is configured with a 32GB heap and started with the default GC
settings; however, I've been tweaking the GC settings to no avail. The
latest attempt had the following differences from the default config:

-XX:ConcGCThreads and -XX:ParallelGCThreads increased from 4 to 7

-XX:CMSInitiatingOccupancyFraction increased from 50 to 70


Here is a sample output from the gc_log

2016-04-28T04:36:47.240-0400: 27905.535: Total time for which application
threads were stopped: 0.1667520 seconds, Stopping threads took: 0.0171900
seconds
{Heap before GC invocations=2051 (full 59):
 par new generation   total 6990528K, used 2626705K [0x00002b16c0000000,
0x00002b18c0000000, 0x00002b18c0000000)
  eden space 5592448K,  44% used [0x00002b16c0000000, 0x00002b17571b9948,
0x00002b1815560000)
  from space 1398080K,  10% used [0x00002b1815560000, 0x00002b181e8cac28,
0x00002b186aab0000)
  to   space 1398080K,   0% used [0x00002b186aab0000, 0x00002b186aab0000,
0x00002b18c0000000)
 concurrent mark-sweep generation total 25165824K, used 25122205K
[0x00002b18c0000000, 0x00002b1ec0000000, 0x00002b1ec0000000)
 Metaspace       used 41840K, capacity 42284K, committed 42680K, reserved
43008K
2016-04-28T04:36:49.828-0400: 27908.123: [GC (Allocation Failure)
2016-04-28T04:36:49.828-0400: 27908.124: [CMS2016-04-28T04:36:49.912-0400:
27908.207: [CMS-concurrent-abortable-preclean: 5.615/5.862 secs]
[Times: user=17.70 sys=2.77, real=5.86 secs]
 (concurrent mode failure): 25122205K->15103706K(25165824K), 8.5567560 secs]
27748910K->15103706K(32156352K), [Metaspace: 41840K->41840K(43008K)],
8.5657830 secs] [Times: user=8.56 sys=0.01, real=8.57 secs]
Heap after GC invocations=2052 (full 60):
 par new generation   total 6990528K, used 0K [0x00002b16c0000000,
0x00002b18c0000000, 0x00002b18c0000000)
  eden space 5592448K,   0% used [0x00002b16c0000000, 0x00002b16c0000000,
0x00002b1815560000)
  from space 1398080K,   0% used [0x00002b1815560000, 0x00002b1815560000,
0x00002b186aab0000)
  to   space 1398080K,   0% used [0x00002b186aab0000, 0x00002b186aab0000,
0x00002b18c0000000)
 concurrent mark-sweep generation total 25165824K, used 15103706K
[0x00002b18c0000000, 0x00002b1ec0000000, 0x00002b1ec0000000)
 Metaspace       used 41840K, capacity 42284K, committed 42680K, reserved
43008K
}
2016-04-28T04:36:58.395-0400: 27916.690: Total time for which application
threads were stopped: 8.5676090 seconds, Stopping threads took: 0.0003930
seconds

I read the instructions here, https://wiki.apache.org/solr/ShawnHeisey, but
they seem to be specific to Java 7. Are there any updated recommendations
for Java 8?

Re: Solr 5.2.1 on Java 8 GC

Posted by Shawn Heisey <ap...@elyograg.org>.
On 4/28/2016 9:43 AM, Nick Vasilyev wrote:
> I forgot to mention that the index is approximately 50 million docs split
> across 4 shards (replication factor 2) on 2 solr replicas.

Later in the thread, Jeff Wartes mentioned my wiki page for GC tuning.

https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr

Oracle seems to be putting most of their recent work into the G1
collector.  They want to make it the default collector for Java 9.  I
don't know if this is going to happen, or if they will delay that change
until Java 10 ... but I think that the default *is* going to eventually
be G1.

Lucene strongly recommends against EVER using G1, but I have not noticed
any problems with it on Solr.  I have not seen any specific *reason*
that G1 is not recommended, at least with a 64-bit JVM.  I can only
remember seeing issues in Jira with G1 when the JVM was 32-bit -- which
has its own limitations, so I don't recommend it anyway.  If somebody
can point me to specific *OPEN* issues showing current problems with G1
on a 64-bit JVM, I will have an easier time believing the Lucene
assertion that it's a bad idea.  I have searched Jira and can't find
anything relevant.

The best GC results I've seen in testing Solr have been with the G1
collector.  I haven't done any testing for quite a while, and almost all
of the testing that I've done has been with 4.x versions, not 5.x.

Out of the box, Solr 5.0 and later uses GC tuning with the CMS collector
that looks a lot like the CMS config that I came up with.  This works
pretty well, but if the heap size gets big enough, especially 32GB or
larger, I suspect that it will start to have problems with GC performance.

You could give the settings I listed under "Current experiments" a try.
You would do this by editing your solr.in.* script to comment out the
current GC tuning parameters and substitute the new set. This is a G1
config, and as I already mentioned, Lucene recommends NEVER using G1.

For your specific situation, I think you should try setting the max heap
to 31GB instead of 32GB, so the pointer sizes are cut in half, without
making a huge difference in the total amount of memory available.
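
For illustration, swapping in a G1 config through the solr.in.* script
might look something like the sketch below. This is NOT the exact "Current
experiments" set from the wiki -- check the page for that -- just a
plausible G1 baseline; GC_TUNE and SOLR_JAVA_MEM are the variables the
Solr 5.x start scripts read:

SOLR_JAVA_MEM="-Xms31g -Xmx31g"
# Replace the stock CMS flags with a G1 set (illustrative values):
GC_TUNE=" \
    -XX:+UseG1GC \
    -XX:+ParallelRefProcEnabled \
    -XX:G1HeapRegionSize=8m \
    -XX:MaxGCPauseMillis=250 \
    -XX:+UseLargePages \
    -XX:+AggressiveOpts"

You can confirm that a 31GB cap keeps compressed oops (the halved
pointers) enabled with:

java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops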

The chewiebug version of GCViewer is what I use to take the GC logfile
and see how Solr did with particular GC settings.  At my request, they
have incorporated some really nice statistical metrics into GCViewer,
but it's not available in the released versions -- you'll have to
compile it yourself until 1.35 comes out.

https://github.com/chewiebug/GCViewer/issues/139

I have also had some good luck with jHiccup.  That tool will inform you
of pauses that happen for *any* reason, not just because of GC.  My
experience is that GC is the only *major* cause of pauses.
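
(If memory serves, jHiccup can be attached as a Java agent, along the
lines of "java -javaagent:jHiccup.jar ..." -- but check the jHiccup README
for the exact invocation; I'm quoting from memory.)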

Thanks,
Shawn


Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
How do you log GC frequency and pause times to compare them against other
GC configurations?

Also, do you tweak parameters automatically, or is there a set of
configurations that gets tested?

Lastly, I was under the impression that G1 is not recommended based on some
issues with Lucene, so I haven't tried it. Are you guys seeing any
significant performance benefits with it on Java 8? Any issues?
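
(For reference: the pause numbers quoted in this thread come from Java 8
GC logging flags along these lines -- the log path is a placeholder:

-Xloggc:/var/solr/logs/solr_gc.log \
-XX:+PrintGCDetails \
-XX:+PrintGCDateStamps \
-XX:+PrintHeapAtGC \
-XX:+PrintGCApplicationStoppedTime

The "Total time for which application threads were stopped" lines shown
earlier are what -XX:+PrintGCApplicationStoppedTime emits.)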

Re: Solr 5.2.1 on Java 8 GC

Posted by Bram Van Dam <br...@intix.eu>.
On 30/04/16 17:34, Davis, Daniel (NIH/NLM) [C] wrote:
> Bram, on the subject of brute force - if your script is "clever" and uses binary first search, I'd love to adapt it to my environment.  I am trying to build a truly multi-tenant Solr because each of our indexes is tiny, but all together they will eventually be big, and so I'll have to repeat this experiment, many, many times.

Sorry to disappoint: the script is very dumb, and it doesn't just
start/stop Solr. It installs our application suite, picks a GC profile at
random, indexes a boatload of data, and then runs a bunch of query tests.
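
A minimal sketch of that loop, with hypothetical helper scripts and paths
(the real thing installs a full application suite):

#!/bin/sh
for run in 1 2 3 4 5 6 7 8 9 10; do
  # pick a GC profile at random from a directory of flag files (hypothetical)
  profile=$(ls gc-profiles/*.opts | shuf -n 1)
  export GC_TUNE="$(cat "$profile")"
  bin/solr stop -all && bin/solr start
  ./index-boatload.sh                   # hypothetical indexing step
  ./run-query-tests.sh > "results/$(basename "$profile").$run.log"
done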

Three pointers I can give you:

1) Beware of JVM versions, especially when using the G1 collector: it
behaves horribly on older JVMs but rather nicely on newer versions.

2) At the very least you'll want to test the G1 and CMS collectors.

3) One large index vs. many small indexes: the behaviour is very
different. Depending on how many indexes you have, it might be worth
running each one in a different JVM. Of course that's not practical if you
have thousands of indexes.

 - Bram


RE: Solr 5.2.1 on Java 8 GC

Posted by "Davis, Daniel (NIH/NLM) [C]" <da...@nih.gov>.
Bram, on the subject of brute force - if your script is "clever" and uses binary first search, I'd love to adapt it to my environment.  I am trying to build a truly multi-tenant Solr because each of our indexes is tiny, but all together they will eventually be big, and so I'll have to repeat this experiment, many, many times.


Re: Solr 5.2.1 on Java 8 GC

Posted by Bram Van Dam <br...@intix.eu>.
On 29/04/16 16:40, Nick Vasilyev wrote:
> Not sure if it helps anyone, but I am seeing decent results with the
> following.
> 
> It was mostly a result of trial and error, 

I'm ashamed to admit that I've used a similar approach: I wrote a simple
test script to try out various GC settings with various values, repeated
ad nauseam, and ended up with a configuration that works reasonably well on
the environment in question but will probably fail horribly anywhere else.

When in doubt, use brute force.

 - Bram

Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
Not sure if it helps anyone, but I am seeing decent results with the
following.

It was mostly a result of trial and error; I am not familiar with Java GC
or even Java itself. I added my interpretation of what was happening, but I
am not sure it is right, so take it for what it's worth. It'd be nice if
someone could provide a better technical explanation. We are about to hit
daily peak load, and so far it doesn't look like there is any negative
performance impact.

-XX:NewRatio=2                        # increases the size of the young generation
-XX:SurvivorRatio=3                   # increases the size of the survivor spaces
-XX:TargetSurvivorRatio=90
-XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC
-XX:+CMSScavengeBeforeRemark
-XX:ConcGCThreads=4                   # increasing these didn't really help
-XX:ParallelGCThreads=4
-XX:PretenureSizeThreshold=512m       # not sure of the full impact of this yet,
                                      # but I assume it puts less stuff in eden
-XX:CMSFullGCsBeforeCompaction=1
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=70
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:+CMSParallelRemarkEnabled
-XX:+ParallelRefProcEnabled
-XX:+UseLargePages
-XX:+AggressiveOpts

Here are the GC pauses over 1 second, before and after the change:

Before:
2016-04-26T04:30:53.175-0400: 244734.856: Total time for which application
threads were stopped: 12.6587130 seconds, Stopping threads took: 0.0024770
seconds
2016-04-26T04:31:22.808-0400: 244764.489: Total time for which application
threads were stopped: 10.6840330 seconds, Stopping threads took: 0.0004840
seconds
2016-04-26T04:31:48.586-0400: 244790.267: Total time for which application
threads were stopped: 10.8198760 seconds, Stopping threads took: 0.0010340
seconds
2016-04-26T04:32:10.095-0400: 244811.777: Total time for which application
threads were stopped: 9.5644690 seconds, Stopping threads took: 0.0006750
seconds
2016-04-26T04:32:32.600-0400: 244834.282: Total time for which application
threads were stopped: 10.0890420 seconds, Stopping threads took: 0.0009930
seconds
2016-04-26T04:32:55.747-0400: 244857.429: Total time for which application
threads were stopped: 10.3426480 seconds, Stopping threads took: 0.0008190
seconds
2016-04-26T04:33:20.522-0400: 244882.203: Total time for which application
threads were stopped: 10.7531070 seconds, Stopping threads took: 0.0013280
seconds
2016-04-26T04:33:45.853-0400: 244907.535: Total time for which application
threads were stopped: 10.3933700 seconds, Stopping threads took: 0.0013970
seconds
2016-04-26T04:34:15.634-0400: 244937.316: Total time for which application
threads were stopped: 10.5744420 seconds, Stopping threads took: 0.0008980
seconds
2016-04-26T04:34:53.802-0400: 244975.484: Total time for which application
threads were stopped: 10.4964470 seconds, Stopping threads took: 0.0013830
seconds
2016-04-26T04:35:19.276-0400: 245000.957: Total time for which application
threads were stopped: 9.8195470 seconds, Stopping threads took: 0.0016110
seconds
2016-04-26T04:35:43.617-0400: 245025.299: Total time for which application
threads were stopped: 9.4856600 seconds, Stopping threads took: 0.0014980
seconds
2016-04-26T04:36:06.540-0400: 245048.222: Total time for which application
threads were stopped: 9.5009880 seconds, Stopping threads took: 0.0009080
seconds
2016-04-26T04:36:32.843-0400: 245074.525: Total time for which application
threads were stopped: 9.6370000 seconds, Stopping threads took: 0.0011770
seconds
2016-04-26T04:36:57.114-0400: 245098.795: Total time for which application
threads were stopped: 10.0064990 seconds, Stopping threads took: 0.0011480
seconds
2016-04-26T04:37:21.074-0400: 245122.755: Total time for which application
threads were stopped: 9.7061140 seconds, Stopping threads took: 0.0009760
seconds
2016-04-26T04:37:45.716-0400: 245147.398: Total time for which application
threads were stopped: 9.9133330 seconds, Stopping threads took: 0.0008220
seconds
2016-04-26T04:38:11.412-0400: 245173.094: Total time for which application
threads were stopped: 10.6839560 seconds, Stopping threads took: 0.0015370
seconds
2016-04-26T04:38:37.177-0400: 245198.859: Total time for which application
threads were stopped: 10.0646910 seconds, Stopping threads took: 0.0013740
seconds
2016-04-26T04:39:00.516-0400: 245222.197: Total time for which application
threads were stopped: 9.8280250 seconds, Stopping threads took: 0.0008900
seconds
2016-04-26T04:39:25.255-0400: 245246.937: Total time for which application
threads were stopped: 10.8429080 seconds, Stopping threads took: 0.0007120
seconds
2016-04-26T04:41:06.937-0400: 245348.619: Total time for which application
threads were stopped: 9.8060420 seconds, Stopping threads took: 0.0006300
seconds
2016-04-26T04:41:43.370-0400: 245385.052: Total time for which application
threads were stopped: 10.8144800 seconds, Stopping threads took: 0.0002260
seconds
2016-04-26T04:42:09.479-0400: 245411.161: Total time for which application
threads were stopped: 9.4059640 seconds, Stopping threads took: 0.0001340
seconds
2016-04-26T04:42:36.033-0400: 245437.715: Total time for which application
threads were stopped: 9.9446430 seconds, Stopping threads took: 0.0007500
seconds
2016-04-26T04:43:02.409-0400: 245464.091: Total time for which application
threads were stopped: 10.4197000 seconds, Stopping threads took: 0.0000260
seconds
2016-04-26T04:43:29.559-0400: 245491.241: Total time for which application
threads were stopped: 9.6712880 seconds, Stopping threads took: 0.0001080
seconds
2016-04-26T04:43:56.648-0400: 245518.330: Total time for which application
threads were stopped: 9.8339590 seconds, Stopping threads took: 0.0011820
seconds
2016-04-26T04:45:35.358-0400: 245617.040: Total time for which application
threads were stopped: 9.5853210 seconds, Stopping threads took: 0.0001760
seconds
2016-04-26T04:54:58.764-0400: 246180.446: Total time for which application
threads were stopped: 2.9048350 seconds, Stopping threads took: 0.0008180
seconds
2016-04-26T04:55:06.107-0400: 246187.789: Total time for which application
threads were stopped: 1.1189760 seconds, Stopping threads took: 0.0011390
seconds

After:
2016-04-29T04:30:05.758-0400: 29962.077: Total time for which application
threads were stopped: 1.0823960 seconds, Stopping threads took: 0.0005840
seconds
2016-04-29T04:30:11.349-0400: 29967.668: Total time for which application
threads were stopped: 1.4147830 seconds, Stopping threads took: 0.0008980
seconds
2016-04-29T04:30:17.198-0400: 29973.517: Total time for which application
threads were stopped: 1.6294590 seconds, Stopping threads took: 0.0009380
seconds
2016-04-29T04:30:22.350-0400: 29978.669: Total time for which application
threads were stopped: 1.6787880 seconds, Stopping threads took: 0.0012320
seconds
2016-04-29T04:30:28.230-0400: 29984.549: Total time for which application
threads were stopped: 1.6895760 seconds, Stopping threads took: 0.0010270
seconds
2016-04-29T04:30:29.944-0400: 29986.263: Total time for which application
threads were stopped: 1.5271500 seconds, Stopping threads took: 0.0009670
seconds
2016-04-29T04:30:35.282-0400: 29991.601: Total time for which application
threads were stopped: 1.6575670 seconds, Stopping threads took: 0.0006200
seconds
2016-04-29T04:30:51.011-0400: 30007.329: Total time for which application
threads were stopped: 2.0383550 seconds, Stopping threads took: 0.0004640
seconds
2016-04-29T04:31:03.032-0400: 30019.351: Total time for which application
threads were stopped: 2.1963570 seconds, Stopping threads took: 0.0004650
seconds
2016-04-29T04:31:07.679-0400: 30023.998: Total time for which application
threads were stopped: 1.2220760 seconds, Stopping threads took: 0.0004720
seconds
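
(The lists above can be pulled straight from the GC log; each entry is a
single line in the real log, wrapped here by the mail formatting. For
example:

grep 'Total time for which application threads were stopped' gc_log \
    | awk '$11 + 0 > 1.0'

where $11 is the pause duration in seconds.)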


Re: Solr 5.2.1 on Java 8 GC

Posted by Jeff Wartes <jw...@whitepages.com>.
Shawn Heisey’s page is the usual reference guide for GC settings: https://wiki.apache.org/solr/ShawnHeisey
Most of the learnings from that are in the Solr 5.x startup scripts already, but your heap is bigger, so your mileage may vary.

Some tools I’ve used while doing GC tuning:

* VisualVM - Comes with the jdk. It has a Visual GC plug-in that’s pretty nice for visualizing what’s going on in realtime, but you need to connect it via jstatd for that to work.
* GCViewer - Visualizes a GC log. The UI leaves a lot to be desired, but it’s the best tool I’ve found for this purpose. Use this fork for jdk 6+ - https://github.com/chewiebug/GCViewer
* Swiss Java Knife has a bunch of useful features - https://github.com/aragozin/jvm-tools
* YourKit - I’ve been using this lately to analyze where garbage comes from. It’s not free though. 
* Eclipse Memory Analyzer - I used this to analyze heap dumps before I got a YourKit license: http://www.eclipse.org/mat/
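
Side note: for the jstatd connection that VisualVM needs, the usual recipe
is a small security policy file. The broad permissions below are for
illustration only:

cat > /tmp/jstatd.policy <<'EOF'
grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};
EOF
jstatd -J-Djava.security.policy=/tmp/jstatd.policy -p 1099

Then point VisualVM at host:1099.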

Good luck!

Re: Solr 5.2.1 on Java 8 GC

Posted by Yonik Seeley <ys...@gmail.com>.
On Thu, Apr 28, 2016 at 12:21 PM, Nick Vasilyev
<ni...@gmail.com> wrote:
> Hi Yonik,
>
> There are a lot of logistics involved with re-indexing and naturally
> upgrading Solr. I was hoping that there is an easier alternative since this
> is only a single back end script that is having problems.
>
> Is there any room for improvement with tweaking GC params?

There always is ;-)  But I'm not a GC tuning expert.  I prefer to
attack memory problems more head-on (i.e. with code to use less
memory).

-Yonik

Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
Hi Yonik,

There are a lot of logistics involved with re-indexing and naturally
upgrading Solr. I was hoping that there is an easier alternative since this
is only a single back end script that is having problems.

Is there any room for improvement with tweaking GC params?


Re: Solr 5.2.1 on Java 8 GC

Posted by Yonik Seeley <ys...@gmail.com>.
On Thu, Apr 28, 2016 at 11:50 AM, Nick Vasilyev
<ni...@gmail.com> wrote:
> mmfr_exact is a string field. key_phrases is a multivalued string field.

One guess is that top-level field caches (and UnInvertedField use)
were removed in
https://issues.apache.org/jira/browse/LUCENE-5666

While this is better for NRT (a quickly changing index), it's worse in
CPU, and can be worse in memory overhead for very static indexes.

Multi-valued string faceting was hit hardest:
https://issues.apache.org/jira/browse/SOLR-8096
Although I only measured the CPU impact, and not memory.

The 4.x method of faceting was restored as part of
https://issues.apache.org/jira/browse/SOLR-8466

If this is the issue, you can:
- try reindexing with docValues... that should solve memory issues at
the expense of some speed
- upgrade to a more recent Solr version and use facet.method=uif for
your multi-valued string fields

-Yonik
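
(For illustration: the docValues route would mean reindexing with a schema
change like the line below. Note that docValues applies to string/numeric
fields, so it fits mmfr_exact but not an analyzed TextField like
key_phrases, while uif is just a request parameter on versions that
include SOLR-8466:

<field name="mmfr_exact" type="string" indexed="true" stored="false"
       docValues="true"/>

f.key_phrases.facet.method=uif
f.mmfr_exact.facet.method=uif

The field definition is an assumption based on the names mentioned in this
thread.)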

Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
Correction: key_phrases is set up as follows:

<field name="key_phrases" type="key_phrases" indexed="true"
multiValued="true" omitNorms="false" omitPositions="true"
omitTermFreqAndPositions="true" stored="false" termVectors="false"/>


   <fieldType class="solr.TextField" name="key_phrases" omitNorms="true"
sortMissingLast="true">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="4"
minShingleSize="2" outputUnigramsIfNoShingles="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PatternReplaceFilterFactory"
pattern="^(\p{Punct}*)(.*?)(\p{Punct}*)$" replacement="$2"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>
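
To put the faceting cost in perspective: with minShingleSize="2",
maxShingleSize="4", and unigrams output, a single four-word value fans out
into ten indexed terms. For a hypothetical input "long gc pauses observed":

long, gc, pauses, observed,
"long gc", "gc pauses", "pauses observed",
"long gc pauses", "gc pauses observed",
"long gc pauses observed"

so faceting on key_phrases has to uninvert a very large term dictionary.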


Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
The working set is larger than the heap. This is our largest collection,
and all shards combined would probably be around 60GB in total; there are
also a few other, much smaller collections.

During normal operations the JVM memory utilization hovers between 17GB and
22GB if we aren't indexing any data.

Either way, this wasn't a problem before. I suspect it is because we are
now on Java 8, so I wanted to reach out to the community to see whether
there are any new best practices around GC tuning, since the current
recommendation seems to target Java 7.



Re: Solr 5.2.1 on Java 8 GC

Posted by Walter Underwood <wu...@wunderwood.org>.
32 GB is a pretty big heap. If the working set is really smaller than that, the extra heap just makes a full GC take longer.

How much heap is used after a full GC? Take the largest value you see there, then add a bit more, maybe 25% more or 2 GB more.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)
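
Applying that to the log at the top of the thread: the old generation shows
15103706K (roughly 14.4GB) used after the full GC, so this rule of thumb
would point at a heap of around 18GB rather than 32GB. A sketch for pulling
the post-GC occupancy out of the log, assuming the -XX:+PrintHeapAtGC
format shown earlier:

awk '/Heap after GC/ {f=1}
     f && /concurrent mark-sweep generation/ {print; f=0}' gc_log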




Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
mmfr_exact is a string field. key_phrases is a multivalued string field.


Re: Solr 5.2.1 on Java 8 GC

Posted by Yonik Seeley <ys...@gmail.com>.
What about the field types though... are they single valued or multi
valued, string, text, numeric?

-Yonik



Re: Solr 5.2.1 on Java 8 GC

Posted by Nick Vasilyev <ni...@gmail.com>.
Hi Yonik,

I forgot to mention that the index is approximately 50 million docs split
across 4 shards (replication factor 2) on 2 solr replicas.

This particular script will filter items based on a category (anywhere from
10 to ~1,000,000 items in each) and run facets on the top X terms for
particular fields. The query looks like this:

{
   q => "cat:$code",
   rows => 0,
   facet => 'true',
   'facet.field' => [ 'key_phrases', 'mmfr_exact' ],
   'f.key_phrases.facet.limit' => 100,
   'f.mmfr_exact.facet.limit' => 20,
   'facet.mincount' => 5,
   distrib => 'false',
 }
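
The same request as raw Solr parameters, for reference (host, port, and
collection name are placeholders):

curl 'http://localhost:8983/solr/collection1/select' \
    --data-urlencode 'q=cat:CODE' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'facet=true' \
    --data-urlencode 'facet.field=key_phrases' \
    --data-urlencode 'facet.field=mmfr_exact' \
    --data-urlencode 'f.key_phrases.facet.limit=100' \
    --data-urlencode 'f.mmfr_exact.facet.limit=20' \
    --data-urlencode 'facet.mincount=5' \
    --data-urlencode 'distrib=false'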

I know it can be reworked some, especially considering there are thousands
of similar requests going out. However, we didn't have this issue before,
and I am worried that it may be a symptom of a larger underlying problem.


Re: Solr 5.2.1 on Java 8 GC

Posted by Yonik Seeley <ys...@gmail.com>.
On Thu, Apr 28, 2016 at 11:29 AM, Nick Vasilyev
<ni...@gmail.com> wrote:
> Hello,
>
> We recently upgraded to Solr 5.2.1 with jre1.8.0_74 and are seeing long GC
> pauses when running jobs that do some hairy faceting. The same jobs worked
> fine with our previous 4.6 Solr.

What does a typical request look like, and what are the field types
that faceting is done on?

-Yonik

