Posted to solr-user@lucene.apache.org by Russell Bahr <ru...@manzama.com> on 2019/10/14 19:36:15 UTC

solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Hello,
Apologies in advance for the length of this email; I want to provide proper details.
We currently have 2 solr cloud deployments that we are hoping to upgrade to solr 8.x, but we are running into severe performance problems with solr 8.1.1.  I am hoping for some guidance in troubleshooting and overcoming this problem.

Current setup

Backend email processing.
Used for predefined queries that produce email results for our clients.  Approximately 35,000 emails are sent at different times of the day, based on each client's preferences.
solr-spec 4.10.4
lucene-spec 4.10.4
Runtime Oracle Corporation OpenJDK 64-Bit Server VM (1.8.0_222 25.222-b10)
1 collection, 6 shards, 5 replicas per shard, 17,919,889 current documents (35 days' worth of documents) - indexing new documents regularly throughout the day, deleting aged-out documents nightly.

Frontend for website.
Used for customer searches, sometimes runs same query as is defined for email processing.
solr-spec 6.5.1
lucene-spec 6.5.1
Runtime Oracle Corporation OpenJDK 64-Bit Server VM 1.8.0_222 25.222-b10
1 collection, 6 shards, 3 replicas per shard, 50,821,086 current documents (213 days, about 7 months, worth of documents) - indexing new documents regularly throughout the day, deleting aged-out documents nightly.

Backend replacement of solr4 and hopefully Frontend replacement as well.
solr-spec 8.1.1
lucene-spec 8.1.1
Runtime Oracle Corporation OpenJDK 64-Bit Server VM 12 12+33
1 collection, 6 shards, 5 replicas per shard, 17,919,889 current documents (35 days' worth of documents) - indexing new documents regularly throughout the day, deleting aged-out documents nightly.

We are trying to solve a couple of issues with this upgrade of solr version.

1. Running 2 solr clouds on different versions causes our clients to get different results in their emails than when they search on the front end.
2. When my predecessor attempted to build out solr 6.5.1 for the backend, it would crash in the middle of running through the searches that create the content for our client emails.
3. We want to bring both backend and frontend up to a current Solr version and, if possible, run both off of a single solr cloud instead of 2 clouds with the same content indexed to them.

Problem 1

When I run the backend process in a test with all 35,000 email queries dumped into a queue on the current solr 4 cloud deployment, it takes approximately 7-8 hours to complete. (This is the minimum performance target for the new solr cloud deployment.)
When I run the same backend process in a test with all 35,000 email queries dumped into a queue on the new solr 8 cloud deployment, it takes more than 24 hours to complete. (It must take less than 8 hours for email deliveries to be timely for content.)
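
For a rough sense of scale, the two run times above can be turned into an average time per query. This is only a sketch: it assumes the 35,000 queries are processed strictly one at a time, which may not match how the queue actually runs, and the variable names are illustrative.

```shell
#!/bin/sh
# Average per-query time if the queued queries were processed serially.
# Serial processing is an assumption; the real queue may run in parallel.
queries=35000

# Target: the whole run must finish within 8 hours.
budget_ms_8h=$(( 8 * 3600 * 1000 / queries ))

# Observed on solr 8: the run takes more than 24 hours,
# so each query averages at least this long.
observed_ms_24h=$(( 24 * 3600 * 1000 / queries ))

echo "8h target:    ${budget_ms_8h} ms per query on average"
echo "24h observed: at least ${observed_ms_24h} ms per query on average"
```

Under that assumption the target works out to roughly 0.8 seconds per query, against roughly 2.5 seconds or more on the solr 8 deployment.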

Problem 2 (likely same core issue as Problem 1, but much easier to work with)

When I run one of our normal queries against solr 6 cloud deployment the results return in less than 1/2 second.
When I run the same queries against solr 8 cloud deployment the results return in more than 16 seconds.

Link to dropbox folder ( https://www.dropbox.com/sh/2x2k5c9db7d4pt9/AADnHwuJc7a9Fh4KmUD15rS0a?dl=0 ) containing:

"one of our normal queries"

Solr 6 query results
Solr 8 query results

Solr 4 solrconfig.xml
Solr 4 schema.xml
Solr 4 solr.in.sh

Solr 6 solrconfig.xml
Solr 6 schema.xml
Solr 6 solr.in.sh

Solr 8 solrconfig.xml
Solr 8 schema.xml
Solr 8 solr.in.sh

Thank you in advance for any guidance and advice that you can give me,

Russell Bahr
Lead Infrastructure Engineer

Manzama
a MODERN GOVERNANCE company


Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Vincenzo D'Amore <v....@gmail.com>.
Hi Russell,

I've noticed a few differences between the solr8 schema and the solr6 one: a few
omitNorms params are missing, and a few solr.FlattenGraphFilterFactory entries
are missing too. But perhaps the most important difference between 6 and 8 is
the memory configuration.

solr 6 has

SOLR_HEAP="27158m"
SOLR_JAVA_MEM="-Xms27158m -Xmx27158m"

solr 8 has

SOLR_HEAP="8025m"
SOLR_JAVA_MEM="-Xms8025m -Xmx8025m"

Depending on the size of your index, this can be a huge difference.

Note that, as far as I remember, SOLR_HEAP overrides the SOLR_JAVA_MEM
configuration.
I suggest specifying only one of them.
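
A minimal sketch of what that could look like in the solr 8 solr.in.sh, assuming the stock bin/solr script, which (as far as I know) builds the -Xms/-Xmx options from SOLR_HEAP whenever it is non-empty. The value shown is simply the one from the current solr 8 file, not a sizing recommendation.

```shell
# solr.in.sh (sketch): set the heap in one place only.
# 8025m is the value currently in the solr 8 file, shown for illustration.
SOLR_HEAP="8025m"

# Leave SOLR_JAVA_MEM commented out so the two settings cannot disagree.
# SOLR_JAVA_MEM="-Xms8025m -Xmx8025m"
```

With only one of the two variables set, the heap reported in the Solr admin UI should match the single value in the file, which makes it easier to compare the solr 6 and solr 8 configurations.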

Best regards,
Vincenzo



-- 
Vincenzo D'Amore

Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Russell Bahr <ru...@manzama.com>.
Hi Shawn,
Still hoping for some feedback here.  Should I start a new thread instead of
replying to this one?  Since I do not see an improvement when using java11,
I am now going to rebuild again with java8 and solr 8.1.1.
Please respond and let me know whether I am going in the right direction or
should be attacking this in a different way.
Thank you,
Russ

*Manzama* a MODERN GOVERNANCE company

Russell Bahr
Lead Infrastructure Engineer

USA & CAN Office: +1 (541) 306 3271
USA & CAN Support: +1 (541) 706 9393
UK Office & Support: +44 (0)203 282 1633
AUS Office & Support: +61 (0) 2 8417 2339

543 NW York Drive, Suite 100, Bend, OR 97703

LinkedIn <http://www.linkedin.com/company/manzama> | Twitter
<https://twitter.com/ManzamaInc> | Facebook
<http://www.facebook.com/manzamainc> | YouTube
<https://www.youtube.com/channel/UCBo3QoqewyNoo7HiT_BFuRw>


>> On Wed, Oct 16, 2019 at 11:50 AM Russell Bahr <ru...@manzama.com> wrote:
>>
>>> Hi Shawn,
>>>
>>> Just checking to see if you saw my reply and had any feedback. Thank you
>>> again for your help. It is much appreciated.
>>>
>>> Thank you,
>>>
>>> Russ
>>>
>>>
>>>
>>>
>>>
>>> *From: *Russell Bahr <ru...@manzama.com>
>>> *Date: *Tuesday, October 15, 2019 at 11:50 AM
>>> *To: *"solr-user@lucene.apache.org" <so...@lucene.apache.org>
>>> *Subject: *Re: solr 8.1.1 many time slower returning query results than
>>> solr 4.10.4 or solr 6.5.1
>>>
>>>
>>>
>>> Hi Shawn,
>>>
>>> I included the wrong file for solr4 and did not realize until you
>>> pointed out the heap size.  The correct file that is setting the Java
>>> environment is "Solr 4 tomcat setenv" I have uploaded that to the shared
>>> folder along with the requested screenshots "Solr 4 top screenshot","Solr 6
>>> top screenshot","Solr 8 top screenshot".
>>>
>>>
>>>
>>> I have also uploaded the solr.log, solr_gc.log, and
>>> solr_slow_requests.log from a 2 hour period of time where I was running the
>>> email load test against the solr8 implementation in which the queued tasks
>>> are taking too long to complete.
>>>
>>>
>>>
>>> solr_gc.log, solr_gc.log.1, solr_gc.log.2, solr.log, solr.log.10,
>>> solr.log.6, solr.log.7, solr.log.8, solr.log.9, solr_slow_requests.log
>>>
>>>
>>>
>>> Let me know if there is any other information that I can provide that
>>> may help to work through this.
>>>
>>> On Tue, Oct 15, 2019 at 2:28 AM Shawn Heisey <ap...@elyograg.org>
>>> wrote:
>>>
>>> On 10/14/2019 1:36 PM, Russell Bahr wrote:
>>> > Backend replacement of solr4 and hopefully Frontend replacement as
>>> well.
>>> > solr-spec 8.1.1
>>> > lucene-spec 8.1.1
>>> > Runtime Oracle Corporation OpenJDK 64-Bit Server VM 12 12+33
>>> > 1 collection 6 shards 5 replicas per shard 17,919,889 current
>>> documents (35 days worth of documents) - indexing new documents regularly
>>> throughout the day, deleting aged out documents nightly.
>>>
>>> Java 12 is not recommended.  It is one of the "new feature" releases
>>> that only gets 6 months of support.  We would recommend Java 8 or Java
>>> 11.  These are the versions with long term support.  Probably a good
>>> thing to be using OpenJDK, as the official Oracle Java now requires
>>> paying for a license.
>>>
>>> Solr 8 ships with settings that enable the G1GC collector instead of
>>> CMS, because CMS is deprecated and will disappear in a future Java
>>> version.  We have seen problems with this when the system is
>>> misconfigured as far as heap size.  When the system is properly sized,
>>> G1 tends to do better than CMS, but when the heap is too large or too
>>> small, has a tendency to amplify garbage collection problems in
>>> comparison.
>>>
>>> Looking at your solr.in.sh files for each version ... the Solr 4
>>> install
>>> appears to be setting the heap to 512 megabytes.  This is definitely not
>>> enough for millions of documents, and if this is what the heap size is
>>> actually set to, would almost certainly run into memory errors
>>> frequently and have absolutely terrible performance.  But you are saying
>>> that it works well, so I don't think the heap is actually set to 512
>>> megabytes.  Maybe the bin/solr script has been modified directly to set
>>> the memory size instead of setting it in solr.in.sh where it should be
>>> set.
>>>
>>> Solr 6 has a heap size of just under 27 gigabytes.  Solr 8 has a heap
>>> size of just under 8 gigabytes.  With millions of documents, it is
>>> likely that 8GB of heap is not quite big enough.
>>>
>>> For each of your installations (Solr 4, Solr 6, and Solr 8) can you
>>> provide the screenshot described at this wiki page?
>>>
>>>
>>> https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems#SolrPerformanceProblems-Askingforhelponamemory/performanceissue
>>>
>>> It would also be helpful to see the GC logs from Solr 8.  We would need
>>> at least one GC log, making sure that they cover at least a few hours,
>>> including the timeframe when the slow indexing and slow queries were
>>> observed.
>>>
>>> Thanks,
>>> Shawn
>>>
>>>

Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Russell Bahr <ru...@manzama.com>.
Hi,
Is there anyone that would be able to assist with the issue that I am
seeing?
I am seeing the same slowness with solr 8.1.1 on java11 as I saw with
java12, compared with the same queries run against solr4.10.4 with java8 and
solr6.5.1 with java8.
Queries that return in less than half a second on solr4 are taking up to 20
seconds with the same data indexed to solr 8.1.1.
I have posted configs, schemas, and various log files in this shared
dropbox folder (
https://www.dropbox.com/sh/2x2k5c9db7d4pt9/AADnHwuJc7a9Fh4KmUD15rS0a?dl=0 ).
Any additional help/assistance would be greatly appreciated.
Thank you,
Russ


Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Russell Bahr <ru...@manzama.com>.
Hi Shawn,
per your comments from before
  On Oct 15, 2019, 2:28 AM, Shawn Heisey wrote:
  > Java 12 is not recommended.  It is one of the "new feature" releases
  > that only gets 6 months of support.  We would recommend Java 8 or Java
  > 11.  These are the versions with long term support.  Probably a good
  > thing to be using OpenJDK, as the official Oracle Java now requires
  > paying for a license.

I have rebuilt my 30-server solr 8 cluster using java11 and increased the
java heap (-Xms10433m -Xmx10433m), and am seeing the same slowness that I was
seeing with java12.
I have not yet tried to build out the solr 8 collection with java8.  Would
it be worthwhile to do that, or were you able to see anything in the logs?

Thank you in advance,
Russ

> On Tue, Oct 15, 2019 at 2:28 AM Shawn Heisey <ap...@elyograg.org> wrote:
>
> On 10/14/2019 1:36 PM, Russell Bahr wrote:
> > Backend replacement of solr4 and hopefully Frontend replacement as well.
> > solr-spec 8.1.1
> > lucene-spec 8.1.1
> > Runtime Oracle Corporation OpenJDK 64-Bit Server VM 12 12+33
> > 1 collection 6 shards 5 replicas per shard 17,919,889 current documents
> (35 days worth of documents) - indexing new documents regularly throughout
> the day, deleting aged out documents nightly.
>
> Java 12 is not recommended.  It is one of the "new feature" releases
> that only gets 6 months of support.  We would recommend Java 8 or Java
> 11.  These are the versions with long term support.  Probably a good
> thing to be using OpenJDK, as the official Oracle Java now requires
> paying for a license.
>
> Solr 8 ships with settings that enable the G1GC collector instead of
> CMS, because CMS is deprecated and will disappear in a future Java
> version.  We have seen problems with this when the system is
> misconfigured as far as heap size.  When the system is properly sized,
> G1 tends to do better than CMS, but when the heap is too large or too
> small, has a tendency to amplify garbage collection problems in comparison.
>
> Looking at your solr.in.sh files for each version ... the Solr 4 install
> appears to be setting the heap to 512 megabytes.  This is definitely not
> enough for millions of documents, and if this is what the heap size is
> actually set to, would almost certainly run into memory errors
> frequently and have absolutely terrible performance.  But you are saying
> that it works well, so I don't think the heap is actually set to 512
> megabytes.  Maybe the bin/solr script has been modified directly to set
> the memory size instead of setting it in solr.in.sh where it should be
> set.
>
> Solr 6 has a heap size of just under 27 gigabytes.  Solr 8 has a heap
> size of just under 8 gigabytes.  With millions of documents, it is
> likely that 8GB of heap is not quite big enough.
>
> For each of your installations (Solr 4, Solr 6, and Solr 8) can you
> provide the screenshot described at this wiki page?
>
>
> https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems#SolrPerformanceProblems-Askingforhelponamemory/performanceissue
>
> It would also be helpful to see the GC logs from Solr 8.  We would need
> at least one GC log, making sure that they cover at least a few hours,
> including the timeframe when the slow indexing and slow queries were
> observed.
>
> Thanks,
> Shawn
>
>

Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Russell Bahr <ru...@manzama.com>.
Hi Shawn,
Just checking to see if you saw my reply and had any feedback. Thank you again for your help. It is much appreciated.
Thank you,
Russ



Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Russell Bahr <ru...@manzama.com>.
Hi Shawn,
I included the wrong file for solr4 and did not realize it until you
pointed out the heap size.  The correct file that sets the Java
environment is "Solr 4 tomcat setenv".  I have uploaded that to the shared
folder along with the requested screenshots: "Solr 4 top screenshot",
"Solr 6 top screenshot", and "Solr 8 top screenshot".

I have also uploaded the solr.log, solr_gc.log, and solr_slow_requests.log
from a 2-hour period during which I was running the email load test
against the solr8 implementation, where the queued tasks are taking too
long to complete.

solr_gc.log, solr_gc.log.1, solr_gc.log.2, solr.log, solr.log.10,
solr.log.6, solr.log.7, solr.log.8, solr.log.9, solr_slow_requests.log

Let me know if there is any other information that I can provide that may
help to work through this.


*Manzama* a MODERN GOVERNANCE company

Russell Bahr
Lead Infrastructure Engineer

USA & CAN Office: +1 (541) 306 3271
USA & CAN Support: +1 (541) 706 9393
UK Office & Support: +44 (0)203 282 1633
AUS Office & Support: +61 (0) 2 8417 2339

543 NW York Drive, Suite 100, Bend, OR 97703

LinkedIn <http://www.linkedin.com/company/manzama> | Twitter
<https://twitter.com/ManzamaInc> | Facebook
<http://www.facebook.com/manzamainc> | YouTube
<https://www.youtube.com/channel/UCBo3QoqewyNoo7HiT_BFuRw>



Re: solr 8.1.1 many time slower returning query results than solr 4.10.4 or solr 6.5.1

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/14/2019 1:36 PM, Russell Bahr wrote:
> Backend replacement of solr4 and hopefully Frontend replacement as well.
> solr-spec 8.1.1
> lucene-spec 8.1.1
> Runtime Oracle Corporation OpenJDK 64-Bit Server VM 12 12+33
> 1 collection 6 shards 5 replicas per shard 17,919,889 current documents (35 days worth of documents) - indexing new documents regularly throughout the day, deleting aged out documents nightly.

Java 12 is not recommended.  It is one of the "new feature" releases
that only gets 6 months of support.  We would recommend Java 8 or Java
11.  These are the versions with long term support.  It is probably a
good thing that you are using OpenJDK, as the official Oracle Java now
requires paying for a license.

Solr 8 ships with settings that enable the G1GC collector instead of
CMS, because CMS is deprecated and will disappear in a future Java
version.  We have seen problems with this when the heap size is
misconfigured.  When the system is properly sized, G1 tends to do better
than CMS, but when the heap is too large or too small, G1 has a tendency
to amplify garbage collection problems in comparison.

Looking at your solr.in.sh files for each version ... the Solr 4 install
appears to be setting the heap to 512 megabytes.  This is definitely not
enough for millions of documents, and if this is what the heap size is
actually set to, Solr would almost certainly run into memory errors
frequently and have absolutely terrible performance.  But you are saying
that it works well, so I don't think the heap is actually set to 512
megabytes.  Maybe the bin/solr script has been modified directly to set
the memory size instead of setting it in solr.in.sh, where it should be set.

Solr 6 has a heap size of just under 27 gigabytes.  Solr 8 has a heap 
size of just under 8 gigabytes.  With millions of documents, it is 
likely that 8GB of heap is not quite big enough.

For each of your installations (Solr 4, Solr 6, and Solr 8) can you 
provide the screenshot described at this wiki page?

https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems#SolrPerformanceProblems-Askingforhelponamemory/performanceissue

It would also be helpful to see the GC logs from Solr 8.  We would need
at least one GC log covering at least a few hours, including the
timeframe when the slow indexing and slow queries were observed.

Thanks,
Shawn
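
For anyone who wants a first look at the Solr 8 GC logs requested above
before sharing them, here is a minimal sketch that totals the
stop-the-world pauses in a solr_gc.log.  It assumes the JDK 9+ unified
logging line shape with time and uptime decorations (which is how I
understand the Solr 8 start script configures GC logging on newer JVMs);
Java 8 logs use a different format and will not match.  The function
name summarize_pauses and the 500 ms "slow" threshold are made up for
illustration and are not part of Solr.

```python
import re
import sys
from statistics import mean

# Assumed line shape (JDK 9+ unified logging, time + uptime decorations):
# [2019-10-15T18:00:00.123+0000][1234.567s] GC(12) Pause Young (Normal) (G1 Evacuation Pause) 1024M->512M(8192M) 45.123ms
PAUSE_RE = re.compile(
    r"GC\((\d+)\)\s+(Pause .*?)\s+\d+[KMG]->\d+[KMG]\(\d+[KMG]\)\s+([\d.]+)ms\s*$"
)


def summarize_pauses(lines, slow_ms=500.0):
    """Collect pause durations from GC log lines and summarize them.

    slow_ms is an arbitrary illustrative threshold, not a Solr setting.
    """
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            # (pause kind, duration in milliseconds)
            pauses.append((match.group(2), float(match.group(3))))

    if not pauses:
        return {"count": 0, "total_ms": 0.0, "max_ms": 0.0,
                "mean_ms": 0.0, "slow": []}

    durations = [ms for _, ms in pauses]
    return {
        "count": len(pauses),
        "total_ms": sum(durations),
        "max_ms": max(durations),
        "mean_ms": mean(durations),
        # Pauses at or above the threshold, in log order.
        "slow": [(kind, ms) for kind, ms in pauses if ms >= slow_ms],
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python gc_pauses.py /path/to/solr_gc.log
    with open(sys.argv[1]) as log_file:
        print(summarize_pauses(log_file))
```

Frequent long "Pause Full" entries in the output would be consistent
with the undersized-heap concern raised above, but the full logs are
still needed to diagnose the problem properly.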