Posted to solr-user@lucene.apache.org by sivaprasad <si...@echidnainc.com> on 2013/08/19 14:23:58 UTC

Facing Solr performance during query search

Hi,

Last week we configured a Solr master/slave setup. All Solr search
requests are routed to the slave. Since this change, we have been seeing
drastic performance problems with Solr.

Can anyone explain what the reason might be?

Also, how can we disable index optimization, searcher warming and cache
warming on the slave?

Regards,
Siva



--
View this message in context: http://lucene.472066.n3.nabble.com/Facing-Solr-performance-during-query-search-tp4085426.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Facing Solr performance during query search

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Wed, 2013-08-21 at 10:09 +0200, sivaprasad wrote:
> The slave will poll for every 1hr. 

And are there normally changes?

> We have configured ~2000 facets and the machine configuration is given
> below.

I assume that you only request a subset of those facets at a time.

How much RAM does your machine have? 
How large is your index in GB?
How many documents do you have in your index?

As you are not explicitly warming your facets and since you have a lot
of them, my guess is that you are constantly making first-time
(initializing) facet calls. If the slave only has 32GB of RAM (and thus
only about 10GB for disk cache) and your index is substantially larger
than that, the initialization will require a lot of non-cached disk access.

Try disabling the slave polling, then send 1000 queries and then re-send
the exact same 1000 queries. Are the response times satisfactory the
second time? If so, you should consider warming your facets and/or try
to come up with a solution where you don't have so many of them.

https://sbdevel.wordpress.com/2013/04/16/you-are-faceting-itwrong/
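
As a rough illustration of what warming facets could look like, here is a minimal sketch of a newSearcher listener for the <query> section of the slave's solrconfig.xml. It assumes the facets are requested with facet.field; Categories comes from the posted field list, while f_color is a hypothetical name standing in for one of the f_* dynamic fields. Only the facets that are actually requested often would be worth listing.

```xml
<!-- Sketch only: placed inside <query> in the slave's solrconfig.xml -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <!-- Match all documents, return no rows: only the facet structures matter -->
      <str name="q">*:*</str>
      <str name="rows">0</str>
      <str name="facet">true</str>
      <!-- Field from the posted schema -->
      <str name="facet.field">Categories</str>
      <!-- Hypothetical field matching the f_* dynamicField -->
      <str name="facet.field">f_color</str>
    </lst>
  </arr>
</listener>
```

The same <lst> could be placed in the firstSearcher listener, whose only query in the posted configuration is the placeholder text from the example solrconfig.xml.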

- Toke Eskildsen, State and University Library, Denmark


Re: Facing Solr performance during query search

Posted by Jack Krupansky <ja...@basetechnology.com>.
I'd like to see a screen shot of a search results web page that has 2,000 
facets.

-- Jack Krupansky



Re: Facing Solr performance during query search

Posted by Erick Erickson <er...@gmail.com>.
~2,000 facets kind of worries me, but let's skip that for now.

Your original problem statement was that replication was the
thing that changed. So the first thing I'd do is not replicate. If you
turn it off, do your slaves still perform poorly?
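
One way to take replication out of the picture for such a test is sketched below, assuming the slave uses the standard ReplicationHandler. The host name and core name are placeholders, not values from this thread. With pollInterval absent, the slave does not poll the master automatically.

```xml
<!-- Sketch only: slave-side replication handler in solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Placeholder URL: substitute the real master host and core -->
    <str name="masterUrl">http://master-host:8983/solr/collection1</str>
    <!-- Format is HH:mm:ss. Commented out here, so no automatic polling happens -->
    <!-- <str name="pollInterval">01:00:00</str> -->
  </lst>
</requestHandler>
```

Polling can also be switched off on a running slave without editing the configuration by calling the handler with command=disablepoll (for example http://slave-host:8983/solr/collection1/replication?command=disablepoll, again with placeholder host and core), and turned back on with command=enablepoll.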

Allocating that much RAM to the JVM is probably not a great idea,
see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

So far nothing is jumping out at me.



Re: Facing Solr performance during query search

Posted by sivaprasad <si...@echidnainc.com>.
Here is the slave's solrconfig.xml configuration.
<indexConfig>
  <commitLockTimeout>10000</commitLockTimeout>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">35</int>
    <int name="segmentsPerTier">35</int>
  </mergePolicy>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxMergeCount">6</int>
    <int name="maxThreadCount">1</int>
  </mergeScheduler>
</indexConfig>
<query>
  <maxBooleanClauses>1024</maxBooleanClauses>
  <queryResultWindowSize>20</queryResultWindowSize>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
    </arr>
  </listener>
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">static firstSearcher warming in solrconfig.xml</str>
      </lst>
    </arr>
  </listener>
  <useColdSearcher>false</useColdSearcher>
</query>

The slave polls the master every hour.

The field list is given below.

<field name="product_id" type="string" indexed="true" required="true" stored="true"/>
<field name="product_name" type="text_en_splitting" indexed="true" stored="true" omitNorms="false" termVectors="true"/>
<field name="prod_name" type="alphaOnlySort" indexed="true" stored="false"/>
<field name="product_desc" type="text_en_splitting" indexed="true" stored="true" omitNorms="false"/>
<field name="mpn" type="string" indexed="false" stored="true" multiValued="true"/>
<field name="sku" type="string" indexed="false" stored="true" multiValued="true"/>
<field name="upc" type="string" indexed="false" stored="true" multiValued="true"/>
<field name="Brands" type="cat_text" indexed="true" stored="true" omitNorms="false" termVectors="true"/>
<field name="searchCategory" type="text_en_splitting" indexed="true" omitNorms="false"/>
<field name="category_id" type="int" indexed="true" stored="true"/>
<field name="atom_id" type="int" indexed="false" stored="true"/>
<field name="sale_amount" type="double" indexed="true" stored="true"/>
<field name="image_url" type="string" indexed="false" stored="true"/>
<field name="Categories" type="string" indexed="true" stored="false" termVectors="true"/>
<field name="cataloged" type="string" indexed="false" stored="true"/>
<field name="product_rating" type="double" indexed="true" stored="true"/>
<field name="num_retailers" type="int" indexed="true" stored="true"/>
<field name="retailer_highest_rating" type="double" indexed="true" stored="true"/>
<field name="has_image" type="string" indexed="true" stored="true"/>
<field name="valid_prod_desc" type="string" indexed="true" stored="true"/>
<field name="reviewCount" type="int" indexed="false" stored="true"/>
<field name="offers" type="string" indexed="false" stored="true" multiValued="true"/>
<field name="Ratings" type="int" indexed="true" stored="true"/>
<dynamicField name="f_*" type="string" indexed="true" stored="false"/>
<field name="prod_spell" type="textSpell" indexed="true"/>
<field name="_version_" type="long" indexed="true" stored="true"/>

We have configured ~2000 facets, and the machine configuration is given
below.

6-core processor, 22528 MB (22 GB) of RAM allotted to the JVM. The Solr version is 4.1.0.

Please let me know if you require any more information.




Re: Facing Solr performance during query search

Posted by Erick Erickson <er...@gmail.com>.
Not until you tell us a lot more about your symptoms. What are your
replication intervals? Autowarm settings? How are you measuring
"drastic" reductions? What have you tried in terms of diagnosing
the problem?
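
For reference, the autowarm settings in question are the autowarmCount attributes on the caches in the <query> section of solrconfig.xml. A minimal sketch follows; the cache classes and sizes are illustrative values, not taken from the poster's configuration. Setting autowarmCount to 0 means no entries are carried over from the old searcher's cache when a new searcher is opened.

```xml
<!-- Sketch only: cache definitions inside <query> in solrconfig.xml -->
<query>
  <!-- Caches filter query (fq) results; 0 disables autowarming for this cache -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Caches ordered result lists for queries -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Caches stored fields of documents; not autowarmed, since internal document ids change between searchers -->
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
```

Explicit warming queries are configured separately, through the newSearcher and firstSearcher listeners in the same section.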

Please review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick

