Posted to solr-user@lucene.apache.org by Aeroox Aeroox <ae...@gmail.com> on 2012/11/12 20:39:38 UTC

How to speed up Facet count (Big index) ??!!!!

Hi folks,

I have a Solr index with up to 50M documents. Each document contains 62 fields
(docid, name, location, ...).

The facet count takes 1 to 2 minutes with these params:

http://XXXX.../select/?q=solr&version=2.2&start=0&rows=0&facet=true&facet.limit=6&facet.mincount=1&mm=3<-1&facet.field=schoolname_hl&facet.method=fc

Here is how schoolname_hl is defined in my Solr schema:

<field name="schoolname_hl" type="text_hl" indexed="true" stored="false" multiValued="true"/>

text_hl is defined like this:

<fieldType name="text_hl" class="solr.TextField" sortMissingLast="true" omitNorms="true">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="synonyms.txt"
                ignoreCase="true"
                expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="synonyms.txt"
                ignoreCase="true"
                expand="true"/>
        <filter class="solr.StopFilterFactory"
                ignoreCase="true"
                words="stopwords.txt"
                enablePositionIncrements="true"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
</fieldType>


And my cache policy:

<filterCache class="solr.FastLRUCache"
             size="4096"
             initialSize="4096"
             autowarmCount="4096"/>

<queryResultCache class="solr.LRUCache"
                  size="5000"
                  initialSize="5000"
                  autowarmCount="5000"/>

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>


Am I doing something wrong?

How can I speed up the facet count in my case?

For the record:

* I'm using Solr 1.4 (LUCENE_36)
* 64GB RAM (with 60GB allocated to java/tomcat6)

Thanks in advance.

With love from Paris

Re: How to speed up Facet count (Big index) ??!!!!

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

Have you tried the other facet method (facet.method=enum) or a newer Solr?

Otis
--
Performance Monitoring - http://sematext.com/spm

Re: How to speed up Facet count (Big index) ??!!!!

Posted by Upayavira <uv...@odoko.co.uk>.
I'd say you are at a point where sharding may well help. But, as others
have suggested, you have other issues to consider first: give less memory
to the Solr JVM and upgrade to a more modern Solr.

Also, if (as Yonik asks) only the first query is slow, you can set up a
newSearcher warming query in your solrconfig.xml that runs this first query
after every commit, so the cost is paid during warming rather than by your
users.
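
As a rough illustration, a warming listener along these lines could go inside the <query> section of solrconfig.xml. It is a minimal sketch that simply reuses the facet parameters from the original request; the query term and the values are placeholders to adapt, not a tested configuration.

```xml
<!-- Sketch only: replays the slow facet request whenever a new searcher is opened -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <!-- Parameters copied from the request in the original post -->
      <str name="q">solr</str>
      <str name="rows">0</str>
      <str name="facet">true</str>
      <str name="facet.field">schoolname_hl</str>
      <str name="facet.limit">6</str>
      <str name="facet.mincount">1</str>
      <str name="facet.method">fc</str>
    </lst>
  </arr>
</listener>
```

A second listener with event="firstSearcher" and the same queries would cover the first searcher opened at startup, when there is no previous searcher to autowarm from.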

Upayavira


Re: How to speed up Facet count (Big index) ??!!!!

Posted by Aeroox Aeroox <ae...@gmail.com>.
Thanks, Yonik.

Should I consider sharding in this case (I currently have one big index
with replication)? Or create two indexes (one for search and another for
faceting, on a different machine)?

Thanks folks

With love from Paris (it's raining today :(


Re: How to speed up Facet count (Big index) ??!!!!

Posted by Yonik Seeley <yo...@lucidworks.com>.
On Mon, Nov 12, 2012 at 8:39 PM, Aeroox Aeroox <ae...@gmail.com> wrote:
> Hi folks,
>
> I have a solr index with up to 50M documents. A document contain 62 fields
> (docid, name, location....).
>
> The facet count took 1 to 2 minutes with this params :
>
> http://XXXX.../select/?q=solr&version=2.2&start=0&rows=0&facet=true&facet.limit=6&facet.mincount=1&mm=3<-1&facet.field=schoolname_hl&facet.method=fc

Hopefully it only takes that long the first time? How much time
does it take to facet on the same field on subsequent requests?

> And my cache policy :
>
> <filterCache class="solr.FastLRUCache"
>                  size="4096"
>                  initialSize="4096"
>                  autowarmCount="4096"/>
>
>     <queryResultCache class="solr.LRUCache"
>                      size="5000"
>                      initialSize="5000"
>                      autowarmCount="5000"/>

These are relatively big caches - consider reducing them if you can.
Especially the filter cache, depending on what percent of the entries
are bitsets.
Worst case would be 50M / 8 * 4096 = 25GB of bitsets.
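
To spell out the arithmetic: one bitset over 50M documents needs 50,000,000 / 8 = 6.25 MB, so 4096 cached bitsets could reach roughly 25.6 GB. A smaller configuration might look like the sketch below. The sizes are illustrative assumptions, not values recommended anywhere in this thread, and would need tuning against the observed cache hit rates.

```xml
<!-- Illustrative sizes only: smaller caches with partial autowarming -->
<!-- Worst case for the filter cache here: 512 * 6.25 MB = 3.2 GB of bitsets -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="64"/>
```

Lower autowarmCount values also shorten the time each new searcher takes to become available after a commit.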

> * i'm using solr 1.4 (LUCENE_36)
> * 64GB Ram (with 60GB allocated to java/tomcat6)

Reduce this if you can - it doesn't leave enough memory for the OS to
cache the index files and can contribute to slowness (more disk IO).

-Yonik
http://lucidworks.com