Posted to solr-user@lucene.apache.org by Aeroox Aeroox <ae...@gmail.com> on 2012/11/12 20:39:38 UTC
How to speed up Facet count (Big index) ??!!!!
Hi folks,
I have a Solr index with up to 50M documents. Each document contains 62 fields
(docid, name, location....).
A facet count takes 1 to 2 minutes with these params:
http://XXXX.../select/?q=solr&version=2.2&start=0&rows=0&facet=true&facet.limit=6&facet.mincount=1&mm=3<-1&facet.field=schoolname_hl&facet.method=fc
Here is how schoolname_hl is defined in my Solr schema:
<field name="schoolname_hl" type="text_hl" indexed="true" stored="false"
multiValued="true"/>
text_hl is defined like this:
<fieldType name="text_hl" class="solr.TextField" sortMissingLast="true"
           omitNorms="true">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
And my cache policy:
<filterCache class="solr.FastLRUCache"
             size="4096"
             initialSize="4096"
             autowarmCount="4096"/>
<queryResultCache class="solr.LRUCache"
                  size="5000"
                  initialSize="5000"
                  autowarmCount="5000"/>
<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>
Am I doing something wrong?
How can I speed up the facet count in my case?
For the record:
* I'm using Solr 1.4 (LUCENE_36)
* 64GB RAM (with 60GB allocated to Java/Tomcat 6)
Thanks in advance.
With love from Paris
Re: How to speed up Facet count (Big index) ??!!!!
Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,
Have you tried the other facet method or newer Solr?
Otis
--
Performance Monitoring - http://sematext.com/spm
Re: How to speed up Facet count (Big index) ??!!!!
Posted by Upayavira <uv...@odoko.co.uk>.
I'd say you are at a point where sharding may well help. But, as others
have suggested, you have other issues to consider first: allocate less
memory to Solr and upgrade to a more recent Solr version.
Also, if, as Yonik asks, only the first query is slow, you can set up a
newSearcher query in your solrconfig.xml to run this first query on
every commit, meaning your users will always get the faster queries.
Upayavira
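A newSearcher warming query along the lines suggested above could look roughly like this in solrconfig.xml. This is only a sketch: the newSearcher event and QuerySenderListener are standard Solr, but the query parameters are simply copied from the request in the original post, and whether warming alone is enough at this index size has not been tested here.

```xml
<!-- Goes inside the <query> section of solrconfig.xml. -->
<!-- Runs the facet request whenever a commit opens a new searcher,
     so the facet structures for the field are built before users query. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">solr</str>
      <str name="rows">0</str>
      <str name="facet">true</str>
      <str name="facet.field">schoolname_hl</str>
      <str name="facet.method">fc</str>
      <str name="facet.limit">6</str>
      <str name="facet.mincount">1</str>
    </lst>
  </arr>
</listener>
```

A matching firstSearcher listener with the same queries would cover the searcher opened at startup, when there is no previous searcher to autowarm from.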
Re: How to speed up Facet count (Big index) ??!!!!
Posted by Aeroox Aeroox <ae...@gmail.com>.
Thanks Yonik.
Should I consider sharding in this case (currently I have one big index
with replication)? Or create 2 indexes (one for search and the other for
faceting, on a different machine)?
Thanks folks
With love from Paris (it's raining today :(
Re: How to speed up Facet count (Big index) ??!!!!
Posted by Yonik Seeley <yo...@lucidworks.com>.
On Mon, Nov 12, 2012 at 8:39 PM, Aeroox Aeroox <ae...@gmail.com> wrote:
> Hi folks,
>
> I have a solr index with up to 50M documents. A document contain 62 fields
> (docid, name, location....).
>
> The facet count took 1 to 2 minutes with this params :
>
> http://XXXX.../select/?q=solr&
> version=2.2&start=0&rows=0&facet=true&facet.limit=6&facet.mincount=1&mm=3<-1&facet.field=schoolname_hl&facet.method=fc
It should hopefully just take that long the first time? How much time
does it take to facet on the same field subsequent times?
> And my cache policy :
>
> <filterCache class="solr.FastLRUCache"
> size="4096"
> initialSize="4096"
> autowarmCount="4096"/>
>
> <queryResultCache class="solr.LRUCache"
> size="5000"
> initialSize="5000"
> autowarmCount="5000"/>
These are relatively big caches - consider reducing them if you can.
Especially the filter cache, depending on what percent of the entries
are bitsets.
Worst case would be 50M bits / 8 * 4096 entries = ~25GB of bitsets.
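To make the arithmetic concrete: a full bitset needs one bit per document, so 50M documents is about 6.25MB per entry, and 4096 such entries come to roughly 25.6GB. A reduced configuration might look like the sketch below. The sizes are illustrative assumptions, not values recommended in this thread; the right numbers depend on how many distinct filters are actually in use.

```xml
<!-- Illustrative sizes only (assumed, not taken from the thread). -->
<!-- Worst case for filterCache: 50,000,000 bits / 8 = 6.25MB per bitset entry;
     512 entries * 6.25MB = ~3.2GB, versus ~25.6GB for 4096 entries. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>

<!-- queryResultCache entries are ordered lists of doc ids, far smaller than bitsets;
     a lower autowarmCount mainly shortens warm-up after each commit. -->
<queryResultCache class="solr.LRUCache"
                  size="1024"
                  initialSize="1024"
                  autowarmCount="256"/>
```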
> * i'm using solr 1.4 (LUCENE_36)
> * 64GB Ram (with 60GB allocated to java/tomcat6)
Reduce this if you can - it doesn't leave enough memory for the OS to
cache the index files and can contribute to slowness (more disk IO).
-Yonik
http://lucidworks.com