Posted to solr-user@lucene.apache.org by Daniel Bruegge <da...@googlemail.com> on 2012/01/17 13:49:19 UTC

really slow performance when trying to get facet.field

Hi,

I have 2 Solr shards. One holds approx. 25 million documents (local
index 6GB), the other 10 million documents (2.7GB).
I am trying to create some kind of 'word cloud' to see the frequency of
words for a *text_general* field.
For this I am currently faceting on this field, and I am also
restricting the documents with some other filters in the query.

The performance is really bad for the first call and then pretty fast for
the following calls.

The maximum Java heap size is 3G for each shard. Both shards are running on
the same physical server which has 12G RAM.

Question: Should I reduce the number of documents in a shard, so that the
index is equal to or smaller than the Java heap size for this shard? Or is
there another way to avoid these slow calls?

Thank you

Daniel

Re: really slow performance when trying to get facet.field

Posted by Daniel Bruegge <da...@bruegge.eu>.

Evictions are 0 for all cache types.

Your server's max heap of 12G is pretty large, which is good, I think.
The CPU on my server is an 8-core Intel i7 965.

Commit frequency is low, because new shards are added and old shards are
kept for historical reasons. Old shards will then be cleaned up after a
couple of months.

I will try a maximum of 15 million documents per shard and see what happens.

The thing is that I will add more shards over time, so that I can handle
maybe 500-800 million documents. Maybe more. It depends.


Re: really slow performance when trying to get facet.field

Posted by Dmitry Kan <dm...@gmail.com>.

Sounds good! So the takeaway here is to remember cache pre-warming.
And, of course, to keep track of RAM allocation :)

-- 
Regards,

Dmitry Kan

Re: really slow performance when trying to get facet.field

Posted by Daniel Bruegge <da...@googlemail.com>.

OK, I have now set up static warming in solrconfig.xml using the
firstSearcher and newSearcher listeners.
"content" is the field I facet on. Commits now take longer, which is OK
for me, but searches are much faster. I also reduced the number of
documents to 15 million per shard, so each index is about 3.5G, which I
hope also fits in memory.

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="facet">true</str>
          <str name="facet.field">content</str>
          <str name="facet.limit">1</str>
          <str name="facet.mincount">1</str>
        </lst>
      </arr>
    </listener>
    <listener event="firstSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="facet">true</str>
          <str name="facet.field">content</str>
          <str name="facet.limit">1</str>
          <str name="facet.mincount">1</str>
        </lst>
      </arr>
    </listener>
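
Since the real queries also restrict documents with filters, a warming
query can carry a representative fq so that the filter cache is populated
as well. The following is only a sketch under assumptions: the field
"doctype" and its value are hypothetical placeholders, not something from
this setup, and facet.limit is simply set to the value used at query time.
Setting useColdSearcher to false makes requests wait until the first
searcher has finished warming instead of hitting a cold one.

```xml
<!-- Sketch only: the fq below is a hypothetical placeholder. -->
<query>
  <!-- Block requests until the first searcher is warmed -->
  <useColdSearcher>false</useColdSearcher>

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <!-- hypothetical filter; use one that real queries send -->
        <str name="fq">doctype:article</str>
        <str name="facet">true</str>
        <str name="facet.field">content</str>
        <str name="facet.limit">100</str>
        <str name="facet.mincount">1</str>
      </lst>
    </arr>
  </listener>
</query>
```

The same lst entry could be repeated in the newSearcher listener, at the
cost of longer commits.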



Re: really slow performance when trying to get facet.field

Posted by Dmitry Kan <dm...@gmail.com>.

Hi Daniel,

My index is 6.5G. I'm sure it can be bigger. The facet.limit we ask for is
beyond 100 thousand, and responses are still sub-second. I run it with
-Xms1024m -Xmx12000m under Tomcat; it currently takes 5.4G of RAM. The
number of docs is over 6.5 million.

Do you see any evictions in your caches? What kind of server is it, in
terms of CPU and OS? How often do you commit to the index?

Dmitry

-- 
Regards,

Dmitry Kan

Re: really slow performance when trying to get facet.field

Posted by Daniel Bruegge <da...@googlemail.com>.

Hi Dmitry,

I had everything on one Solr instance before, but this got too heavy, and I
had the same issue there: the first facet query was really slow.

When querying the facet:
- facet.limit = 100

Cache settings are like this:

    <filterCache class="solr.FastLRUCache"
                 size="16384"
                 initialSize="4096"
                 autowarmCount="4096"/>

    <queryResultCache class="solr.LRUCache"
                     size="512"
                     initialSize="512"
                     autowarmCount="0"/>

    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>

How big was your index? Did it fit into the RAM which you gave the Solr
instance?

Thanks
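
For reference, the three caches above are not the only ones involved in
field faceting. For a tokenized field, Solr's default facet method typically
builds an un-inverted view of the field on first use and keeps it in the
fieldValueCache, which may explain a slow first call with zero evictions
elsewhere. A minimal sketch of declaring that cache explicitly follows; the
sizes are illustrative assumptions, not tuned values.

```xml
<!-- Sketch only: sizes are illustrative, not recommendations. -->
<fieldValueCache class="solr.FastLRUCache"
                 size="512"
                 autowarmCount="128"
                 showItems="32"/>
```

With a non-zero autowarmCount, the entry for a faceted field should be
rebuilt when a new searcher opens, so the cost moves from the first query
to the commit.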


Re: really slow performance when trying to get facet.field

Posted by Dmitry Kan <dm...@gmail.com>.

I had a similar problem for a similar task. In my case, merging the
results from two shards turned out to be the culprit. If you can logically
store your data in just one shard, your faceting should become faster.
Size-wise it should not be a problem for Solr.

Also, you didn't say anything about the facet.limit value, cache
parameters, or usage of filter queries. Some of these can be interconnected.

Dmitry
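
To make those parameters easy to inspect, one option is to pin them as
defaults on a dedicated request handler. This is only a sketch: the handler
name "/wordcloud" is hypothetical, and "content" stands in for whatever
field is being faceted on.

```xml
<!-- Sketch only: handler name is a hypothetical example. -->
<requestHandler name="/wordcloud" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- only facet counts are needed, not documents -->
    <int name="rows">0</int>
    <str name="facet">true</str>
    <str name="facet.field">content</str>
    <int name="facet.limit">100</int>
    <int name="facet.mincount">1</int>
  </lst>
</requestHandler>
```

Filter queries can still be passed per request as fq parameters; each
distinct fq gets its own entry in the filterCache.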

-- 
Regards,

Dmitry Kan