Posted to dev@lucene.apache.org by Janne Majaranta <ja...@gmail.com> on 2011/05/26 14:31:01 UTC

SOLR multithreaded faceting

Hi,

I'm experimenting with making field count faceting ("getFacetFieldCounts()")
multithreaded by defining a Callable for each field to run the faceting
algorithms, and by running these on an ExecutorService backed by a fixed
thread pool with number of threads =
Runtime.getRuntime().availableProcessors().
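
To make the idea concrete, here is a minimal sketch of the pattern described above: one Callable per facet field, submitted to a fixed thread pool sized to the number of available processors. It is not the patch itself. The class name, the countAllFields method and the countOneField function are hypothetical stand-ins for Solr's per-field counting code, and the sketch uses modern Java syntax (lambdas) for brevity. It also creates the pool per call only to stay self-contained; a server would more likely share one pool across requests.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class ParallelFacetSketch {

    // countOneField is a hypothetical stand-in for the per-field faceting
    // algorithm (it is NOT a Solr API): field name -> (term -> count).
    static Map<String, Map<String, Integer>> countAllFields(
            List<String> fields,
            Function<String, Map<String, Integer>> countOneField)
            throws InterruptedException, ExecutionException {

        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One Callable per facet field.
            List<Callable<Map<String, Integer>>> tasks = new ArrayList<>();
            for (String field : fields) {
                tasks.add(() -> countOneField.apply(field));
            }

            // invokeAll blocks until every task has completed and returns
            // the futures in the same order as the submitted tasks.
            List<Future<Map<String, Integer>>> futures = pool.invokeAll(tasks);

            // Collect results in request order so the response stays stable.
            Map<String, Map<String, Integer>> result = new LinkedHashMap<>();
            for (int i = 0; i < fields.size(); i++) {
                result.put(fields.get(i), futures.get(i).get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Dummy counter for illustration only: reports the field name length.
        Function<String, Map<String, Integer>> dummy =
                f -> Collections.singletonMap("nameLength", f.length());

        List<String> fields = new ArrayList<>();
        fields.add("category");
        fields.add("author");

        System.out.println(countAllFields(fields, dummy));
    }
}
```

Each per-field task only reads shared state and returns its own result map, which is what makes this kind of per-field parallelism easy to bolt on. Whether the real counting code is safe to run concurrently is an assumption that would need checking against the actual implementation.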

In initial tests, when faceting over multiple fields, this gives quite a
nice speed improvement for counting the facets. And it is an extremely simple
and small addition to the code.

Test index: 9M docs / 9 GB, faceting over single-valued fields of type
"string":

A = Solr 3.1
B = Solr 3.1 with parallel per field faceting

In these tests "B" had the index opened via a Windows Network Share from
"A".

A => facet.method=fc, facet field count 8, avg. 840ms
B => facet.method=fc, facet field count 8, avg. 118ms

A => facet.method=enum, facet field count 8, avg. 120ms
B => facet.method=enum, facet field count 8, avg. 35ms

A => facet.method=fc, facet field count 4, avg. 470ms
B => facet.method=fc, facet field count 4, avg. 80ms

A => facet.method=enum, facet field count 4, avg. 110ms
B => facet.method=enum, facet field count 4, avg. 30ms

A => facet.method=fc, facet field count 2, avg. 234ms
B => facet.method=fc, facet field count 2, avg. 60ms

A => facet.method=enum, facet field count 2, avg. 80ms
B => facet.method=enum, facet field count 2, avg. 40ms

Tested with Chrome repeating CTRL+F5 randomly...
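
For a quick read of the numbers above, the speedup of B over A for each pair is simply avg(A) / avg(B). The small sketch below computes that ratio for the six pairs exactly as listed. The class and method names are made up for illustration, and the labels take the facet method from the B line of each pair.

```java
import java.util.Locale;

public class FacetSpeedup {

    // Speedup of B over A: how many times faster B answered on average.
    static double speedup(double avgMsA, double avgMsB) {
        return avgMsA / avgMsB;
    }

    public static void main(String[] args) {
        // Labels follow the order of the result pairs in the message.
        String[] labels = {
            "8 fields, fc", "8 fields, enum",
            "4 fields, fc", "4 fields, enum",
            "2 fields, fc", "2 fields, enum"
        };
        // {avg ms for A, avg ms for B}, copied from the reported results.
        int[][] avgMs = {
            {840, 118}, {120, 35},
            {470, 80},  {110, 30},
            {234, 60},  {80, 40}
        };

        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%-15s %4d ms -> %4d ms  (%.1fx)%n",
                    labels[i], avgMs[i][0], avgMs[i][1],
                    speedup(avgMs[i][0], avgMs[i][1]));
        }
    }
}
```

These are rough browser-driven averages, so the ratios are only indicative.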

Now, a couple of questions:
- Is somebody already working on adding multithreading to facet counting?
- Is there a reason not to let the facet counting methods run in
parallel? I.e., is this a bad idea?

I see that some multithreading work is going into Solr 4.0 with per-segment
single-valued faceting. Is there already a bigger plan to make the other
faceting methods multithreaded as well?

Thanks,

Janne

Re: SOLR multithreaded faceting

Posted by Janne Majaranta <ja...@gmail.com>.
OK, the issue and patches for TRUNK and 3.1 are in JIRA now:
https://issues.apache.org/jira/browse/SOLR-2548

Thanks,

Janne



Re: SOLR multithreaded faceting

Posted by Ryan McKinley <ry...@gmail.com>.
Just make a note when you submit the patch...

thanks
ryan




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: SOLR multithreaded faceting

Posted by Janne Majaranta <ja...@gmail.com>.
Yeah, sure. I'll add a JIRA issue and a patch later today. I assume the
patch should be for the TRUNK version. Since I'm experimenting with the 3.1
branch, is there a naming convention for patches for this branch?

-Janne

Re: SOLR multithreaded faceting

Posted by Ryan McKinley <ry...@gmail.com>.
Hi Janne-

This sounds excellent.

> Now, a couple of questions:
> - Is somebody already working on adding multithreading to facet counting?

I don't think so

> - Is there a reason not to let the facet counting methods run in
> parallel? I.e., is this a bad idea?

Like most things, there are tradeoffs -- in some instances spawning
multiple threads to respond to a query may be a bad idea, but in
others it would be great.


Can you open a JIRA issue and attach a patch?

ryan
