Posted to solr-user@lucene.apache.org by Jean-Sebastien Vachon <je...@wantedanalytics.com> on 2012/04/20 03:04:13 UTC

Solr Cloud vs sharding vs grouping

Hi All,

I am currently trying out SolrCloud on a small cluster and I'm enjoying Solr more than ever. Thanks to all the contributors.

That being said, one very important feature for us is the grouping/collapsing of results on a specific field value on a distributed index. We are currently using Solr 1.4 with Patch 236 and it does the job as long as all documents with a common field value are on the same shard. Otherwise grouping on a distributed index will not work as expected.

I looked everywhere to find out whether this limitation is still present in trunk, but found no mention of it.
Is it still a requirement for grouping results on a distributed index?

Thanks


Re: Solr Cloud vs sharding vs grouping

Posted by Martijn v Groningen <ma...@gmail.com>.
Hi Jean-Sebastien,

For some grouping features (such as the total group count and grouped
faceting), distributed grouping requires that all documents belonging
to the same group are indexed on the same shard. In other words, a
group can't span shards; otherwise the group counts or grouped facet
counts may be incorrect. The basic grouping functionality does not
have this limitation.
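
To make the distinction concrete, here is a rough sketch of a distributed
grouping request that asks for the total group count, which is one of the
features that needs each group to live on a single shard. The parameters
group, group.field, group.ngroups and shards are standard Solr request
parameters; the helper name, host names and the field name company_id are
made up for illustration.

```python
from urllib.parse import urlencode


def grouped_query_url(base_url, group_field, shards):
    """Build a /select URL for a distributed grouping request.

    base_url, group_field and shards are illustrative inputs; only the
    parameter names themselves come from Solr.
    """
    params = {
        "q": "*:*",
        "group": "true",
        "group.field": group_field,
        # Total number of groups: only reliable when no group spans shards.
        "group.ngroups": "true",
        # Comma-separated list of shards to query.
        "shards": ",".join(shards),
    }
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(grouped_query_url(
        "http://localhost:8983/solr",
        "company_id",
        ["host1:8983/solr", "host2:8983/solr"],
    ))
```

Leaving out group.ngroups gives the basic grouping request, which does not
depend on how the documents are partitioned.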

I think SolrCloud currently partitions documents based on the
unique id (id % number_shards). You would need to change that so
documents are assigned by the grouping field instead, or do the
distributed indexing yourself.
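
If you do the distributed indexing yourself, one possible approach is to
pick the shard from a hash of the grouping field value rather than from
the unique id, so that all documents of a group end up together. This is
only a minimal sketch of that idea, not SolrCloud's own routing; the
function names and the choice of CRC32 are assumptions made for the
example.

```python
import zlib


def shard_for_group(group_value, num_shards):
    """Map a grouping field value to a shard index in [0, num_shards).

    CRC32 is used because it is stable across processes and runs, unlike
    Python's built-in hash() for strings. Any stable hash would do.
    """
    return zlib.crc32(group_value.encode("utf-8")) % num_shards


def partition_by_group(docs, group_field, num_shards):
    """Split docs (dicts) into per-shard lists keyed on group_field."""
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        index = shard_for_group(str(doc[group_field]), num_shards)
        shards[index].append(doc)
    return shards


if __name__ == "__main__":
    docs = [
        {"id": 1, "company": "a"},
        {"id": 2, "company": "b"},
        {"id": 3, "company": "a"},
    ]
    for index, shard_docs in enumerate(partition_by_group(docs, "company", 2)):
        print(index, [d["id"] for d in shard_docs])
```

Each per-shard list would then be sent to the corresponding Solr instance
by your own indexing code. Note that changing the number of shards changes
the mapping, so the index would have to be rebuilt in that case.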

Martijn

On 20 April 2012 12:07, Lance Norskog <go...@gmail.com> wrote:
> The implementation of grouping in the trunk is completely different
> from 236. Grouping works across distributed search:
> https://issues.apache.org/jira/browse/SOLR-2066
>
> committed last September.



-- 
Kind regards,

Martijn van Groningen

Re: Solr Cloud vs sharding vs grouping

Posted by Lance Norskog <go...@gmail.com>.
The implementation of grouping in the trunk is completely different
from 236. Grouping works across distributed search:
https://issues.apache.org/jira/browse/SOLR-2066

committed last September.




-- 
Lance Norskog
goksron@gmail.com