Posted to solr-user@lucene.apache.org by Steve Pruitt <bp...@opentext.com> on 2018/01/05 19:41:17 UTC

document colocation

I have two document types that share several fields.  We currently plan a single index for both types.  One of the shared fields contains a value that correlates two document instances, i.e. two documents, one of each type, have the same value.  The values are random integers.

We would like each correlated document pair to be indexed in the same shard.

Document routing doesn't seem like it will work.

My development cluster has three SolrCloud nodes: 3 shards, 1 replica.
As a simple exercise, I defined a collection with implicit routing on the correlated field and ended up with a single shard.  I was primarily curious to see what I would get.

Are there other possibilities to ensure two documents sharing a field value can be indexed in the same shard?


Thanks.

-S

Re: document colocation

Posted by Erick Erickson <er...@gmail.com>.
Why do you want to do this? This feels like an XY problem: you're asking
how to do X (colocate the docs) without explaining why it's valuable (the
Y).

I'm skeptical that this buys you enough to be worth the hassle, which is
why I'm asking about Y.

Theoretically, at least, you might be able to use composite ID routing with
your random number that correlates docs, see:

https://lucidworks.com/2014/01/06/multi-level-composite-id-routing-solrcloud/

Warning: Haven't tried this so test it out.
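
A rough sketch of what that could look like on the indexing side, assuming the
collection uses the compositeId router and that the document id can be built
from the correlating value. With compositeId, the part of the id before the "!"
drives shard selection, so two ids with the same prefix should land on the same
shard. The field names (corr_id, doc_type), the type labels, and both helper
functions are made up for illustration; this only builds the ids and does not
talk to Solr.

```python
# Sketch only: build compositeId-style ids so both documents of a
# correlated pair share the same routing prefix. All names are hypothetical.

def make_composite_id(correlation_value: int, doc_type: str, local_id: str) -> str:
    """Return '<correlation>!<type>-<local_id>'; the part before '!' is the route key."""
    return f"{correlation_value}!{doc_type}-{local_id}"


def route_key(composite_id: str) -> str:
    """Return the prefix before '!' that the compositeId router uses for routing."""
    prefix, sep, _ = composite_id.partition("!")
    if not sep:
        raise ValueError(f"no '!' separator in id: {composite_id!r}")
    return prefix


# Two documents of different types that share the same correlating integer.
doc_a = {"id": make_composite_id(48213, "typeA", "0001"),
         "corr_id": 48213, "doc_type": "typeA"}
doc_b = {"id": make_composite_id(48213, "typeB", "0001"),
         "corr_id": 48213, "doc_type": "typeB"}

# Same prefix, so the router should place both on the same shard.
same_route = route_key(doc_a["id"]) == route_key(doc_b["id"])
```

The type label is kept in the id after the "!" so the two documents of a pair
still get distinct unique keys.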

Best,
Erick
