Posted to solr-user@lucene.apache.org by yriveiro <ya...@gmail.com> on 2013/11/14 16:46:39 UTC

Document routing question.

Hi,

I read this post http://searchhub.org/2013/06/13/solr-cloud-document-routing
and I have some questions.

When a tenant is too large to fit on one shard, we can specify the number of
bits from the shardKey that we want to use.

If we set a doc's key to "tenant1/4!docXXX", we are saying to spread the docs
over 1/4th of the collection. If the collection has 4 shards, does this mean
that all docs with the same shardKey will go to the same shard, or will 25%
of them go to each shard?

Another question: at query time, must we set the shard.keys param as
"shard.keys=tenant1!" or as "shard.keys=tenant1/4!"?

/Yago



-----
Best regards
--
View this message in context: http://lucene.472066.n3.nabble.com/Document-routing-question-tp4100938.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Document routing question.

Posted by Yago Riveiro <ya...@gmail.com>.
Joel,

Thanks for the explanation.

-- 
Yago Riveiro


On Friday, November 15, 2013 at 2:14 PM, Joel Bernstein wrote:

> Yago,
> 
> Now that I look back at this blog, I see how this can be confusing.
> 
> This is how to break down the composite ID: tenant1/4!docXXX
> 
> "tenant1" is the shardkey.
> 
> "/" is the separator between the shardkey and the number of bits to take
> from the shardkey.
> 
> "4" is the number of bits taken from the shardkey to create the composite
> 32-bit hash code. The other 28 bits come from the unique document ID.
> 
> "!" separates the shardkey (and its bit count) from the unique doc ID.
> 
> "docXXX" is the unique document ID
> 
> This is taken from the blog:
> 
> "This will take 4 bits from the shard key and 28 bits from the unique doc
> id, spreading the tenant over 1/16th of the shards in the collection.
> 
> 3 bits would spread the tenant over 1/8th of the collection.
> 2 bits would spread the tenant over 1/4th of the collection.
> 1 bit would spread the tenant over 1/2 the collection.
> 0 bits would spread the tenant across the entire collection."
> 
> 
> You do have to specify the bits at query time as well so Solr knows which
> shards to query.
> 
> Joel



Re: Document routing question.

Posted by Joel Bernstein <jo...@gmail.com>.
Yago,

Now that I look back at this blog, I see how this can be confusing.

This is how to break down the composite ID: tenant1/4!docXXX

"tenant1" is the shardkey.

"/" is the separator between the shardkey and the number of bits to take
from the shardkey.

"4" is the number of bits taken from the shardkey to create the composite
32-bit hash code. The other 28 bits come from the unique document ID.

"!" separates the shardkey (and its bit count) from the unique doc ID.

"docXXX" is the unique document ID
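
The breakdown above can be sketched in Python. This is only an illustration
of the bit layout, not Solr's code: the function names are made up for this
example, zlib.crc32 stands in for whatever hash function Solr actually uses,
and the "/bits" part is assumed to be present in the ID.

```python
import zlib


def parse_composite_id(composite_id):
    """Split 'tenant1/4!docXXX' into (shard_key, bits, doc_id).

    Assumes the ID always has the form shardkey/bits!docid.
    """
    prefix, doc_id = composite_id.split("!", 1)
    shard_key, bits_text = prefix.split("/", 1)
    return shard_key, int(bits_text), doc_id


def composite_hash(composite_id):
    """Build a 32-bit value: top `bits` bits from the shard key hash,
    the remaining (32 - bits) bits from the document ID hash.

    zlib.crc32 is a stand-in hash chosen for this sketch only.
    """
    shard_key, bits, doc_id = parse_composite_id(composite_id)
    key_mask = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF  # upper `bits` bits
    doc_mask = 0xFFFFFFFF >> bits                        # lower 32 - bits
    key_hash = zlib.crc32(shard_key.encode("utf-8"))
    doc_hash = zlib.crc32(doc_id.encode("utf-8"))
    return (key_hash & key_mask) | (doc_hash & doc_mask)
```

With 4 bits, every document of "tenant1" shares the same top 4 bits of the
composite value, so all of them land in the same 1/16th slice of the hash
range, while the lower 28 bits vary per document.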

This is taken from the blog:

"This will take 4 bits from the shard key and 28 bits from the unique doc
id, spreading the tenant over 1/16th of the shards in the collection.

3 bits would spread the tenant over 1/8th of the collection.
2 bits would spread the tenant over 1/4th of the collection.
1 bit would spread the tenant over 1/2 the collection.
0 bits would spread the tenant across the entire collection."
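
The arithmetic behind the list quoted above is 1/2^bits. A small sketch,
assuming (this is not stated in the blog excerpt) that the collection has a
power-of-two number of shards with equal hash ranges; tenant_spread is a
hypothetical helper name:

```python
import math
from fractions import Fraction


def tenant_spread(bits, num_shards):
    """Return (fraction of the hash range, number of shards touched).

    Assumes num_shards is a power of two and that shards split the hash
    range evenly, so a tenant's slice never straddles a shard boundary.
    """
    fraction = Fraction(1, 2 ** bits)
    shards_touched = math.ceil(fraction * num_shards)
    return fraction, shards_touched


for bits in (4, 3, 2, 1, 0):
    fraction, shards = tenant_spread(bits, num_shards=4)
    print(f"{bits} bits: {fraction} of the collection, {shards} of 4 shards")
```

Under those assumptions, "tenant1/4!" on a 4-shard collection covers 1/16th
of the hash range, which fits inside a single shard; it takes 1 bit to reach
two shards and 0 bits to reach all four.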


You do have to specify the bits at query time as well, so that Solr knows
which shards to query: use "shard.keys=tenant1/4!" rather than
"shard.keys=tenant1!".
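
As a rough illustration of passing the bits at query time, the sketch below
builds a query URL that carries the shard.keys parameter. The host, the
collection name, and the routed_query_url helper are placeholders invented
for this example; only the shard.keys parameter and its value come from the
thread.

```python
from urllib.parse import urlencode


def routed_query_url(base_url, collection, query, shard_key, bits):
    """Build a select URL whose shard.keys value includes the bit count."""
    route = f"{shard_key}/{bits}!"
    params = urlencode({"q": query, "shard.keys": route})
    return f"{base_url}/{collection}/select?{params}"


# Hypothetical host and collection name, for illustration only.
url = routed_query_url("http://localhost:8983/solr", "collection1",
                       "*:*", "tenant1", 4)
print(url)
```

urlencode percent-encodes the "/" and "!" characters, so the value is
decoded back to "tenant1/4!" on the receiving side.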

Joel








-- 
Joel Bernstein
Search Engineer at Heliosearch