Posted to solr-user@lucene.apache.org by hemanth <k....@gmail.com> on 2018/01/02 10:27:22 UTC

RE: How to routing document for send to particular shard range

Hi Ketan,

I also tried various ways to route documents to different shards based on
a routing key value. For example, documents with status active, inactive,
and terminated should go to three different shards. I tried creating both
implicit and compositeId routers, but I could not route the documents to
the shard I wanted. The only thing we can achieve is that documents are
routed based on the hash of the field values. This happens automatically,
so it does not help to manually route to the shard we need. The API
documentation looks a little fuzzy, and I think Solr will not route
documents to a desired shard manually. I am referring to version 6.6. I
also tried creating a dummy "_route_" field and copying my status into it,
but no luck. If by any chance you found a solution, please let me know. I
think this is an important feature that could be enhanced. Creating
different collections just because one field differs is not a good option.
For example, if we have sales documents, we want to partition them by
sales country, i.e. USA sales in one shard, Canada sales in another, and
so on. For this case, we need one collection with many shards, and each
shard should contain only the data that belongs to that particular shard.

Thanks
Hemanth




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: How to routing document for send to particular shard range

Posted by Erick Erickson <er...@gmail.com>.
bq: Only thing which we can achieve is , documents will be routed
based on the hash values of the field values.

Then you have created your collection with compositeId routing or have
some other misconfiguration. You _must_ create your collection with
"router.name=implicit".

Rather than _tell_ us what you're doing, please _show_.
1> the exact command you use to create your collection
2> the results of the collections API CLUSTERSTATUS command:
https://lucene.apache.org/solr/guide/6_6/collections-api.html
3> An example document and where you think it should be routed.
4> Where it actually ends up.
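
For item 2>, here is a minimal sketch of how one might build the CLUSTERSTATUS request and read the router name back out of the response. The base URL and the collection name "eoe" are taken from the example further down in this message. The nesting of the response (cluster -> collections -> name -> router -> name) is an assumption about the JSON layout, and the sample dict is illustrative only, not real Solr output.

```python
from urllib.parse import urlencode


def clusterstatus_url(base_url, collection):
    """Build a Collections API CLUSTERSTATUS URL for one collection."""
    params = {"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"}
    return f"{base_url}/admin/collections?{urlencode(params)}"


def router_name(status, collection):
    """Read the router name from a parsed CLUSTERSTATUS response.

    Assumes the layout cluster -> collections -> <name> -> router -> name.
    """
    return status["cluster"]["collections"][collection]["router"]["name"]


# Illustrative fragment only; a real response carries many more keys.
sample = {"cluster": {"collections": {"eoe": {"router": {"name": "implicit"}}}}}

print(clusterstatus_url("http://localhost:8983/solr", "eoe"))
print(router_name(sample, "eoe"))
```

If the router name comes back as "compositeId", documents are being placed by hash, which would match the behaviour hemanth describes.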

Again, you use the "active" tag as both the value of the route field and
the name of the shard. You can name the shards as you choose, of course.

This works as I expect (Solr 6.3)

Create command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=eoe&router.name=implicit&router.field=rfield&collection.configName=eoe&shards=active,inactive,terminated
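
The same create command can be assembled programmatically, which makes it harder to mistype a parameter. This is only a sketch: the function name is hypothetical, the parameter values are the ones from the command above, and the URL is built but not sent.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, route_field, shard_names, config_name):
    """Build a CREATE URL for a collection that uses the implicit router."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",        # required for routing by shard name
        "router.field": route_field,      # field whose value names the target shard
        "collection.configName": config_name,
        "shards": ",".join(shard_names),  # shard names double as route values
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


url = create_collection_url(
    "http://localhost:8983/solr",
    "eoe",
    "rfield",
    ["active", "inactive", "terminated"],
    "eoe",
)
print(url)
```

Note that urlencode percent-encodes the commas in the shard list (%2C); that is an equivalent encoding of the same value.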

rfield definition (not sure whether stored="true" or indexed="true"
are required)
<field name="rfield" type="string" indexed="true" stored="true"
required="true" multiValued="false" />


Query to check whether a doc is on a specific shard. Note that
&distrib=false restricts the query to the core indicated:
http://localhost:8983/solr/eoe_terminated_replica1/query?q=*:*&distrib=false
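
To run that check against every shard, one could generate one non-distributed query URL per core. A small sketch, assuming the core names follow the eoe_<shard>_replica1 pattern seen in the URL above; core naming can differ between installs and versions, so it is worth confirming the actual names via CLUSTERSTATUS first.

```python
from urllib.parse import urlencode


def core_query_url(base_url, core, query="*:*"):
    """Build a query URL that hits a single core only (distrib=false)."""
    params = {"q": query, "distrib": "false"}
    return f"{base_url}/{core}/query?{urlencode(params)}"


# Assumed core naming pattern, copied from the example URL in this message.
for shard in ("active", "inactive", "terminated"):
    print(core_query_url("http://localhost:8983/solr", f"eoe_{shard}_replica1"))
```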

Example XML docs:
<add>
<doc>
  <field name="id">doc1</field>
  <field name="rfield">active</field>
</doc>

<doc>
  <field name="id">doc2</field>
  <field name="rfield">inactive</field>
</doc>

<doc>
  <field name="id">doc3</field>
  <field name="rfield">terminated</field>
</doc>

</add>
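
For more than a handful of documents, the <add> payload above could be generated rather than written by hand. A minimal sketch using only the standard library; the helper name is hypothetical and the field names are the ones from the example. Sending it is left out; it would typically be POSTed to the collection's /update handler with an XML content type.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs, route_field="rfield"):
    """Build an <add> payload from (id, route_value) pairs.

    With the implicit router and router.field set, route_value must match
    the name of an existing shard.
    """
    add = ET.Element("add")
    for doc_id, route_value in docs:
        doc = ET.SubElement(add, "doc")
        ET.SubElement(doc, "field", name="id").text = doc_id
        ET.SubElement(doc, "field", name=route_field).text = route_value
    return ET.tostring(add, encoding="unicode")


payload = build_add_xml(
    [("doc1", "active"), ("doc2", "inactive"), ("doc3", "terminated")]
)
print(payload)
```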


Best,
Erick

On Tue, Jan 2, 2018 at 5:39 AM, Susheel Kumar <su...@gmail.com> wrote:
> Hi Ketan,
>
> I believe you need multiple shard looking the count 800M.  How much will be
> the index size?   Assume it comes out to 400G and assume your VM/machines
> has 64GB and practically you want to fit your index into memory for each
> shard... With that I would create 10shards on 10 machines (40 GB index on
> each with some buffer for growth).  Also utilize _route_ parameter for your
> queries to be faster.
>
> Thnx
>

Re: How to routing document for send to particular shard range

Posted by Susheel Kumar <su...@gmail.com>.
Hi Ketan,

I believe you need multiple shards, looking at the count of 800M. How much
will the index size be? Assume it comes out to 400G, assume your
VMs/machines have 64GB each, and assume that in practice you want each
shard's index to fit into memory... With that I would create 10 shards on
10 machines (40 GB of index on each, with some buffer for growth). Also
use the _route_ parameter to make your queries faster.

Thnx
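
The sizing arithmetic above, plus a query that uses _route_, can be sketched as follows. The numbers are the assumed ones from this message (400 GB index, 40 GB per shard, 64 GB per machine), the function names are hypothetical, and "headroom" here is simply RAM minus index size, which still has to cover the JVM heap and the OS. The route value "active" assumes a collection created with the implicit router and a shard of that name.

```python
import math
from urllib.parse import urlencode


def shard_plan(index_gb, target_gb_per_shard, ram_gb_per_machine):
    """Return (shard count, GB of index per shard, GB of RAM left per machine)."""
    shards = math.ceil(index_gb / target_gb_per_shard)
    per_shard = index_gb / shards
    headroom = ram_gb_per_machine - per_shard
    return shards, per_shard, headroom


def routed_query_url(base_url, collection, query, route):
    """Build a query URL that restricts the search via the _route_ parameter."""
    params = {"q": query, "_route_": route}
    return f"{base_url}/{collection}/select?{urlencode(params)}"


print(shard_plan(400, 40, 64))  # (10, 40.0, 24.0)
print(routed_query_url("http://localhost:8983/solr", "eoe", "*:*", "active"))
```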
