Posted to solr-user@lucene.apache.org by hemanth <k....@gmail.com> on 2017/12/15 19:17:40 UTC

Re: indexing data to solrcloud with "implicit" is not distributing across cluster.

I created a collection with the implicit routing mechanism, and my shard names
are Active and Terminated; these are the values of one of my collection's
fields, Status. But when I upload a document through the Documents section of
the Solr UI (in JSON format, with all the fields, including Status set to
either Terminated or Active), it goes to only one default shard. I tried
adding a _route_ field with the value "Terminated", and when I try to insert
the document, I get:

*unknown field '_route_' Error from server*. Am I doing this the correct way?
Does implicit routing work on the hash value of the routing field, rather
than sending the document to the shard named by the value of that field?

I want documents with a Status value of Active to be stored in the
myCollection_Active shard, and documents with a Status value of Terminated
in the myCollection_Terminated shard, automatically, based on the value of
the Status field in the document. I used implicit routing when creating the
collection and gave the shard names as Active,Terminated. Please help. I am
using Solr 6.6.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: indexing data to solrcloud with "implicit" is not distributing across cluster.

Posted by hemanth <k....@gmail.com>.
As I understand it, distrib=false is added to a select query to restrict
the search to a particular shard. But what I still need is a way to route
documents to a particular shard when indexing.

Thanks
Hemanth




Re: indexing data to solrcloud with "implicit" is not distributing across cluster.

Posted by Erick Erickson <er...@gmail.com>.
I suspect that when you create your collections, somehow you're not
doing it like you expect.

The red flag is:

> I tried creating a collection with compositeId routing which
> created shard1,shard2,shard3 , but when I indexed , all the documents went
> to one shard only

This simply shouldn't be happening. What is your evidence that all the docs
went to one shard? You can tell by adding &distrib=false to your query and
sending it to a particular core, something like:

solr_server/solr/collection1_shard1_replica1/query?q=*:*&distrib=false
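
As a rough sketch of that check, the Python below only builds one such URL per core so the counts can be compared by hand. The host, port and core names are assumptions for illustration; use the core names your own Admin UI shows. rows=0 is added so that only the numFound count comes back.

```python
from urllib.parse import urlencode


def core_query_url(base_url, core, query="*:*"):
    """Build a query URL that asks a single core for its own documents only."""
    # distrib=false stops the core from fanning the query out to other shards.
    params = urlencode({"q": query, "rows": 0, "distrib": "false"})
    return f"{base_url}/solr/{core}/query?{params}"


# Hypothetical core names, following the pattern in the message above.
cores = [
    "collection1_shard1_replica1",
    "collection1_shard2_replica1",
    "collection1_shard3_replica1",
]

for core in cores:
    print(core_query_url("http://localhost:8983", core))
```

Fetching each printed URL and comparing the numFound values shows how the documents are actually spread across the shards.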

Best,
Erick

On Mon, Dec 25, 2017 at 4:15 AM, hemanth <k....@gmail.com> wrote:

Re: indexing data to solrcloud with "implicit" is not distributing across cluster.

Posted by hemanth <k....@gmail.com>.
Hi Erick,
Thanks for your reply. I have no problem using either implicit or
compositeId routing, but I want to insert documents into a particular shard
so that, when I query the data, I can hit only that shard and get results
faster. For example, I am creating a collection whose status values are
Active, Inactive and Terminated. Assume my data is at present roughly evenly
distributed: 400 Active records, 300 Inactive records and 300 Terminated
records. I tried creating a collection with compositeId routing, which
created shard1, shard2 and shard3, but when I indexed, all the documents went
to one shard only. I also created a collection with the implicit routing
mechanism, with shards named Active, Inactive and Terminated and with status
as the routing key. When I indexed the documents, again they all went to a
single shard. I want to route documents based on an input value itself, not
on the hash of the field I specified, because different values may lead to
the same hash value and so be stored in the same shard. So please let me know
how to route documents to a particular shard, with either compositeId or
implicit routing, by using the value of an existing field or by extracting
the part of the field value before the "!" separator. For example, a field
value of "Active!otherfieldvalue" should go to the Active shard, and a field
value of "Inactive!othercontent" should go to the Inactive shard.

Thanks
Hemanth

-Happy Christmas




Re: indexing data to solrcloud with "implicit" is not distributing across cluster.

Posted by Erick Erickson <er...@gmail.com>.
You're misinterpreting the docs. _route_ is used to
tell _queries_ where to go, or to route a document
as one of the request parameters when you send the doc;
it is not a field in the doc.

So when you added the _route_ field to the doc, you got
the "unknown field" error because it isn't in your schema.

So you could add a _route_ field to your schema
and work that way, but then you have to also define
router.field=_route_ when you create the collection.
I'd advise instead just specifying router.field=Status
to avoid confusion.
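
A minimal sketch of both options, assuming a node at localhost:8983 and a collection called myCollection (both placeholders): the first function builds a Collections API CREATE request with the implicit router and router.field=Status, and the second builds an update URL that passes _route_ as a request parameter rather than as a document field. Nothing is sent here, and settings such as replicationFactor or maxShardsPerNode are left out and may be needed on your cluster.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, shards, router_field):
    """Build a Collections API CREATE URL for an implicitly routed collection."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",
        "shards": ",".join(shards),    # shard names are given explicitly
        "router.field": router_field,  # field whose value names the target shard
    })
    return f"{base_url}/solr/admin/collections?{params}"


def update_url(base_url, collection, route=None):
    """Build an update URL, optionally naming the target shard via _route_."""
    params = {"commit": "true"}
    if route is not None:
        params["_route_"] = route      # a request parameter, not a doc field
    return f"{base_url}/solr/{collection}/update?{urlencode(params)}"


# Hypothetical host and collection name, for illustration only.
print(create_collection_url("http://localhost:8983", "myCollection",
                            ["Active", "Terminated"], "Status"))
print(update_url("http://localhost:8983", "myCollection", route="Terminated"))
```

With router.field=Status set at creation time, each document's Status value has to match one of the shard names.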

Now, that said I really question whether this is a good
way to set up your collection. I'd just use compositeId
and when you want to restrict searches to one type
or the other add
&fq=Status:Active
or
&fq=Status:Terminated

that way you can't forget to delete the doc from one
shard or the other when the status changes. You won't
have lopsided doc counts on your shards because you
have 10,000,000 active docs and 10 terminated docs.
And whatever ratio you start with, it'll change as the
collection ages.
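
To illustrate the filter-query approach, here is a small sketch that builds a select URL with an fq on Status. The host and collection name are placeholders, and the function name is made up for this example.

```python
from urllib.parse import urlencode


def status_query_url(base_url, collection, status, query="*:*"):
    """Build a select URL that restricts results to one Status value via fq."""
    params = urlencode({"q": query, "fq": f"Status:{status}"})
    return f"{base_url}/solr/{collection}/select?{params}"


# Hypothetical host and collection name, for illustration only.
print(status_query_url("http://localhost:8983", "myCollection", "Active"))
print(status_query_url("http://localhost:8983", "myCollection", "Terminated"))
```

The query goes to every shard, but the fq narrows the results to one status, so a document whose status changes never has to move between shards.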

FWIW,
Erick
