Posted to dev@lucene.apache.org by Niko Himanen <ni...@gmail.com> on 2019/04/18 10:30:40 UTC

CompositeId router using custom route field for updates and atomic update

Hello,

I ran into a situation with a collection created with "router.field": when
the route field in a document used the atomic update format, documents were
routed to the wrong shard by CompositeIdRouter.

After some investigation I noticed that CompositeIdRouter#sliceHash takes
the field value used for routing as is, which means that the whole atomic
update format (like set=123) is used to calculate the route hash instead of
just the value 123.
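
A minimal sketch of the effect described above, not Solr's actual code. It
assumes the atomic update value reaches the router as a Map such as
{set=123} and that its string form is what gets hashed. The class name
RouteHashSketch and the method shardFor are hypothetical, and
String.hashCode stands in for whatever hash the router really uses.

```java
import java.util.Map;

class RouteHashSketch {
    // Stand-in for the router's hash: maps the string form of the
    // route value to a shard index. Not Solr's real hash function.
    static int shardFor(Object routeValue, int numShards) {
        return Math.floorMod(String.valueOf(routeValue).hashCode(), numShards);
    }

    public static void main(String[] args) {
        Object plain = 123;                  // regular update: the bare value
        Object atomic = Map.of("set", 123);  // atomic update: operation wrapper

        // The wrapper's string form is "{set=123}", not "123",
        // so a different string is hashed for the same logical value.
        System.out.println("plain  \"" + plain + "\" -> shard " + shardFor(plain, 4));
        System.out.println("atomic \"" + atomic + "\" -> shard " + shardFor(atomic, 4));
    }
}
```

Because the hashed string differs, the same logical value can map to a
different shard depending on whether it arrives in a regular or an atomic
update.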

I worked around this by using a routing field that is never atomically
updated, but I feel this is still quite a nasty feature/bug that is
hard to detect.

Is this a known issue, or should I create a ticket for it?

Br,

Niko Himanen

Re: CompositeId router using custom route field for updates and atomic update

Posted by Niko Himanen <nh...@alpha-sense.com>.
Hey Andrzej,

Thank you for your response.

In this specific case the updated field is not the id or a unique field; it
is just some arbitrary field I am using for routing. So in this case it may
make sense to be able to route a document to a new location by updating
(atomic or not) :).

I would think the solution is either to use the real value for routing or
to throw an exception if the field is atomically updated and there is a
chance that the document would therefore end up in the wrong shard under
the current logic.
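
The two options above could be sketched roughly as follows. This is only an
illustration under assumptions: the class RouteValueSketch and the method
effectiveRouteValue are hypothetical and not part of Solr, and the atomic
update is assumed to arrive as a single-entry Map keyed by the operation
name.

```java
import java.util.Map;

class RouteValueSketch {
    // Hypothetical helper: returns the value that should be hashed for routing.
    static Object effectiveRouteValue(Object fieldValue) {
        if (!(fieldValue instanceof Map)) {
            return fieldValue;               // regular update: use the value as is
        }
        Map<?, ?> ops = (Map<?, ?>) fieldValue;
        if (ops.size() == 1 && ops.containsKey("set")) {
            return ops.get("set");           // option 1: route on the real value
        }
        // Option 2: reject operations whose resulting value is not known up front.
        throw new IllegalArgumentException(
            "Cannot route on atomic update of route field: " + ops.keySet());
    }

    public static void main(String[] args) {
        System.out.println(effectiveRouteValue(123));
        System.out.println(effectiveRouteValue(Map.of("set", 123)));
    }
}
```

Unwrapping only "set" is a deliberate simplification: for an operation such
as "inc" the final value depends on the stored document, so the sketch
fails loudly rather than guessing a shard.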

Anyway, I created a ticket for discussion:
https://issues.apache.org/jira/browse/SOLR-13411

On Thu, Apr 18, 2019 at 2:26 PM Andrzej Białecki <
andrzej.bialecki@lucidworks.com> wrote:

> Hi Niko,
>
> Please create a Jira issue, this looks like a bug. It also needs more
> discussion - I’m not convinced we should allow updates (atomic or not) to
> the id field, because (as the name suggests) this field defines the
> identity of the document, and if the identity is modified is it still the
> same document that we should be updating? ;)

-- 
*Niko Himanen*
Senior Search Engineer
M: +358 504 100 773

AlphaSense  |  www.alpha-sense.com

Re: CompositeId router using custom route field for updates and atomic update

Posted by Andrzej Białecki <an...@lucidworks.com>.
Hi Niko,

Please create a Jira issue; this looks like a bug. It also needs more discussion - I’m not convinced we should allow updates (atomic or not) to the id field, because (as the name suggests) this field defines the identity of the document, and if the identity is modified, is it still the same document that we should be updating? ;)



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org