Posted to solr-user@lucene.apache.org by Ali Nazemian <al...@gmail.com> on 2014/10/06 09:40:55 UTC
duplicate unique key after partial update in solr 4.10
Dear all,
Hi,
I want to do a partial update on a field that has no value yet. Suppose
I have a document with document id (unique key) '12345' and a field
"read_flag" that was not indexed in the first place, so the read_flag field
has no value for this document. After I did a partial update on this
document to set "read_flag"="true", I ran into a strange problem. The next
time I indexed the same document with the same values, I saw two different
versions of the document with id '12345' in Solr: one with read_flag=true and
another without the read_flag field! I don't want duplicate
documents (there should not be any, because id is the unique key). Could you
please tell me what causes this problem?
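The request itself is not shown in this message. As a sketch only, a partial (atomic) update in Solr's XML update format that sets read_flag on document 12345 would look roughly like this, assuming the unique key field is named id:

```xml
<add>
  <doc>
    <!-- the unique key identifies the document to modify (field name "id" is an assumption) -->
    <field name="id">12345</field>
    <!-- update="set" replaces (or creates) the value of this one field -->
    <field name="read_flag" update="set">true</field>
  </doc>
</add>
```

Solr rebuilds the whole document from its stored fields when it applies such an update, so the other fields generally need to be stored for this to work.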
Best regards.
--
A.Nazemian
Re: duplicate unique key after partial update in solr 4.10
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
It seems to be by design:
https://issues.apache.org/jira/browse/SOLR-5211
You can't update a parent doc within a block.
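One way to sidestep this, sketched here under the assumption that the client still has the whole block available, is to re-send the parent together with all of its children instead of issuing a partial update on the parent alone. The documents are the product01 block from this thread, with read_flag added to the parent:

```xml
<add>
  <doc>
    <field name="id">product01</field>
    <field name="name">car</field>
    <field name="content_type">product</field>
    <!-- the new value is sent as an ordinary field, not as update="set" -->
    <field name="read_flag">true</field>
    <!-- all children are sent again so the block is replaced as a unit -->
    <doc>
      <field name="id">part01</field>
      <field name="name">wheels</field>
      <field name="content_type">part</field>
    </doc>
    <doc>
      <field name="id">part02</field>
      <field name="name">engine</field>
      <field name="doctype">part</field>
    </doc>
    <doc>
      <field name="id">part03</field>
      <field name="name">brakes</field>
      <field name="content_type">part</field>
    </doc>
  </doc>
</add>
```

The cost is that every change to the parent re-indexes the children too, which may or may not be acceptable depending on block size.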
On Tue, Oct 7, 2014 at 9:44 AM, Ali Nazemian <al...@gmail.com> wrote:
> But as you can see, there are two different versions of the document with
> the same ID (product01).
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>
Re: duplicate unique key after partial update in solr 4.10
Posted by Ali Nazemian <al...@gmail.com>.
The list of docs before the partial update:
<doc>
  <field name="id">product01</field>
  <field name="name">car</field>
  <field name="content_type">product</field>
  <doc>
    <field name="id">part01</field>
    <field name="name">wheels</field>
    <field name="content_type">part</field>
  </doc>
  <doc>
    <field name="id">part02</field>
    <field name="name">engine</field>
    <field name="doctype">part</field>
  </doc>
  <doc>
    <field name="id">part03</field>
    <field name="name">brakes</field>
    <field name="content_type">part</field>
  </doc>
</doc>
<doc>
  <field name="id">product02</field>
  <field name="name">truck</field>
  <field name="content_type">product</field>
  <doc>
    <field name="id">part04</field>
    <field name="name">wheels</field>
    <field name="content_type">part</field>
  </doc>
  <doc>
    <field name="id">part05</field>
    <field name="name">flaps</field>
    <field name="doctype">part</field>
  </doc>
</doc>
The list of docs after a partial update of the read_flag field for document
"product01":
<doc>
  <field name="id">product01</field>
  <field name="name">car</field>
  <field name="content_type">product</field>
  <field name="read_flag">true</field>
  <doc>
    <field name="id">part01</field>
    <field name="name">wheels</field>
    <field name="content_type">part</field>
  </doc>
  <doc>
    <field name="id">part02</field>
    <field name="name">engine</field>
    <field name="doctype">part</field>
  </doc>
  <doc>
    <field name="id">part03</field>
    <field name="name">brakes</field>
    <field name="content_type">part</field>
  </doc>
</doc>
<doc>
  <field name="id">product02</field>
  <field name="name">truck</field>
  <field name="content_type">product</field>
  <doc>
    <field name="id">part04</field>
    <field name="name">wheels</field>
    <field name="content_type">part</field>
  </doc>
  <doc>
    <field name="id">part05</field>
    <field name="name">flaps</field>
    <field name="doctype">part</field>
  </doc>
</doc>
The list of documents after sending the same documents again (this should
overwrite the existing ones because the IDs are the same):
<doc>
  <field name="id">product01</field>
  <field name="name">car</field>
  <field name="content_type">product</field>
  <field name="read_flag">true</field>
</doc>
<doc>
  <field name="id">product01</field>
  <field name="name">car</field>
  <field name="content_type">product</field>
  <doc>
    <field name="id">part01</field>
    <field name="name">wheels</field>
    <field name="content_type">part</field>
  </doc>
  <doc>
    <field name="id">part02</field>
    <field name="name">engine</field>
    <field name="doctype">part</field>
  </doc>
  <doc>
    <field name="id">part03</field>
    <field name="name">brakes</field>
    <field name="content_type">part</field>
  </doc>
</doc>
<doc>
  <field name="id">product02</field>
  <field name="name">truck</field>
  <field name="content_type">product</field>
  <doc>
    <field name="id">part04</field>
    <field name="name">wheels</field>
    <field name="content_type">part</field>
  </doc>
  <doc>
    <field name="id">part05</field>
    <field name="name">flaps</field>
    <field name="doctype">part</field>
  </doc>
</doc>
But as you can see, there are two different versions of the document with the
same ID (product01).
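One possible cleanup, sketched under the assumption that the schema defines the _root_ field (which block indexing uses to tie children to their parent, and which is not confirmed anywhere in this thread), is to delete everything belonging to product01 and then re-send the complete block:

```xml
<!-- removes both copies of the parent (id:product01) and any documents
     indexed as part of its block (_root_:product01); the _root_ field
     is an assumption about the schema -->
<delete>
  <query>id:product01 OR _root_:product01</query>
</delete>
```

A commit is still needed afterwards before the deletion becomes visible to searches.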
Regards.
On Mon, Oct 6, 2014 at 8:18 PM, Alexandre Rafalovitch <ar...@gmail.com>
wrote:
> Can you upload the update documents then (to a Gist or similar)?
> Just so that people don't have to re-imagine the exact steps. Because, if
> it fully checks out, it might be a bug, and the next step would be
> creating a JIRA ticket.
--
A.Nazemian
Re: duplicate unique key after partial update in solr 4.10
Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Can you upload the update documents then (to a Gist or similar)?
Just so that people don't have to re-imagine the exact steps. Because, if
it fully checks out, it might be a bug, and the next step would be
creating a JIRA ticket.
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
On 6 October 2014 11:23, Ali Nazemian <al...@gmail.com> wrote:
> Dear Alex,
> Hi,
> LOL, yeah I am sure. You can test it yourself. I did that on the default
> schema too. The results are the same!
> Regards.
Re: duplicate unique key after partial update in solr 4.10
Posted by Ali Nazemian <al...@gmail.com>.
Dear Alex,
Hi,
LOL, yeah I am sure. You can test it yourself. I did that on the default
schema too. The results are the same!
Regards.
On Mon, Oct 6, 2014 at 4:20 PM, Alexandre Rafalovitch <ar...@gmail.com>
wrote:
> A stupid question: are you sure that the field the schema treats as your
> uniqueKey is the one you are actually using as the id in your setup? Also,
> are you sure you are not somehow using flags that tell Solr to ignore
> duplicates?
--
A.Nazemian
Re: duplicate unique key after partial update in solr 4.10
Posted by Alexandre Rafalovitch <ar...@gmail.com>.
A stupid question: are you sure that the field the schema treats as your
uniqueKey is the one you are actually using as the id in your setup? Also,
are you sure you are not somehow using flags that tell Solr to ignore
duplicates?
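For reference, a sketch of the two things being asked about; the field name id and the document id 12345 are taken from this thread, and the rest is illustrative rather than the poster's actual configuration:

```xml
<!-- schema.xml fragment: the field Solr uses to detect and replace duplicates -->
<uniqueKey>id</uniqueKey>

<!-- separate update request: overwrite="false" skips the uniqueKey check,
     so a second document with the same id is added instead of replacing
     the first (the default is overwrite="true") -->
<add overwrite="false">
  <doc>
    <field name="id">12345</field>
  </doc>
</add>
```

If the uniqueKey points at a different field, or updates are sent with overwrite="false", duplicate ids in the index are expected behaviour rather than a bug.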
Regards,
Alex.