Posted to issues@solr.apache.org by "Alex Deparvu (Jira)" <ji...@apache.org> on 2022/10/15 02:19:00 UTC

[jira] [Commented] (SOLR-16160) UpdateXmlMessages duplicate data when data is removed and then added in the same message

    [ https://issues.apache.org/jira/browse/SOLR-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618003#comment-17618003 ] 

Alex Deparvu commented on SOLR-16160:
-------------------------------------

I put together a PR for this bug: https://github.com/apache/solr/pull/1078

> UpdateXmlMessages duplicate data when data is removed and then added in the same message
> ----------------------------------------------------------------------------------------
>
>                 Key: SOLR-16160
>                 URL: https://issues.apache.org/jira/browse/SOLR-16160
>             Project: Solr
>          Issue Type: Bug
>          Components: search, update
>    Affects Versions: 8.11.1
>            Reporter: Nick Hadder
>            Priority: Blocker
>         Attachments: image-2022-04-20-10-34-08-573.png, image-2022-04-20-10-35-05-247.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Reproduction Steps*
> 1. Define two multi-valued fields with the following schema:
> {code:xml}
> <field name="docTags" type="plongs" multiValued="true" indexed="true" stored="true"/>
> <field name="tg0001" type="ipro_strings" multiValued="true" indexed="true" stored="true"/>
> <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
> <fieldType name="ipro_strings" class="solr.TextField" sortMissingLast="true" multiValued="true">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> {code}
> 2. Execute the following UpdateXmlMessage
> {code:xml}
> <add commitWithin="1000">
> <doc>
> <field name="_id">1</field>
> <field name="docTags" update="remove"><![CDATA[1]]></field>
> <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
> <field name="docTags" update="remove"><![CDATA[4]]></field>
> <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
> <field name="docTags" update="remove"><![CDATA[6]]></field>
> <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
> </doc>
> </add>
> <add commitWithin="1000">
> <doc>
> <field name="_id">1</field>
> <field name="docTags" update="remove"><![CDATA[6]]></field>
> <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
> <field name="docTags" update="add-distinct"><![CDATA[1]]></field>
> <field name="tg0001" update="add-distinct"><![CDATA[Convert to Image]]></field>
> <field name="docTags" update="add-distinct"><![CDATA[4]]></field>
> <field name="tg0001" update="add-distinct"><![CDATA[Large Files]]></field>
> </doc>
> </add>
> <add commitWithin="1000">
> <doc>
> <field name="_id">1</field>
> <field name="docTags" update="remove"><![CDATA[1]]></field>
> <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
> <field name="docTags" update="remove"><![CDATA[4]]></field>
> <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
> <field name="docTags" update="add-distinct"><![CDATA[6]]></field>
> <field name="tg0001" update="add-distinct"><![CDATA[To Bulk-Print]]></field>
> </doc>
> </add>
> {code}
> 3. Observe that the document now contains duplicate values in those fields:
> !image-2022-04-20-10-35-05-247.png!
> *Note:* If the update="add-distinct" fields come first in the XML message and the update="remove" fields come last, it works as expected and adds only one instance of each value. The issue occurs only when the remove fields come before the add-distinct fields.
>  
> Do the updates need to be in some undocumented order, or is this a genuine defect?
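
For reference, here is a minimal Python sketch of the outcome the report appears to expect for docTags when the three messages from step 2 are applied in document order. It is a toy model only, not Solr's merge code. The function name apply_atomic_ops and the starting value [1, 4, 6] are assumptions made for illustration, since the report does not state the document's initial contents. It also assumes that "remove" drops every occurrence of a value and "add-distinct" appends only when the value is absent.

```python
def apply_atomic_ops(values, ops):
    """Apply (op, value) pairs in order to a copy of a multi-valued field.

    Hypothetical helper for illustration; it does not call or mirror Solr code.
    """
    result = list(values)
    for op, value in ops:
        if op == "remove":
            # Assumed semantics: drop every occurrence of the value.
            result = [v for v in result if v != value]
        elif op == "add-distinct":
            # Assumed semantics: append only if the value is not present.
            if value not in result:
                result.append(value)
        else:
            raise ValueError(f"unsupported op: {op}")
    return result


# docTags operations from the three <add> messages in step 2, in document order.
messages = [
    [("remove", 1), ("remove", 4), ("remove", 6)],
    [("remove", 6), ("add-distinct", 1), ("add-distinct", 4)],
    [("remove", 1), ("remove", 4), ("add-distinct", 6)],
]

doc_tags = [1, 4, 6]  # assumed starting state; the report does not give one
for ops in messages:
    doc_tags = apply_atomic_ops(doc_tags, ops)

print(doc_tags)  # prints [6]
```

Under these assumptions the field ends with a single value, 6, and no duplicates, which is the result the reporter describes getting only when the add-distinct fields precede the remove fields.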



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
