Posted to dev@lucene.apache.org by "Amrit Sarkar (JIRA)" <ji...@apache.org> on 2017/03/01 02:53:45 UTC

[jira] [Updated] (SOLR-9530) Add an Atomic Update Processor

     [ https://issues.apache.org/jira/browse/SOLR-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amrit Sarkar updated SOLR-9530:
-------------------------------
    Attachment: SOLR-9530.patch

Taking Noble's and Ishan's suggestions into account, I put together a new patch with the following changes:

1. The URP no longer requires any solrconfig parameters.

2. The URP takes inline request parameters, exactly as Noble mentioned:
{code}processor=Atomic&Atomic.my_newfield=add&Atomic.subject=set&Atomic.count_i=inc{code}

3. Both atomic and conventional updates are accepted as incoming documents to the URP.
   a. For atomic updates, the atomic operation in the incoming doc must match the operation specified in the processor call,
   e.g. {"id":"1","title":{"set":"A"}}  |  processor=Atomic&Atomic.title=set

4. After the conversion to atomic-style, the latest _version_ is added to the updated doc. If no _version_ is present, the doc is sent as it is.

5. If the update hits a version conflict, the URP retries by fetching the latest _version_ from the index and updating the SolrInputDoc. The maximum number of retries is hardcoded to 5.

6. If the parameters are not sufficient to convert the incoming document to atomic-style, the update is aborted,
e.g. {"id":"1","title":"A"} | processor=Atomic&Atomic.subject=set
There is no point in sending this document for update via the URP.
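
To make points 2, 3 and 6 concrete, here is a rough sketch of the conversion rule using plain Java maps instead of SolrInputDocument. It is not the code from the patch: the class name AtomicConversionSketch, the method toAtomic, and the choice to reject a document when any non-id field lacks a matching Atomic.<field> parameter (or when an incoming atomic operation differs from the parameter) are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; the real URP works on SolrInputDocument.
public class AtomicConversionSketch {
    static final String PREFIX = "Atomic.";

    /**
     * Returns an atomic-style copy of doc, or Optional.empty() when the
     * request parameters are not sufficient to convert it (assumed rule).
     */
    static Optional<Map<String, Object>> toAtomic(Map<String, Object> doc,
                                                  Map<String, String> params) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            String field = e.getKey();
            Object value = e.getValue();
            // id and _version_ are passed through untouched.
            if (field.equals("id") || field.equals("_version_")) {
                out.put(field, value);
                continue;
            }
            String op = params.get(PREFIX + field);
            if (op == null) {
                return Optional.empty(); // point 6: no operation given for this field
            }
            if (value instanceof Map) {
                // Already atomic-style: the operation must match the parameter (point 3a).
                Map<?, ?> m = (Map<?, ?>) value;
                if (m.size() != 1 || !m.containsKey(op)) {
                    return Optional.empty();
                }
                out.put(field, value);
            } else {
                // Conventional value: wrap it as {op: value}.
                Map<String, Object> wrapped = new LinkedHashMap<>();
                wrapped.put(op, value);
                out.put(field, wrapped);
            }
        }
        return Optional.of(out);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "A");

        Map<String, String> matching = new HashMap<>();
        matching.put("Atomic.title", "set");
        System.out.println(toAtomic(doc, matching));   // converted document

        Map<String, String> insufficient = new HashMap<>();
        insufficient.put("Atomic.subject", "set");
        System.out.println(toAtomic(doc, insufficient)); // empty: update would be aborted
    }
}
```

The second call mirrors the example in point 6: the only parameter refers to subject, so nothing can be done with title and the sketch gives up on the document.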

{noformat}
new file:   solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateProcessorFactory.java
new file:   solr/core/src/test/org/apache/solr/update/processor/AtomicUpdateProcessorFactoryTest.java
{noformat}

I tried to write a test case for multiple threads executing the URP simultaneously, but was not able to reproduce the scenario exactly. The test method is commented out in the patch.
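
For discussion, the sketch below shows one way the retry-on-conflict behaviour from point 5 could be exercised with several threads outside of Solr. Everything in it is hypothetical: VersionedCounter is an in-memory stand-in for the index with optimistic versioning, and "maximum retries 5" is read as one initial attempt plus up to five retries. It does not reproduce the real URP or the commented-out test in the patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; no Solr classes are involved.
public class VersionRetrySketch {
    static final int MAX_RETRIES = 5; // mirrors the hardcoded limit in the patch

    /** In-memory stand-in for a single indexed document with a _version_. */
    static final class VersionedCounter {
        private long version = 1;
        private long count = 0;

        synchronized long currentVersion() { return version; }
        synchronized long count() { return count; }

        /** Applies the increment only if expectedVersion is current; false means conflict. */
        synchronized boolean incIfVersion(long expectedVersion, long delta) {
            if (expectedVersion != version) {
                return false;
            }
            count += delta;
            version++;
            return true;
        }
    }

    /** One initial attempt plus up to MAX_RETRIES retries, refetching the version each time. */
    static boolean incWithRetry(VersionedCounter store, long delta) {
        for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
            long latest = store.currentVersion();
            if (store.incIfVersion(latest, delta)) {
                return true;
            }
        }
        return false; // gave up after exhausting the retries
    }

    /** Releases all threads at once and returns how many updates succeeded. */
    static int runConcurrently(VersionedCounter store, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger succeeded = new AtomicInteger();
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (incWithRetry(store, 1)) {
                    succeeded.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return succeeded.get();
    }

    public static void main(String[] args) {
        VersionedCounter store = new VersionedCounter();
        int ok = runConcurrently(store, 8);
        // Every successful update must be reflected exactly once in the counter.
        System.out.println("succeeded=" + ok + " count=" + store.count());
    }
}
```

The invariant to assert is that the stored count equals the number of successful updates, whatever the interleaving. How many updates exhaust their retries depends on timing, which is probably the same difficulty that makes the real scenario hard to reproduce deterministically.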

> Add an Atomic Update Processor 
> -------------------------------
>
>                 Key: SOLR-9530
>                 URL: https://issues.apache.org/jira/browse/SOLR-9530
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Varun Thacker
>         Attachments: SOLR-9530.patch, SOLR-9530.patch, SOLR-9530.patch, SOLR-9530.patch
>
>
> I'd like to explore the idea of adding a new update processor to help ingest partial updates.
> Example use-case - There are two datasets with a common id field. How can I merge both of them at index time?
> Proposed Solution: 
> {code}
> <updateRequestProcessorChain name="atomic">
>   <processor class="solr.processor.AtomicUpdateProcessorFactory">
>     <str name="my_new_field">add</str>
>   </processor>
>   <processor class="solr.LogUpdateProcessorFactory" />
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>
> {code}
> So the first JSON dump could be ingested against {{http://localhost:8983/solr/gettingstarted/update/json}}
> And then the second JSON could be ingested against
> {{http://localhost:8983/solr/gettingstarted/update/json?processor=atomic}}
> The Atomic Update Processor could support all the atomic update operations currently supported.


