Posted to solr-user@lucene.apache.org by paule_lecuyer <le...@bertin.fr> on 2013/11/25 16:02:17 UTC

Solr 4.x : how to implement an update processor chain working for partial updates

In my Solr schema I have the following fields defined:

      <field name="content" type="text_general" indexed="false" stored="true" multiValued="true" />
      <field name="all" type="text_general" indexed="true" stored="false" multiValued="true" termVectors="true" />
      <field name="eng" type="text_en" indexed="true" stored="false" multiValued="true" termVectors="true" />
      <field name="ita" type="text_it" indexed="true" stored="false" multiValued="true" termVectors="true" />
      <field name="fre" type="text_fr" indexed="true" stored="false" multiValued="true" termVectors="true" />
      ...
      <copyField source="content" dest="all"/>

To fill in the language-specific fields, I use a custom update processor
chain with a custom ConditionalCopyProcessor that copies the "content"
field into the appropriate language field, depending on the document's
language (as explained in http://wiki.apache.org/solr/UpdateRequestProcessor).

The problem is that this custom chain is applied to the document passed in
the update request. It works fine when inserting a new document or replacing
a whole document, where all fields are provided, but it does not work when
the passed document holds only the updated fields (as the language-specific
fields are not stored).
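
For illustration, a partial (atomic) update request in Solr's XML update
format might look like the minimal sketch below. The "id" and "title" fields
and their values are assumptions made up for the example, not part of the
schema above. Only the changed field is sent, so a processor running on the
request document never sees "content":

      <add>
        <doc>
          <field name="id">doc1</field>
          <!-- update="set" replaces the stored value of this one field -->
          <field name="title" update="set">New title</field>
        </doc>
      </add>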

I would rather not set the language-specific fields to stored="true", as the
"content" field may hold large values.

Is there a way to have Solr execute my ConditionalCopyProcessor on the
actual updated doc (the one resulting from Solr retrieving all stored values
and merging them with the update request values), and not on the request doc?

Thanks a lot for your help.

Paule



--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-x-how-to-implement-an-update-processor-chain-working-for-partial-updates-tp4103071.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr 4.x : how to implement an update processor chain working for partial updates

Posted by paule_lecuyer <le...@bertin.fr>.
Hi Alexandre,

Reading the comments on SOLR-5395, I don't fully understand how the whole
update processing is performed (what exactly happens to which data). If you
have some time to explain, that would be very nice.
Thanks
Paule




Re: Solr 4.x : how to implement an update processor chain working for partial updates

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
SOLR-5395 just out with 4.6 might have some relevance here (RunAlways
marker interface for UpdateRequestProcessorFactory). Not sure how it
affects partial updates though.

Regards,
   Alex.

Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)



Re: Solr 4.x : how to implement an update processor chain working for partial updates

Posted by paule_lecuyer <le...@bertin.fr>.
It works just as I wanted, thanks !!
Paule




Re: Solr 4.x : how to implement an update processor chain working for partial updates

Posted by Chris Hostetter <ho...@fucit.org>.
: 
: Is there a way to have solr execute my ConditionalCopyProcessor on the
: actual updated doc (the one resulting from solr retrieving all stored values
: and merging with update request values), and not on the request doc ?

Partial updates, and loading the existing stored fields of a document
that is being partially updated, happen in the DistributedUpdateProcessor
as part of the leader logic (so that we can be confident we have the
correct field values and _version_ info even if there are competing
updates to the same document).

If you configure your update processor to run *after* the
DistributedUpdateProcessor, then the document will be fully populated.
Unfortunately, the downside is that your processor will be run
redundantly on each replica, which can be annoying if it's a
resource-intensive update processor or requires hitting an external
resource.
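
As a minimal sketch, such a chain could be declared in solrconfig.xml
roughly as follows. The chain name "langcopy" and the factory class
com.example.ConditionalCopyProcessorFactory are placeholders, since the
actual names were not given in this thread:

      <updateRequestProcessorChain name="langcopy">
        <processor class="solr.LogUpdateProcessorFactory" />
        <!-- stored fields are merged into partial updates here -->
        <processor class="solr.DistributedUpdateProcessorFactory" />
        <!-- placeholder custom factory: sees the fully populated document -->
        <processor class="com.example.ConditionalCopyProcessorFactory" />
        <processor class="solr.RunUpdateProcessorFactory" />
      </updateRequestProcessorChain>

Processors listed before DistributedUpdateProcessorFactory only see the
fields sent in the request.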

NOTE: even if you aren't using SolrCloud, you still get an implicit 
instance of DistributedUpdateProcessor precisely so that partial updates 
will work...

https://wiki.apache.org/solr/UpdateRequestProcessor#Distributed_Updates



-Hoss