Posted to solr-user@lucene.apache.org by Marc Sturlese <ma...@gmail.com> on 2008/11/11 12:18:26 UTC

deduplication & dataimporthandler

Hey there,
Is there any way to use DataImportHandler together with deduplication using
only XML configuration?

I have read that deduplication (http://wiki.apache.org/solr/Deduplication)
is meant to be used with the handler named /update (which uses the
solr.XmlUpdateRequestHandler class).

If there is no other way I will dig into the DataImportHandler source, but I
would like to know whether it can be done via configuration.
I am thinking of something like adding:

<updateRequestProcessorChain name="dedupe">
  <processor class="org.apache.solr.update.processor.DeduplicateUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="fields">field1,field2</str>
    <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
    <str name="signatureField">signatureField</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Inside my request handler called /dataimport (which uses the
org.apache.solr.handler.dataimport.DataImportHandler class).
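
As a rough sketch of the wiring I have in mind: this assumes /dataimport
honors an update.processor default the same way /update does, which is
exactly the part I am unsure about. The data-config.xml file name is only a
placeholder.

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- placeholder file name for the DataImportHandler configuration -->
    <str name="config">data-config.xml</str>
    <!-- assumption: selects the "dedupe" chain by name, as with /update -->
    <str name="update.processor">dedupe</str>
  </lst>
</requestHandler>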

Has anyone done something similar?

Marc Sturlese
