Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2013/02/20 23:00:43 UTC

[Solr Wiki] Update of "TextProfileSignature" by EustacheFelenc

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "TextProfileSignature" page has been changed by EustacheFelenc:
http://wiki.apache.org/solr/TextProfileSignature?action=diff&rev1=4&rev2=5

  
  TextProfileSignature operates on raw text, without the filtering provided by Analyzers. It therefore does not strip HTML, normalize diacritics, apply stemming or other semantic normalization, or weight tokens by their relative importance. It also treats the text as a bag of words, ignoring word order.
  
+ == Configuration ==
+ 
+ === solrconfig.xml ===
+ 
+ Example settings:
+ {{{
+   <!-- An example dedup update processor that creates the "id" field on the fly
+        based on the hash code of some other fields.  This example has overwriteDupes
+        set to false since we are using the id field as the signatureField and Solr
+        will maintain uniqueness based on that anyway. -->
+   <updateRequestProcessorChain name="dedupe">
+     <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
+       <bool name="enabled">true</bool>
+       <bool name="overwriteDupes">false</bool>
+       <str name="signatureField">id</str>
+       <str name="fields">name,features,cat</str>
+       <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
+       <str name="quantRate">.2</str>
+     </processor>
+     <processor class="solr.LogUpdateProcessorFactory" />
+     <processor class="solr.RunUpdateProcessorFactory" />
+   </updateRequestProcessorChain>
+ }}}
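+ 
+ The chain above only takes effect once an update handler references it. A minimal sketch, assuming the standard `/update` handler and the `update.chain` parameter (neither appears in the example above, and older Solr releases used a different handler class and parameter name, so check your version):
+ {{{
+   <!-- Assumed wiring, not part of the example above: send /update
+        requests through the "dedupe" chain by default. -->
+   <requestHandler name="/update" class="solr.UpdateRequestHandler">
+     <lst name="defaults">
+       <str name="update.chain">dedupe</str>
+     </lst>
+   </requestHandler>
+ }}}
+ With this wiring, documents posted to `/update` would have their "id" generated from the name, features and cat fields.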
+