Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2009/07/07 20:42:43 UTC

[Solr Wiki] Update of "AnalyzersTokenizersTokenFilters" by ShalinMangar

The following page has been changed by ShalinMangar:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

The comment on the change is:
Added note on preserveOriginal and splitOnNumerics

------------------------------------------------------------------------------
   * '''splitOnCaseChange="1"''' causes lowercase => uppercase transitions to generate a new part [Solr 1.3]:
     * `"PowerShot" => "Power" "Shot"`
     * `"TransAM" => "Trans" "AM"`
+  * '''splitOnNumerics="1"''' causes letter => number and number => letter transitions to generate a new part [Solr 1.3]:
+    * `"j2se" => "j" "2" "se"`
  Note that this is the default behaviour in all released versions of Solr.
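
The two split rules above can be illustrated with a rough Python sketch. This is only an approximation for illustration, not Solr's actual implementation: the function name `split_subwords` and its keyword arguments are hypothetical, and the sketch handles ASCII letters and digits only.

```python
import re


def split_subwords(token, split_on_case_change=True, split_on_numerics=True):
    """Approximate the splitOnCaseChange / splitOnNumerics rules (hypothetical helper)."""
    # Treat any run of non-alphanumeric characters as a subword boundary.
    marked = re.sub(r"[^A-Za-z0-9]+", " ", token)
    if split_on_case_change:
        # Insert a boundary at each lowercase => uppercase transition.
        marked = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", marked)
    if split_on_numerics:
        # Insert a boundary at each letter => digit or digit => letter transition.
        marked = re.sub(
            r"(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])", " ", marked
        )
    return marked.split()


if __name__ == "__main__":
    for word in ("PowerShot", "TransAM", "j2se"):
        print(word, "=>", split_subwords(word))
```

Passing `split_on_numerics=False` leaves `"j2se"` as a single part in this sketch, which mirrors what the wiki text says about turning the option off.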
  
  There are also a number of parameters that affect which tokens are present in the final output and whether subwords are combined:
@@ -372, +374 @@

     * `"500-42" => "50042"`
   * '''catenateAll="1"''' causes all subword parts to be catenated:
     * `"wi-fi-4000" => "wifi4000"`
+  * '''preserveOriginal="1"''' causes the original token to be indexed unmodified, in addition to the tokens produced by the other options
  
  These parameters may be combined in any way.
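
As a hedged illustration of combining options, the sketch below shows how catenateAll and preserveOriginal might interact on a single token. It is a simplification under stated assumptions: `emit_tokens` is a hypothetical helper, only these two options are modelled, and the order and positions of the tokens Solr really emits may differ.

```python
import re


def emit_tokens(token, catenate_all=False, preserve_original=False):
    """Model only catenateAll and preserveOriginal (hypothetical helper, not Solr code)."""
    # Split on runs of non-alphanumeric delimiters and drop empty pieces.
    parts = [p for p in re.split(r"[^A-Za-z0-9]+", token) if p]
    out = []
    if preserve_original:
        # Keep the original token untouched alongside any generated tokens.
        out.append(token)
    if catenate_all:
        # Join every subword part into one token.
        out.append("".join(parts))
    return out


if __name__ == "__main__":
    print(emit_tokens("wi-fi-4000", catenate_all=True))
    print(emit_tokens("wi-fi-4000", catenate_all=True, preserve_original=True))
```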
   * Example of generateWordParts="1" and  catenateWords="1":
@@ -391, +394 @@

                  catenateWords="0"
                  catenateNumbers="0"
                  catenateAll="0"
+                 preserveOriginal="1"
                  />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory"/>
@@ -404, +408 @@

                  catenateWords="1"
                  catenateNumbers="1"
                  catenateAll="0"
+                 preserveOriginal="1"
                  />
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory"/>