Posted to solr-dev@lucene.apache.org by "Ryan McKinley (JIRA)" <ji...@apache.org> on 2007/04/24 00:02:15 UTC
[jira] Updated: (SOLR-211) regex split() Tokenizer
[ https://issues.apache.org/jira/browse/SOLR-211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan McKinley updated SOLR-211:
-------------------------------
Attachment: SOLR-211-RegexSplitTokenizer.patch
A simple regex split tokenizer and a test. Example field type configuration:
<fieldType name="splitText" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.RegexSplitTokenizerFactory" regex="--"/>
    <filter class="solr.TrimFilterFactory" />
  </analyzer>
</fieldType>
Given a field value of:
"Architecture--United States--19th century"
the analyzer will create the tokens:
"Architecture"
"United States"
"19th century"
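
For illustration, the split-then-trim behavior described above can be approximated with plain JDK calls. This is only a sketch, not the code in the attached patch: the class name RegexSplitSketch and the method splitAndTrim are hypothetical, and it uses java.util.regex.Pattern.split() followed by String.trim() to stand in for the tokenizer and the trim filter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the analyzer chain above; not the patch's code.
public class RegexSplitSketch {

    // Split the value on the regex, then trim each piece,
    // roughly what a split tokenizer followed by a trim filter would yield.
    static List<String> splitAndTrim(String value, String regex) {
        List<String> tokens = new ArrayList<>();
        for (String part : Pattern.compile(regex).split(value)) {
            tokens.add(part.trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [Architecture, United States, 19th century]
        System.out.println(
            splitAndTrim("Architecture--United States--19th century", "--"));
    }
}
```

Note that Pattern.split() drops trailing empty strings, so a value ending in "--" would not produce a final empty token in this sketch; whether the patch behaves the same way is not stated here.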
> regex split() Tokenizer
> -----------------------
>
> Key: SOLR-211
> URL: https://issues.apache.org/jira/browse/SOLR-211
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: Ryan McKinley
> Attachments: SOLR-211-RegexSplitTokenizer.patch
>
>
> A TokenizerFactory that makes tokens from:
> string.split( regex );
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.