Posted to solr-user@lucene.apache.org by Michael Sokolov <so...@ifactory.com> on 2010/10/11 22:57:46 UTC
configuring custom CharStream in solr
I would like to inject my CharStream (or possibly it could be a CharFilter;
this is all in flux at the moment) into the analysis chain for a field. Can
I do this in solr using the Analyzer configuration syntax in schema.xml, or
would I need to define my own Analyzer? The solr wiki describes adding
Tokenizers, but doesn't say anything about CharReaders/Filters.
Thanks for any pointers
-Mike
Re: configuring custom CharStream in solr
Posted by Michael Sokolov <so...@ifactory.com>.
On 10/11/2010 10:18 PM, Chris Hostetter wrote:
> : OK - I found the answer pecking through the source - apparently the name of
> : the element to configure a CharFilter is <charFilter> - fancy that :)
>
> there's even an example, right there on the wiki...
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#CharFilterFactories
>
>
> -Hoss
I am just bathing myself in wizardly astuteness today
thanks
-Mike
Re: configuring custom CharStream in solr
Posted by Chris Hostetter <ho...@fucit.org>.
: OK - I found the answer pecking through the source - apparently the name of
: the element to configure a CharFilter is <charFilter> - fancy that :)
there's even an example, right there on the wiki...
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#CharFilterFactories
-Hoss
Re: configuring custom CharStream in solr
Posted by Michael Sokolov <so...@ifactory.com>.
On 10/11/2010 8:38 PM, Michael Sokolov wrote:
> On 10/11/2010 6:41 PM, Koji Sekiguchi wrote:
>> (10/10/12 5:57), Michael Sokolov wrote:
>>> I would like to inject my CharStream (or possibly it could be a
>>> CharFilter;
>>> this is all in flux at the moment) into the analysis chain for a
>>> field. Can
>>> I do this in solr using the Analyzer configuration syntax in
>>> schema.xml, or
>>> would I need to define my own Analyzer? The solr wiki describes adding
>>> Tokenizers, but doesn't say anything about CharReaders/Filters.
>>>
>>> Thanks for any pointers
>>>
>>> -Mike
>>>
>> Hi Mike,
>>
>> You can write your own CharFilterFactory that creates your own
>> CharStream. Please refer to existing CharFilterFactories in Solr
>> to see how you can implement it.
>>
>> Koji
>>
> Koji - thanks for your response. I think I can see my way clear to
> making a factory class for my stream. My question was really about
> how to configure the factory. I see a number of examples of
> tokenizers and analyzers configured in the example schema.xml, but no
> readers. For example:
>
> <fieldType name="text_ws" class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
>
> configures a specific tokenizer. If I want to configure my
> CharStream, is there an element for that? Eg:
>
> <fieldType name="text_ws" class="solr.TextField">
>   <analyzer>
>     <reader class="com.mycompany.solr.FancyCharReader" />
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   </analyzer>
> </fieldType>
>
> I am guessing that I need to create my own analyzer and hard-code the
> reader/tokenizer filter chain in there, but it would be nice if there
> were a syntax like the one I inferred above.
>
> -Mike
OK - I found the answer pecking through the source - apparently the name
of the element to configure a CharFilter is <charFilter> - fancy that :)
-Mike
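For reference, a minimal sketch of how a <charFilter> element might sit in the analyzer chain for a custom factory. The class name com.mycompany.solr.FancyCharFilterFactory is hypothetical (adapted from the FancyCharReader guess in the quoted message); it is assumed to be a CharFilterFactory implementation available on Solr's classpath. A built-in factory such as solr.HTMLStripCharFilterFactory could be substituted to try the syntax out first.

```xml
<fieldType name="text_ws" class="solr.TextField">
  <analyzer>
    <!-- Hypothetical custom factory; replace with your own CharFilterFactory class. -->
    <!-- Declared before the tokenizer because it rewrites the character stream -->
    <!-- that the tokenizer will read. -->
    <charFilter class="com.mycompany.solr.FancyCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Note that the element names the factory class, not the CharStream or CharFilter it creates, in the same way that <tokenizer> names a tokenizer factory.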
Re: configuring custom CharStream in solr
Posted by Michael Sokolov <so...@ifactory.com>.
On 10/11/2010 6:41 PM, Koji Sekiguchi wrote:
> (10/10/12 5:57), Michael Sokolov wrote:
>> I would like to inject my CharStream (or possibly it could be a
>> CharFilter;
>> this is all in flux at the moment) into the analysis chain for a
>> field. Can
>> I do this in solr using the Analyzer configuration syntax in
>> schema.xml, or
>> would I need to define my own Analyzer? The solr wiki describes adding
>> Tokenizers, but doesn't say anything about CharReaders/Filters.
>>
>> Thanks for any pointers
>>
>> -Mike
>>
> Hi Mike,
>
> You can write your own CharFilterFactory that creates your own
> CharStream. Please refer to existing CharFilterFactories in Solr
> to see how you can implement it.
>
> Koji
>
Koji - thanks for your response. I think I can see my way clear to
making a factory class for my stream. My question was really about how
to configure the factory. I see a number of examples of tokenizers and
analyzers configured in the example schema.xml, but no readers. For
example:
<fieldType name="text_ws" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
configures a specific tokenizer. If I want to configure my CharStream,
is there an element for that? Eg:
<fieldType name="text_ws" class="solr.TextField">
  <analyzer>
    <reader class="com.mycompany.solr.FancyCharReader" />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
I am guessing that I need to create my own analyzer and hard-code the
reader/tokenizer filter chain in there, but it would be nice if there
were a syntax like the one I inferred above.
-Mike
Re: configuring custom CharStream in solr
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(10/10/12 5:57), Michael Sokolov wrote:
> I would like to inject my CharStream (or possibly it could be a CharFilter;
> this is all in flux at the moment) into the analysis chain for a field. Can
> I do this in solr using the Analyzer configuration syntax in schema.xml, or
> would I need to define my own Analyzer? The solr wiki describes adding
> Tokenizers, but doesn't say anything about CharReaders/Filters.
>
> Thanks for any pointers
>
> -Mike
>
Hi Mike,
You can write your own CharFilterFactory that creates your own
CharStream. Please refer to existing CharFilterFactories in Solr
to see how you can implement it.
Koji
--
http://www.rondhuit.com/en/