Posted to solr-user@lucene.apache.org by Ashish P <as...@gmail.com> on 2009/04/28 07:01:45 UTC
half width katakana
I want to convert half-width katakana to full-width katakana. I tried using
the CJK analyzer, but it is not working.
Does cjkAnalyzer do this, or is there another way?
Re: half width katakana
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
Chris Hostetter wrote:
> : The exception is expected if you use CharStream aware Tokenizer without
> : CharFilters.
>
> Koji: i thought all of the casts had been eliminated and replaced with
> a call to CharReader.get(Reader) ?
>
>
Yes, right. After r758137, the ClassCastException should no longer occur:
http://svn.apache.org/viewvc?view=rev&revision=758137
The CharReader.get(Reader) idiom was then added, as Hoss suggested:
http://svn.apache.org/viewvc?view=rev&revision=758161
Ashish, which revision or nightly build were you using when you got the
ClassCastException?
Koji
Re: half width katakana
Posted by Chris Hostetter <ho...@fucit.org>.
: The exception is expected if you use CharStream aware Tokenizer without
: CharFilters.
Koji: i thought all of the casts had been eliminated and replaced with
a call to CharReader.get(Reader) ?
: Please see example/solr/conf/schema.xml for the setting of CharFilter and
: CharStreamAware*Tokenizer:
: > Using CharStreamAwareCJKTokenizerFactory is giving me following error,
: > SEVERE: java.lang.ClassCastException: java.io.StringReader cannot be cast to
: > org.apache.solr.analysis.CharStream
: >
: > May be you are typecasting Reader to subclass.
-Hoss
Re: half width katakana
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
The exception is expected if you use a CharStream-aware Tokenizer without
any CharFilter.
Please see example/solr/conf/schema.xml for how to configure a CharFilter
together with a CharStreamAware*Tokenizer:
<!-- charFilter + "CharStream aware" WhitespaceTokenizer -->
<!--
<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
-->
Thank you,
Koji
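For the half-width katakana case raised in this thread, the same pattern could be adapted roughly as follows. This is only a sketch, not something posted by anyone in the thread: the field type name textCjkNorm and the mapping file name mapping-halfwidth-katakana.txt are hypothetical, and the mapping file is assumed to be written by hand, since no such file is mentioned as shipping with Solr. The rule syntax shown in the trailing comment is assumed to follow the "source" => "target" format of mapping-ISOLatin1Accent.txt.

```xml
<!-- Sketch only: charFilter + "CharStream aware" CJK tokenizer.
     The field type name and the mapping file name are hypothetical. -->
<fieldType name="textCjkNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Rewrites characters before the tokenizer sees them. -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-halfwidth-katakana.txt"/>
    <!-- CharStream-aware tokenizer named in this thread; it corrects term offsets. -->
    <tokenizer class="solr.CharStreamAwareCJKTokenizerFactory"/>
  </analyzer>
</fieldType>
<!-- Example rules for the hand-written mapping file (assumed format):
       "\uFF76" => "\u30AB"
       "\uFF76\uFF9E" => "\u30AC"
     The first maps half-width KA to full-width KA; the second maps
     half-width KA followed by the half-width voiced mark to full-width GA. -->
```

A complete mapping file would need one rule per half-width character, plus rules for the voiced and semi-voiced combinations, so that each two-character sequence collapses into a single full-width character.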
Re: half width katakana
Posted by Ashish P <as...@gmail.com>.
Koji san,
Using CharStreamAwareCJKTokenizerFactory gives me the following error:
SEVERE: java.lang.ClassCastException: java.io.StringReader cannot be cast to org.apache.solr.analysis.CharStream
Maybe you are casting the Reader to a subclass.
Thanks,
Ashish
Re: half width katakana
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
If you use a CharFilter, you should use a "CharStream aware" Tokenizer so
that term offsets are corrected.
There are two CharStreamAware*Tokenizer classes in trunk/Solr 1.4.
You probably want CharStreamAwareCJKTokenizer(Factory).
Koji
Re: half width katakana
Posted by Ashish P <as...@gmail.com>.
After this, should I keep using the same cjkAnalyzer, or use a charFilter?
Thanks,
Ashish
Re: half width katakana
Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
Ashish P wrote:
> I want to convert half width katakana to full width katakana. I tried using
> cjk analyzer but not working.
> Does cjkAnalyzer do it or is there any other way??
>
CharFilter, which comes with trunk/Solr 1.4, covers exactly this type of problem.
If you are using Solr 1.3, try the patch attached to the issue below:
https://issues.apache.org/jira/browse/SOLR-822
Koji