Posted to dev@lucene.apache.org by "王海涛 (JIRA)" <ji...@apache.org> on 2016/12/28 02:11:58 UTC

[jira] [Comment Edited] (SOLR-9894) Tokenizer work randomly

    [ https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781805#comment-15781805 ] 

王海涛 edited comment on SOLR-9894 at 12/28/16 2:11 AM:
-----------------------------------------------------

I performed these 4 steps one after another: step1 ---> step2 ---> step3 ---> step4.

My guess is that step1 made Solr cache the tokenizer's index-time result rather than its query-time result,
so step2 used the tokenizer's index-time result, although a query should use the tokenizer's query-time result.

When step1 is followed by step2: 98% probability
When step3 is followed by step4: 98% probability
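
To make the guess above concrete, here is a minimal sketch in Python of how a cache keyed only by the input text could hand an index-time result to a query. It is a toy model and not Solr or IK code: `toy_tokenize`, `CachingAnalyzer` and its `_cache` are hypothetical names invented for the illustration, and nothing here confirms that Solr or the tokenizer actually behaves this way.

```python
def toy_tokenize(text, use_smart):
    """Toy stand-in for a tokenizer: coarse tokens if use_smart, else fine tokens."""
    if use_smart:
        return text.split()
    return [ch for ch in text if not ch.isspace()]


class CachingAnalyzer:
    # Hypothetical cache shared by all analyzers and keyed by text only.
    # The key omits use_smart, which is the flaw being illustrated.
    _cache = {}

    def __init__(self, use_smart):
        self.use_smart = use_smart

    def analyze(self, text):
        if text not in CachingAnalyzer._cache:
            CachingAnalyzer._cache[text] = toy_tokenize(text, self.use_smart)
        return CachingAnalyzer._cache[text]


index_analyzer = CachingAnalyzer(use_smart=False)
query_analyzer = CachingAnalyzer(use_smart=True)

index_tokens = index_analyzer.analyze("ab cd")  # fills the cache with fine tokens
query_tokens = query_analyzer.analyze("ab cd")  # cache hit: fine tokens again

print(index_tokens)  # ['a', 'b', 'c', 'd']
print(query_tokens)  # ['a', 'b', 'c', 'd'], not the expected ['ab', 'cd']
```

In this toy model, keying the cache by `(text, use_smart)` instead of `text` alone would keep the two results apart.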


was (Author: wanghaitao):
I performed these 4 steps one after another: step1-->step2-->step3-->step4.
My guess is that step1 made Solr cache the tokenizer's index-time result rather than its query-time result,
so step2 used the tokenizer's index-time result, although a query should use the tokenizer's query-time result.

When step1 is followed by step2: 98% probability
When step3 is followed by step4: 98% probability

> Tokenizer work randomly
> -----------------------
>
>                 Key: SOLR-9894
>                 URL: https://issues.apache.org/jira/browse/SOLR-9894
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.2.1
>         Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>            Reporter: 王海涛
>            Priority: Critical
>              Labels: patch
>         Attachments: step1.png, step2.png, step3.png, step4.png
>
>
> My schema.xml has a fieldType as follows:
> <fieldType name="my_ik" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true" minTermLength="2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> Note:
>   the index tokenizer's useSmart is false
>   the query tokenizer's useSmart is true
> But when I send a query request with the q parameter,
> the query tokenizer sometimes behaves as if useSmart is true
> and sometimes as if useSmart is false.
> That is terrible!
> I guess the problem may be caused by a tokenizer cache.
> When I query, the tokenizer should use true as the value of useSmart,
> but it had cached the wrong tokenizer result, which was created by the IndexWriter using false as the value of useSmart.
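
One way to compare what the index and query analyzers of my_ik produce is Solr's field analysis request handler, which runs a value through both chains and returns the tokens from each stage. The sketch below, in Python, only builds the request URL. It assumes the handler is reachable at /analysis/field on the collection; the base URL, the collection name `mycollection` and the sample text are placeholders, not values from this report.

```python
from urllib.parse import urlencode


def field_analysis_url(base_url, collection, field_type, text):
    """Build a field analysis URL that analyzes `text` with both analyzer chains."""
    params = {
        "analysis.fieldtype": field_type,  # analyze by field type, e.g. my_ik
        "analysis.fieldvalue": text,       # run through the index analyzer
        "analysis.query": text,            # run through the query analyzer
        "wt": "json",
    }
    return f"{base_url}/{collection}/analysis/field?{urlencode(params)}"


# Placeholder host and collection; replace with the real ones.
url = field_analysis_url("http://localhost:8983/solr", "mycollection", "my_ik", "sample text")
print(url)
```

Requesting the same URL several times and comparing the query section of each response would show whether the query chain's output really changes between calls.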



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
