Posted to dev@lucene.apache.org by "王海涛 (JIRA)" <ji...@apache.org> on 2016/12/27 05:40:58 UTC

[jira] [Updated] (SOLR-9894) Tokenizer work randomly

     [ https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

王海涛 updated SOLR-9894:
----------------------
    Description: 
My schema.xml has a fieldType as follows:
<fieldType name="my_ik" class="solr.TextField">
	<analyzer type="index">
		<tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
		<filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true" minTermLength="2"/>
		<filter class="solr.LowerCaseFilterFactory"/>
	</analyzer>
	<analyzer type="query">
		<tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
		<filter class="solr.LowerCaseFilterFactory"/>
	</analyzer>
</fieldType>

Note:
  the index tokenizer has useSmart="false"
  the query tokenizer has useSmart="true"

But when I send a query request with the q parameter,
the query tokenizer sometimes behaves as if useSmart is true
and sometimes as if useSmart is false.
That is terrible!
I guess the problem may be caused by a tokenizer cache.
When I query, the tokenizer should use true as the value of useSmart,
but it seems to have cached the wrong tokenizer, one that was created by the IndexWriter, which uses false as the value of useSmart.
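
To make the "sometimes true, sometimes false" observation measurable, the query-time tokens for one fixed input can be collected over many requests and the distinct outcomes counted. The sketch below covers only the counting step. The token lists are made-up placeholders, not real IK output, and the function names are hypothetical.

```python
from collections import Counter


def summarize_runs(runs):
    """Count how often each distinct token sequence occurred across runs."""
    return Counter(tuple(tokens) for tokens in runs)


def is_stable(runs):
    """Return True when every run produced the same token sequence."""
    return len(summarize_runs(runs)) <= 1


# Placeholder data standing in for the query-time tokens of four
# identical requests; real values would come from the analysis output.
runs = [["abc"], ["abc"], ["a", "ab", "abc"], ["abc"]]

for tokens, count in summarize_runs(runs).items():
    print(count, list(tokens))
print("stable:", is_stable(runs))
```

If more than one distinct sequence shows up for identical input, the query analyzer is not behaving deterministically, which would be consistent with the caching guess above but does not by itself prove it.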


> Tokenizer work randomly
> -----------------------
>
>                 Key: SOLR-9894
>                 URL: https://issues.apache.org/jira/browse/SOLR-9894
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.2.1
>         Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>            Reporter: 王海涛
>            Priority: Critical
>              Labels: patch
>


