Posted to solr-user@lucene.apache.org by neosky <ne...@yahoo.com> on 2012/04/10 06:52:57 UTC
which approach is correct?
Here are my fields:
<field name="id">101</field>
<field name="sequence">NGHGJGKGKLHJFKGJGKGK</field>
The sequence field ranges from 300 bytes to 56K bytes and contains no spaces.
I want to index character n-grams of sizes 3 to 8:
NGH GHG HGJ ...
NGHG GHGJ HGJG ...
...
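
To make the intended output concrete, here is a minimal Python sketch (not Solr code) of character n-grams of sizes 3 to 8 over one sequence value. The function names char_ngrams and ngram_count are hypothetical, and grouping the grams by size simply mirrors the listing above; it is not a claim about the order in which Solr emits tokens.

```python
def char_ngrams(sequence, min_size=3, max_size=8):
    """Return all character n-grams of `sequence`, shortest size first."""
    grams = []
    for size in range(min_size, max_size + 1):
        # A window of `size` characters fits at len(sequence) - size + 1 offsets.
        for start in range(len(sequence) - size + 1):
            grams.append(sequence[start:start + size])
    return grams


def ngram_count(length, min_size=3, max_size=8):
    """Number of n-grams produced for a sequence of `length` characters."""
    return sum(max(length - size + 1, 0)
               for size in range(min_size, max_size + 1))


if __name__ == "__main__":
    seq = "NGHGJGKGKLHJFKGJGKGK"  # sample value from the post
    grams = char_ngrams(seq)
    print(grams[:3])  # ['NGH', 'GHG', 'HGJ']
    # The closed-form count should agree with the generated list.
    print(len(grams) == ngram_count(len(seq)))
```

Since each gram size contributes roughly one token per character, ngram_count gives a rough idea of how many terms a single long sequence would add to the index under this scheme.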
<fieldType name="nGram1" class="solr.TextField"
           positionIncrementGap="100" stored="false" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" maxTokenLength="56000"/>
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="8"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
<fieldType name="nGram2" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="8"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
--
View this message in context: http://lucene.472066.n3.nabble.com/which-approach-is-correct-tp3898711p3898711.html