Posted to solr-user@lucene.apache.org by neosky <ne...@yahoo.com> on 2012/04/07 19:15:06 UTC
Two questions about the NGramTokenizerFactory
I am using Solr 3.5.
1. It seems that the NGramTokenizerFactory only tokenizes the first 1024
characters. I searched for the problem on the Internet; somebody had noticed
this bug in 2007, but I can't find a solution.
PS: I have already raised my max field length:
<maxFieldLength>50000</maxFieldLength>
This is very critical for me.
2. The second question: suppose I define
minGramSize=3
maxGramSize=8
What happens when I search with a query of length 5? Does it work?
My idea is to use copyField to set up the grams from 3 to 8, but I am
not sure that is a solution. I am very worried about indexing speed: I spent
more than 6 hours indexing just the grams of size 7 and 8 for testing.
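
To make the second question concrete, here is a minimal sketch in plain Java of what splitting a string into grams of size 3 to 8 produces. It does not call Lucene or Solr; the class name NGramSketch and the method ngrams are made up for illustration, and the order in which the real tokenizer emits grams may differ. Assuming the query is analyzed with the same gram settings as the index, a 5-character query can only yield grams of size 3, 4 and 5.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
class NGramSketch {

    // Returns every substring of text whose length is between minGram and
    // maxGram (inclusive), smallest gram size first, scanning left to right.
    static List<String> ngrams(String text, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int size = minGram; size <= maxGram; size++) {
            // When size exceeds the text length this inner loop never runs,
            // so sizes 6 to 8 add nothing for a 5-character input.
            for (int start = 0; start + size <= text.length(); start++) {
                grams.add(text.substring(start, start + size));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [abc, bcd, cde, abcd, bcde, abcde]
        System.out.println(ngrams("abcde", 3, 8));
    }
}
```

Whether those query grams then match as a phrase or as independent terms depends on the query parser and field configuration, which this sketch does not model.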
Thanks!
--
View this message in context: http://lucene.472066.n3.nabble.com/Two-questions-about-the-Ngramtokenizerfactory-tp3893045p3893045.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Two questions about the NGramTokenizerFactory
Posted by neosky <ne...@yahoo.com>.
neosky wrote
>
> I am using Solr 3.5.
> 1. It seems that the NGramTokenizerFactory only tokenizes the first 1024
> characters. I searched for the problem on the Internet; somebody had noticed
> this bug in 2007, but I can't find a solution.
> PS: I have already raised my max field length:
> <maxFieldLength>50000</maxFieldLength>
> This is very critical for me.
>
> As far as I know, it has not been fixed. In NGramTokenizer:
> char[] chars = new char[1024];
> input.read(chars);
> But I don't know what the difference is between NGramTokenizer and
> NGramTokenFilter.
> If I want to write my own Analyzer, which should I use?
>
>
> 2. The second question: suppose I define
> minGramSize=3
> maxGramSize=8
> What happens when I search with a query of length 5? Does it work?
> My idea is to use copyField to set up the grams from 3 to 8, but I
> am not sure that is a solution. I am very worried about indexing speed: I
> spent more than 6 hours indexing just the grams of size 7 and 8 for testing.
> Thanks!
>
I still need time to index the data before I can test this.
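
The two lines quoted from NGramTokenizer above call read once on a 1024-char buffer, so nothing past that buffer is ever seen. As a rough sketch of the general workaround (this is not the Lucene fix, and the names ReadAllSketch and readFully are hypothetical), the whole Reader can be drained in a loop before any grams are produced, since Reader.read(char[]) returns the number of chars read and -1 at end of stream:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only; not taken from Lucene.
class ReadAllSketch {

    // Reads the reader until end of stream instead of calling read once.
    static String readFully(Reader input) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            // read returns -1 at end of stream; it may also return fewer
            // chars than the buffer holds, so only append the first n.
            while ((n = input.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        StringBuilder longInput = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            longInput.append('a');
        }
        // Prints 5000: all characters are read, not only the first 1024.
        System.out.println(readFully(new StringReader(longInput.toString())).length());
    }
}
```

Buffering the entire field value in memory is a simplification; for very large fields a streaming approach would be preferable.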
--
View this message in context: http://lucene.472066.n3.nabble.com/Two-questions-about-the-Ngramtokenizerfactory-tp3893045p3894851.html