Posted to java-user@lucene.apache.org by John Paul Sondag <js...@uiuc.edu> on 2007/07/30 18:05:15 UTC
Tokenizer
I have two questions.
First, is there a tokenizer that simply makes a token out of every word?
That is, one that splits on whitespace and turns the characters between
two consecutive whitespace characters into a token?
Second, if such a tokenizer exists, is there a difference between using it
and simply adding the field to the document with Field.Index.UN_TOKENIZED?
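
To make the comparison concrete, here is a rough sketch of the two behaviors
I have in mind. It is plain Java and does not use any Lucene classes; the
class and method names (TokenizeSketch, whitespaceTokens, singleToken) are
made up for illustration, and treating an untokenized field as one token
holding the whole value is my assumption about what happens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: hypothetical helpers, not part of the Lucene API.
class TokenizeSketch {

    // One token per whitespace-separated word.
    static List<String> whitespaceTokens(String value) {
        String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Assumption: an untokenized field behaves like a single token
    // containing the entire field value, spaces included.
    static List<String> singleToken(String value) {
        return Collections.singletonList(value);
    }

    public static void main(String[] args) {
        String value = "quick brown fox";
        System.out.println(whitespaceTokens(value)); // prints [quick, brown, fox]
        System.out.println(singleToken(value));      // prints [quick brown fox]
    }
}
```

If that picture is right, the two would only look the same for field values
that contain no whitespace at all.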
--JP