Posted to java-user@lucene.apache.org by John Paul Sondag <js...@uiuc.edu> on 2007/07/30 18:05:15 UTC

Tokenizer

I have two questions.

First, is there a tokenizer that simply makes a token out of every word?
That is, one that splits the text on whitespace and turns each run of
characters between two whitespace characters into a token?

Second, if such a tokenizer exists, is there any difference between using it
and simply storing the field in the document with Field.Index.UN_TOKENIZED?
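
To make the comparison concrete, here is a minimal plain-Java sketch of the two behaviours I am asking about. It deliberately does not use the Lucene API: the class and method names (TokenizeSketch, whitespaceTokens, singleToken) are made up for illustration, and singleToken reflects my assumption that UN_TOKENIZED indexes the whole field value as one term.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
class TokenizeSketch {

    // Split on runs of whitespace so that every word becomes its own token.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {   // an empty input yields no tokens
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Assumed UN_TOKENIZED-like behaviour: the whole value is a single token.
    static List<String> singleToken(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        String value = "New York City";
        System.out.println(whitespaceTokens(value)); // prints [New, York, City]
        System.out.println(singleToken(value));      // prints [New York City]
    }
}
```

If that assumption holds, the two would only look the same for field values that contain no whitespace at all.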

--JP