Posted to java-user@lucene.apache.org by Joerg Erdmenger <jo...@gmail.com> on 2006/02/14 09:45:13 UTC

unwanted processing of keyword searches

Hi,

I have a small problem, and I'm not sure whether it lies with Lucene or with
my understanding of it. I have built a search tool that uses a very simple
custom analyzer, which just chains some of the standard filters like this:

TokenStream result = new StandardTokenizer(reader);
result = new StandardFilter(result);
result = new LowerCaseFilter(result);
result = new StopFilter(result, stopSet);
result = new PorterStemFilter(result);
return result;

Now I store an ISBN number in my index (which is created by an IndexWriter
using the analyzer above) like this:
Field.Keyword("isbn", "my isbn no");

Now if I look at the index with Luke, the ISBN number is stored exactly as I
would expect: unmodified, which is especially important as ISBN numbers can
contain a capital 'X'.

Anyway, the problem occurs when I search for an ISBN using a query like
this:

isbn:012345678X

using a QueryParser that is given the same analyzer. For some reason the
query ends up as isbn:012345678x, which doesn't return anything. Am I
missing something here? Shouldn't the query parser apply its modifications
depending on the type of the field being searched, and therefore not run my
query through the analyzer if it is a Keyword field (i.e. no tokenization)?
Or do I need to do something special?
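
To make the mismatch concrete, here is a minimal sketch in plain Java of what I think is happening. It uses no Lucene classes at all; the class and method names (KeywordMismatchDemo, indexKeyword, analyzeQueryTerm, matches) are made up for illustration, and the only analyzer step it imitates is the lowercasing.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; none of this is Lucene code.
final class KeywordMismatchDemo {
    // Stands in for the terms stored in the "isbn" field of the index.
    static final Set<String> indexedTerms = new HashSet<>();

    // Imitates indexing an untokenized (Keyword) field: the value is stored verbatim.
    static void indexKeyword(String value) {
        indexedTerms.add(value);
    }

    // Imitates only the lowercasing step that my analyzer applies to a query term.
    static String analyzeQueryTerm(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    // Exact, case-sensitive term lookup against the stored values.
    static boolean matches(String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        indexKeyword("012345678X");
        String analyzed = analyzeQueryTerm("012345678X");
        System.out.println(analyzed);               // prints 012345678x
        System.out.println(matches(analyzed));      // prints false
        System.out.println(matches("012345678X"));  // prints true
    }
}
```

The stored value keeps its capital 'X', the analyzed query term does not, and an exact lookup therefore finds nothing.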

Slightly confused

Jörg