Posted to java-user@lucene.apache.org by Bill Au <bi...@gmail.com> on 2007/05/04 21:17:37 UTC

QueryParser, PrefixQuery, and case sensitivity

I have an index with both case-sensitive and case-insensitive fields.  I
am trying to use a QueryParser to accept queries from end users for
searching.  By default, QueryParser lowercases the prefix text when it
creates the PrefixQuery, so wildcard searches on the case-sensitive
fields do not work.  If I use QueryParser.setLowercaseWildcardTerm(false),
then wildcard searches on the case-insensitive fields do not work.

Here is an example with two fields, name (case sensitive) and desc (case
insensitive).  The document is:

name (case sensitive): PowerBook
desc (case insensitive): professional  mac laptop

I want to be able to find the document with the following query:

+name:Power* +desc:Pro*
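
One way to get both behaviors from a single parser is to decide per field
whether the prefix text is lowercased.  The sketch below is plain Java with
no Lucene dependency; PrefixNormalizer, CASE_INSENSITIVE_FIELDS,
normalizePrefix and matches are hypothetical names used only for
illustration.  In Lucene itself the same decision could presumably live in
a QueryParser subclass that overrides getPrefixQuery(String field, String
termStr), assuming the parser's own lowercasing is switched off so the
override is the only place it happens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for the per-field decision; no Lucene classes are used.
final class PrefixNormalizer {
    // Hypothetical configuration: fields whose indexed terms were lowercased.
    private static final Set<String> CASE_INSENSITIVE_FIELDS =
            new HashSet<>(Arrays.asList("desc"));

    // Lowercase the prefix only when the field was indexed case-insensitively.
    static String normalizePrefix(String field, String prefix) {
        return CASE_INSENSITIVE_FIELDS.contains(field)
                ? prefix.toLowerCase(Locale.ROOT)
                : prefix;
    }

    // True when an indexed term starts with the normalized prefix.
    static boolean matches(String field, String indexedTerm, String prefix) {
        return indexedTerm.startsWith(normalizePrefix(field, prefix));
    }

    public static void main(String[] args) {
        // name keeps its case, so "Power" must match exactly as typed.
        System.out.println(matches("name", "PowerBook", "Power"));
        // desc was indexed in lower case, so "Pro" is lowercased first.
        System.out.println(matches("desc", "professional", "Pro"));
        // The wrong case on a case-sensitive field is not normalized.
        System.out.println(matches("name", "PowerBook", "power"));
    }
}
```

With this split, a prefix typed in the wrong case on the name field still
fails to match, which is the strict behavior wanted for case-sensitive
fields.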

Bill

Re: QueryParser, PrefixQuery, and case sensitivity

Posted by Bill Au <bi...@gmail.com>.
Erick,
     Thanks for the advice.  I will take a look at
PerFieldAnalyzerWrapper to see if I want to take this on.  In my
case, I have to use mixed case for a couple of fields since case
really does matter for them (i.e. apple is not the same as Apple), and I
actually don't want users to find the document if they enter the
"wrong" case (i.e. a search for Apple should not return docs containing
apple).

Bill

On 5/4/07, Erick Erickson <er...@gmail.com> wrote:
> Look at PerFieldAnalyzerWrapper. It allows you to use different
> analyzers on different fields during the query parsing phase.
>
> But I wouldn't go there if you don't have to. I suspect you'll spend a
> LOT of time tracking down errors in your use of a mixed case index.
> If for no other reason than your users will use the "wrong" case.
>
> Unless your index is huge (and I don't consider, say, 8G huge), I'd
> index everything in, say, lower case. And ditto for your query
> parsing.
>
> If you need to return data to the user in mixed case, then you can
> *store* (but perhaps not *index*) the display fields. So you search
> on one field and return data from another.
>
> Best
> Erick


Re: QueryParser, PrefixQuery, and case sensitivity

Posted by Erick Erickson <er...@gmail.com>.
Look at PerFieldAnalyzerWrapper. It allows you to use different
analyzers on different fields during the query parsing phase.
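
Conceptually, a per-field wrapper looks up an analyzer by field name and
falls back to a default.  The following is a plain-Java illustration of
that dispatch, not Lucene's PerFieldAnalyzerWrapper itself;
PerFieldDispatch, its addAnalyzer and analyze methods, and the two toy
analyzers are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Illustration only: dispatches to a toy "analyzer" chosen by field name.
final class PerFieldDispatch {
    // Toy analyzer: split on whitespace and lowercase each token.
    static final Function<String, List<String>> LOWERCASING = text -> {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    };

    // Toy analyzer: keep the whole value as a single token, case untouched.
    static final Function<String, List<String>> KEYWORD = text -> {
        List<String> tokens = new ArrayList<>();
        tokens.add(text);
        return tokens;
    };

    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldDispatch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register a field-specific analyzer that overrides the default.
    void addAnalyzer(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Analyze text with the field's analyzer, or the default if none is set.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    public static void main(String[] args) {
        PerFieldDispatch dispatch = new PerFieldDispatch(LOWERCASING);
        dispatch.addAnalyzer("name", KEYWORD);
        System.out.println(dispatch.analyze("name", "PowerBook"));
        System.out.println(dispatch.analyze("desc", "Professional  Mac Laptop"));
    }
}
```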

But I wouldn't go there if you don't have to. I suspect you'll spend a
LOT of time tracking down errors in your use of a mixed case index.
If for no other reason than your users will use the "wrong" case.

Unless your index is huge (and I don't consider, say, 8G huge), I'd
index everything in, say, lower case. And ditto for your query
parsing.

If you need to return data to the user in mixed case, then you can
*store* (but perhaps not *index*) the display fields. So you search
on one field and return data from another.
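
The "search on one field, return data from another" idea can be sketched
without Lucene as a lowercased lookup table kept next to the untouched
original text.  Everything here (SearchLowerReturnStored, add, search) is
a hypothetical illustration in plain Java, not Lucene's Field API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustration only: lowercased tokens are searchable, originals are returned.
final class SearchLowerReturnStored {
    // Lowercased token -> ids of documents containing it (the "indexed" side).
    private final Map<String, Set<Integer>> index = new HashMap<>();
    // Document id -> original mixed-case text (the "stored" side).
    private final List<String> stored = new ArrayList<>();

    // Store the display text as-is and index its tokens in lower case.
    void add(String displayText) {
        int id = stored.size();
        stored.add(displayText);
        for (String token : displayText.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            index.computeIfAbsent(token.toLowerCase(Locale.ROOT),
                    k -> new LinkedHashSet<>()).add(id);
        }
    }

    // Lowercase the user's term, look it up, and return the stored originals.
    List<String> search(String userTerm) {
        Set<Integer> ids = index.getOrDefault(
                userTerm.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
        List<String> hits = new ArrayList<>();
        for (int id : ids) {
            hits.add(stored.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        SearchLowerReturnStored docs = new SearchLowerReturnStored();
        docs.add("PowerBook G4");
        docs.add("apple pie");
        System.out.println(docs.search("powerbook"));
        System.out.println(docs.search("APPLE"));
    }
}
```

The user's case never affects matching, yet results come back exactly as
they were entered.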

Best
Erick
