You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Melissa Hao <mh...@salesforce.com> on 2010/02/05 02:15:08 UTC

For a two word wildcard query like name:john s* , can I avoid expanding s* for the entire name field?

Hi,

I am wondering about wildcard queries that are more than one word, such as:

    name:john s*

Note: All terms are required (default boolean operator is AND).

I know that for the query

    name:s*

The s* is expanded over all s* terms in the name field.

For the "john s*" case, is it possible to restrict documents to
documents that have "john", before expanding s*?  I know that's not how
Lucene normally works.  Lucene normally

1) Preprocesses the s* by replacing s* with (s OR sa OR sb OR sc ...).
2) Runs the query (john AND (s OR sa OR sb OR sc ...)).  For AND, Lucene
does the optimization where it figures which term is in the least number
of documents.  But the s* expansion already happened in step 1.

I'm curious because single-character wildcards are expensive for us, but
if we could somehow expand the wildcard only over the "john" documents,
that would help.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org