Posted to java-user@lucene.apache.org by Jonathan Franzone <jo...@franzone.com> on 2002/02/26 18:26:56 UTC

Boolean Query Parsing with "IN" keyword

I'm trying to search on a US State field. The Lucene field name is "state"
and so I'm building a query like: +(state:fl state:al state:in) to search
for documents in Florida, Alabama, or Indiana. But whenever I pass "in" or
"IN" to the QueryParser it strips it out. Passing the above query to the
QueryParser yields +(state:fl state:al). Is there a way to escape the "in"
keyword? I've tried enclosing it in double and single quotes, neither of
which worked.

Thanks,
Jonathan Franzone



--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: Boolean Query Parsing with "IN" keyword

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Jonathan,

That's most likely caused by StandardAnalyzer, which you are probably
using.  'in' is listed as one of the stop words:

    public static final String[] STOP_WORDS = {
        "a", "and", "are", "as", "at", "be", "but", "by",
        "for", "if", "in", "into", "is", "it",
        "no", "not", "of", "on", "or", "s", "such",
        "t", "that", "the", "their", "then", "there", "these",
        "they", "this", "to", "was", "will", "with"
    };

Try searching for state:or
Since 'or' is also in that list, it should yield no matches.

However, StandardAnalyzer is no longer final (get the latest build), so
you can write a class that subclasses it and calls this StandardAnalyzer
constructor:

    /** Builds an analyzer with the given stop words. */
    public StandardAnalyzer(String[] stopWords) {
        stopTable = StopFilter.makeStopTable(stopWords);
    }

Pass it your own list of stop words and you are done.
If you've already indexed some data you have to be careful which words
you choose as stop words.  I suggest sticking with the above list
(minus 'in', 'or', etc.) for now.
Once you have your class use it instead of StandardAnalyzer.
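
As a rough sketch of the suggestion above, the stop list can be built by
filtering the default words rather than retyping them. The class name
StateStopWords and the helper without() are made up for illustration; only
the word list and the StandardAnalyzer(String[]) constructor come from the
text above, and the analyzer call is left as a comment because it assumes a
Lucene build that exposes that constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, not part of Lucene.
final class StateStopWords {

    // Default stop words as quoted above from StandardAnalyzer.
    static final String[] DEFAULT_STOP_WORDS = {
        "a", "and", "are", "as", "at", "be", "but", "by",
        "for", "if", "in", "into", "is", "it",
        "no", "not", "of", "on", "or", "s", "such",
        "t", "that", "the", "their", "then", "there", "these",
        "they", "this", "to", "was", "will", "with"
    };

    // Returns a copy of stopWords with every word in 'keep' removed,
    // so those words are no longer treated as stop words.
    static String[] without(String[] stopWords, String... keep) {
        Set<String> keepSet = new HashSet<>(Arrays.asList(keep));
        List<String> result = new ArrayList<>();
        for (String word : stopWords) {
            if (!keepSet.contains(word)) {
                result.add(word);
            }
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "in" (Indiana) and "or" (Oregon) collide with state codes.
        String[] custom = without(DEFAULT_STOP_WORDS, "in", "or");
        System.out.println(Arrays.toString(custom));

        // Assumed usage, based on the constructor quoted above:
        // Analyzer analyzer = new StandardAnalyzer(custom);
        // Pass 'analyzer' wherever StandardAnalyzer was used before.
    }
}
```

The same analyzer should probably be used for both indexing and query
parsing; documents already indexed with the default list may need to be
re-indexed before state:in can match.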

Otis






