Posted to solr-user@lucene.apache.org by Daniel Naber <lu...@danielnaber.de> on 2007/07/31 00:34:08 UTC

searching multiple fields

Hi,

I want to search multiple fields by default (which is not supported by 
StandardRequestHandler), but I also want to be able to use Lucene's 
boolean syntax (AND/OR/NOT). This doesn't seem to be supported by 
DisMaxRequestHandler. Will I need to copy or extend StandardRequestHandler 
and modify it, including the query parser it calls, or am I missing an 
easier alternative?

Regards
 Daniel

-- 
http://www.danielnaber.de

Re: searching multiple fields

Posted by Daniel Naber <lu...@danielnaber.de>.
On Thursday 02 August 2007 20:18, Walter Underwood wrote:

> I agree about the fussiness and mystery of good values for minimum
> match, but the requestor wanted 100% all the time. That is easy.

But I want it only by default, with an easy way to go back to OR for parts 
of the query, e.g. doing a search like: linux (speed OR performance)
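
To make the requested behavior concrete, here is a minimal sketch that builds
the explicit Lucene query such a search would have to expand to. The field
names "title" and "body" and the helper functions are hypothetical; this is
plain string building, not an existing Solr or Lucene parser.

```python
def across_fields(term, fields):
    """Match a single term in any of the given fields."""
    return "(" + " OR ".join(f"{field}:{term}" for field in fields) + ")"


def any_of(groups):
    """At least one of the groups has to match (the explicit OR part)."""
    return "(" + " OR ".join(groups) + ")"


def require_all(groups):
    """Every group has to match (the AND-by-default part)."""
    return " ".join("+" + group for group in groups)


# Hypothetical default search fields.
FIELDS = ["title", "body"]

# linux (speed OR performance)
expanded = require_all([
    across_fields("linux", FIELDS),
    any_of([
        across_fields("speed", FIELDS),
        across_fields("performance", FIELDS),
    ]),
])
print(expanded)
```

The explicit OR inside each group keeps the result independent of the
configured default operator.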

Regards
 Daniel

-- 
http://www.danielnaber.de

Re: searching multiple fields

Posted by Walter Underwood <wu...@netflix.com>.
I agree about the fussiness and mystery of good values for minimum match,
but the requestor wanted 100% all the time. That is easy.

I think spell suggestions are harder than search, so "assume great spell
suggestions" is not a good fix for a bad default (all terms).

wunder


On 8/2/07 11:13 AM, "Daniel Naber" <lu...@danielnaber.de> wrote:

> On Thursday 02 August 2007 18:46, Walter Underwood wrote:
> 
>> Use the minimum match spec for a flexible version of all-terms
>> matching.
> 
> I think this is too difficult and unpredictable. I also don't know how I
> should justify a setting like "75%" just because it may work fine for
> some examples.
> 
>> One wrong or misspelled word means no matches, and searchers don't
>> know how to fix their query. If they couldn't spell it the first time,
>> why should they be able to spell it a second time?
> 
> That's what the spell checker is for.
> 
> Regards
>  Daniel


Re: searching multiple fields

Posted by Daniel Naber <lu...@danielnaber.de>.
On Thursday 02 August 2007 18:46, Walter Underwood wrote:

> Use the minimum match spec for a flexible version of all-terms
> matching.

I think this is too difficult and unpredictable. I also don't know how I 
should justify a setting like "75%" just because it may work fine for 
some examples.

> One wrong or misspelled word means no matches, and searchers don't
> know how to fix their query. If they couldn't spell it the first time,
> why should they be able to spell it a second time?

That's what the spell checker is for.

Regards
 Daniel

-- 
http://www.danielnaber.de

Re: searching multiple fields

Posted by Walter Underwood <wu...@netflix.com>.
Use the minimum match spec for a flexible version of all-terms
matching. 

Before implementing all-terms matching, start logging the number of
searches that result in no matches. All-terms can cause big problems.
One wrong or misspelled word means no matches, and searchers don't
know how to fix their query. If they couldn't spell it the first time,
why should they be able to spell it a second time?
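
A minimal sketch of that logging step, assuming the application can record
each query together with the number of hits it returned. The log format and
the function name are hypothetical, not part of Solr.

```python
from collections import Counter


def zero_hit_report(search_log, top_n=10):
    """Return the share of searches with no matches and the most frequent
    zero-hit queries.

    search_log is an iterable of (query, num_found) pairs.
    """
    total = 0
    misses = Counter()
    for query, num_found in search_log:
        total += 1
        if num_found == 0:
            # Normalize so that "Linx " and "linx" count as the same query.
            misses[query.strip().lower()] += 1
    zero_hits = sum(misses.values())
    rate = zero_hits / total if total else 0.0
    return rate, misses.most_common(top_n)


sample_log = [("linux", 12), ("linx", 0), ("Linx ", 0), ("speed", 3)]
rate, worst = zero_hit_report(sample_log)
print(rate, worst)
```

Comparing this rate before and after switching to all-terms matching shows
how many searchers the change leaves with an empty result page.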

wunder

On 8/1/07 11:15 AM, "Daniel Naber" <lu...@danielnaber.de> wrote:

> On Wednesday 01 August 2007 09:47, Chris Hostetter wrote:
> 
>> for the record, using the Lucene boolean options "+" and "-" do work in
>> the "q" expression for the dismax handler ... for that matter, the
>> boolean keywords AND, OR, and NOT work as well
> 
> The only case that doesn't seem to work (and that's the one I'm interested
> in) is to have AND by default. With DisMaxRequestHandler you can have AND by
> default for all terms, but as you don't have the OR operator you have
> *only* AND...
> 
> Regards
>  Daniel


Re: searching multiple fields

Posted by Daniel Naber <lu...@danielnaber.de>.
On Wednesday 01 August 2007 09:47, Chris Hostetter wrote:

> for the record, using the Lucene boolean options "+" and "-" do work in
> the "q" expression for the dismax handler ... for that matter, the
> boolean keywords AND, OR, and NOT work as well

The only case that doesn't seem to work (and that's the one I'm interested 
in) is to have AND by default. With DisMaxRequestHandler you can have AND by 
default for all terms, but as you don't have the OR operator you have 
*only* AND...

Regards
 Daniel

-- 
http://www.danielnaber.de

Re: searching multiple fields

Posted by Walter Underwood <wu...@netflix.com>.
You get that behavior by avoiding any extra syntax. Use this query:

  a:valueAlpha b:valueBeta c:valueGamma

If one of the terms is very common and one is very rare, the results might
not sort on pure existence. This is a tf.idf engine.
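
A rough illustration of why, assuming the classic Lucene idf formula
1 + ln(numDocs / (docFreq + 1)) and a deliberately simplified score that
ignores term frequency, length norms and query normalization. The document
frequencies are invented for the example.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Inverse document frequency in the style of Lucene's classic similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def simplified_score(matched_terms, query_terms, doc_freqs, num_docs):
    """Sum of squared idf over the matched terms, scaled by the fraction of
    query terms that matched (a stand-in for the coordination factor)."""
    coord = len(matched_terms) / len(query_terms)
    weight = sum(classic_idf(doc_freqs[t], num_docs) ** 2 for t in matched_terms)
    return coord * weight


NUM_DOCS = 1_000_000
# Invented document frequencies: two very common values and one very rare one.
DOC_FREQS = {"valueAlpha": 500_000, "valueBeta": 400_000, "valueGamma": 10}
QUERY = ["valueAlpha", "valueBeta", "valueGamma"]

two_common = simplified_score(["valueAlpha", "valueBeta"], QUERY, DOC_FREQS, NUM_DOCS)
one_rare = simplified_score(["valueGamma"], QUERY, DOC_FREQS, NUM_DOCS)
print(two_common, one_rare)
```

With these numbers the document that matches only the rare value scores
higher than the one that matches both common values, so the ranking is not a
simple count of matching terms.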

wunder

On 8/1/07 11:00 AM, "Lance Lance" <go...@gmail.com> wrote:

> On this subject:
> 
> I thought that this query would find at least one of the given values:
> +(a:valueAlpha a:valueBeta a:valueGamma)
> It would sort returns by 'have all 3', 'have 2', and 'have 1'. In fact, it
> only finds records with all three. That is, it is exactly the same as:
> +a:valueAlpha +a:valueBeta +a:valueGamma
> I have to use OR between the values.
> 
> Is this supposed to be true?
> 
> Thanks,
> 
> Lance


RE: searching multiple fields

Posted by Lance Lance <go...@gmail.com>.
On this subject:

I thought that this query would find at least one of the given values:
	 +(a:valueAlpha a:valueBeta a:valueGamma) 
It would sort returns by 'have all 3', 'have 2', and 'have 1'. In fact, it
only finds records with all three. That is, it is exactly the same as:
	+a:valueAlpha +a:valueBeta +a:valueGamma
I have to use OR between the values.

Is this supposed to be true?

Thanks,

Lance


Re: searching multiple fields

Posted by Walter Underwood <wu...@netflix.com>.
This caused me a certain amount of trouble, because the parser
throws errors on ill-formed queries. Try these:

   foo -
   TO HAVE AND HAVE NOT
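
One possible workaround is to defuse user input before it reaches the
parser. The sketch below assumes the failures come from dangling "+"/"-"
operators and from uppercase boolean keywords with nothing after them, which
is a guess based on the two examples above. The function is hypothetical and
is not part of Solr or Lucene.

```python
BOOLEAN_KEYWORDS = {"AND", "OR", "NOT"}


def defuse(raw_query):
    """Turn operator-like tokens into plain search terms.

    Uppercase AND/OR/NOT are lowercased, since the parser only treats the
    uppercase forms as operators, and tokens made up solely of "+" or "-"
    are dropped.
    """
    tokens = []
    for token in raw_query.split():
        if token in BOOLEAN_KEYWORDS:
            token = token.lower()
        if token.strip("+-"):
            tokens.append(token)
    return " ".join(tokens)


print(defuse("foo -"))
print(defuse("TO HAVE AND HAVE NOT"))
```

The trade-off is that searchers lose the boolean keywords entirely, so this
only fits a front end where plain keyword search is the intended behavior.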

wunder

On 8/1/07 12:47 AM, "Chris Hostetter" <ho...@fucit.org> wrote:

> 
> : > StandardRequestHandler), but I also want to be able to use Lucene's
> : > boolean syntax (AND/OR/NOT). This doesn't seem to be supported by
> : > DisMaxRequestHandler. I will need to copy or extend
> 
> for the record, using the Lucene boolean options "+" and "-" do work in
> the "q" expression for the dismax handler ... for that matter, the boolean
> keywords AND, OR, and NOT work as well (although i never intended them
> to.  funny story: when i was writing dismax, i assumed i needed to do
> something to prevent AND/OR/NOT from working, after writing most of it i
> went to test it and discovered they didn't work and figured something else
> i was doing in my QueryParser subclass was already taking care of it and
> moved on to deal with other problems --- it wasn't until months later that
> i realized i was an idiot and was typing "and" but the QueryParser only
> recognizes the uppercase versions)
> 
> The only part of the "boolean" syntax that doesn't work is complex boolean
> expressions using parens.
> 
> 
> 
> 
> 
> -Hoss
> 


Re: searching multiple fields

Posted by Chris Hostetter <ho...@fucit.org>.
: > StandardRequestHandler), but I also want to be able to use Lucene's
: > boolean syntax (AND/OR/NOT). This doesn't seem to be supported by
: > DisMaxRequestHandler. I will need to copy or extend

for the record, using the Lucene boolean options "+" and "-" do work in
the "q" expression for the dismax handler ... for that matter, the boolean
keywords AND, OR, and NOT work as well (although i never intended them
to.  funny story: when i was writing dismax, i assumed i needed to do
something to prevent AND/OR/NOT from working, after writing most of it i
went to test it and discovered they didn't work and figured something else
i was doing in my QueryParser subclass was already taking care of it and
moved on to deal with other problems --- it wasn't until months later that
i realized i was an idiot and was typing "and" but the QueryParser only
recognizes the uppercase versions)

The only part of the "boolean" syntax that doesn't work is complex boolean
expressions using parens.





-Hoss


Re: searching multiple fields

Posted by Mike Klaas <mi...@gmail.com>.
On 30-Jul-07, at 3:34 PM, Daniel Naber wrote:

> Hi,
>
> I want to search multiple fields by default (which is not supported by
> StandardRequestHandler), but I also want to be able to use Lucene's
> boolean syntax (AND/OR/NOT). This doesn't seem to be supported by
> DisMaxRequestHandler. I will need to copy or extend StandardRequestHandler
> and modify it, including the query parser it calls, or am I missing an
> easier alternative?

You could write your own request handler, but I think the route that
most people take is to stop using Lucene query syntax and move to
dismax params to express requirements. This also plays better with
caching.

NOT clauses -> fqs

required clauses (either OR or AND) -> q + mm

purely optional clauses -> bq/bf
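
The mapping above can be sketched as a set of request parameters. The
parameter names (q, qf, mm, fq, bq) are the dismax ones; the field names, the
example clauses, and the assumption that the handler is registered under the
name "dismax" and selected with qt are illustrative only.

```python
from urllib.parse import urlencode


def dismax_params(user_query, excluded=(), optional_boosts=()):
    """Express a search as dismax parameters instead of Lucene query syntax."""
    params = [
        ("qt", "dismax"),         # assumed handler name
        ("q", user_query),        # required clauses
        ("qf", "title^2 body"),   # hypothetical fields to search by default
        ("mm", "100%"),           # minimum match: all terms required
    ]
    for clause in excluded:
        params.append(("fq", "-" + clause))   # NOT clauses become filters
    for clause in optional_boosts:
        params.append(("bq", clause))         # purely optional clauses
    return params


params = dismax_params(
    "linux performance",
    excluded=["category:archived"],
    optional_boosts=["type:howto^5"],
)
query_string = urlencode(params)
print(query_string)
```

Each fq is cached independently of the main query, which is one way to read
the remark about caching.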

If you want complicated (read: parenthesized) boolean logic, it's  
best to develop your own solution.

-Mike