Posted to solr-user@lucene.apache.org by Mark <st...@gmail.com> on 2011/09/27 02:12:30 UTC

Searching multiple fields

I have a use case where I would like to search across two fields, but I
do not want a document that matches in both fields to be weighted higher
than a document that matches in only one field.

For example:

Document 1
  - Field A: "Foo Bar"
  - Field B: "Foo Baz"

Document 2
  - Field A: "Foo Blarg"
  - Field B: "Something else"

Now when I search for "Foo", I would like documents 1 and 2 to be
scored similarly. However, document 1 will be scored much higher in this
use case because it matches in both fields. I could create a third field
and use the copyField directive to search across that, but I was wondering
if there is an alternative way. It would be nice if we could search across
some sort of "virtual field" that uses both underlying fields but does
not actually increase the size of the index.

Thanks

Re: Searching multiple fields

Posted by Way Cool <wa...@gmail.com>.
It would be nice if we could have dissum in addition to dismax. ;-)

On Tue, Sep 27, 2011 at 9:26 AM, lee carroll <le...@googlemail.com> wrote:

> see
>
>
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html

Re: Searching multiple fields

Posted by lee carroll <le...@googlemail.com>.
see

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html



On 27 September 2011 16:04, Mark <st...@gmail.com> wrote:
> I thought that a similarity class will only affect the scoring of a single
> field.. not across multiple fields? Can anyone else chime in with some
> input? Thanks.

Re: Searching multiple fields

Posted by Mark <st...@gmail.com>.
I thought that a similarity class would only affect the scoring of a
single field, not scoring across multiple fields? Can anyone else chime
in with some input? Thanks.

On 9/26/11 9:02 PM, Otis Gospodnetic wrote:
> Hi Mark,
>
> Eh, I don't have Lucene/Solr source code handy, but I *think* for that you'd need to write custom Lucene similarity.

Re: Searching multiple fields

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Mark,

Eh, I don't have the Lucene/Solr source code handy, but I *think* for that you'd need to write a custom Lucene Similarity.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



Re: Searching multiple fields

Posted by Chris Hostetter <ho...@fucit.org>.
: I have a use case where I would like to search across two fields but I do not
: want to weight a document that has a match in both fields higher than a
: document that has a match in only 1 field.

Use dismax and set the "tie" param to "0.0" (so it's a true "max", with no
score boost for matching in multiple fields).
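
To illustrate why "tie" matters, here is a rough Python sketch of how a disjunction-max query combines per-field scores: the best field score, plus "tie" times the scores of the other matching fields. The function name and the per-field scores are made up for illustration; real Lucene scores depend on tf/idf, norms, and so on.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores: best field plus tie * the remaining fields."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Hypothetical per-field scores for the query "Foo" (not real Lucene output).
doc1 = [0.5, 0.5]  # matches in Field A and Field B
doc2 = [0.5, 0.0]  # matches in Field A only

# With tie=0.0 only the best field counts, so both documents score the same.
tied = dismax_score(doc1, tie=0.0) == dismax_score(doc2, tie=0.0)

# With tie=1.0 the field scores are summed, so document 1 pulls ahead.
summed = dismax_score(doc1, tie=1.0) > dismax_score(doc2, tie=1.0)
```

Values of "tie" between 0.0 and 1.0 fall in between: a match in a second field helps a little, but less than it would with a plain sum.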

https://wiki.apache.org/solr/DisMax
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
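
For what it's worth, a request along these lines could be built as in the sketch below. The host, handler path, and the field names "field_a" and "field_b" are placeholders, not anything from the original question; "defType", "q", "qf" and "tie" are the standard dismax request parameters.

```python
from urllib.parse import urlencode


def dismax_url(base_url, query, fields, tie=0.0):
    """Build a Solr select URL that searches `fields` with the dismax parser."""
    params = {
        "defType": "dismax",      # use the dismax query parser
        "q": query,               # the user's search terms
        "qf": " ".join(fields),   # fields to search, space separated
        "tie": tie,               # 0.0 = pure max across fields
    }
    return base_url + "?" + urlencode(params)


# Placeholder host and field names, for illustration only.
url = dismax_url(
    "http://localhost:8983/solr/select",
    "Foo",
    ["field_a", "field_b"],
)
```

Adding debugQuery=true to such a request should show the per-field scores and how they were combined, which is a handy way to confirm that the tie setting is doing what you expect.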


-Hoss