Posted to solr-user@lucene.apache.org by Karthick Duraisamy Soundararaj <ka...@gmail.com> on 2012/04/02 16:28:14 UTC

Merging results from two queries

Hi all,
        I need to merge the results of multiple queries to accomplish
something like this:

    1. Make query 1.
    2. If the results returned by query 1 fall below a certain
       threshold, make query 2.

Extending this idea, I want to be able to create a query chain, i.e.,
provide functionality where you can specify n queries and n-1 thresholds
in a single URL. The queries run in order from 1 to n until one of them
produces results that exceed its threshold.

PS: These n queries and n-1 thresholds are passed in a single URL, and each
query could use a different request handler and therefore take a different
set of parameters.
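
A rough sketch of the chain described above, in Python and independent of Solr. Each query is modelled as a hypothetical zero-argument callable that returns a list of documents, and `run_query_chain` is an illustrative name, not an existing Solr feature. With n queries and n-1 thresholds, the last query has no threshold, so its results are returned whatever their count.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for "run one query against its request handler".
Query = Callable[[], List[dict]]


def run_query_chain(queries: Sequence[Query],
                    thresholds: Sequence[int]) -> Tuple[int, List[dict]]:
    """Run queries in order until one returns more docs than its threshold.

    Expects n queries and n-1 thresholds; the last query is the final
    fallback, so its results are returned whatever their count.
    Returns (index of the query that was used, its documents).
    """
    if not queries:
        raise ValueError("at least one query is required")
    if len(thresholds) != len(queries) - 1:
        raise ValueError("expected n queries and n-1 thresholds")

    last = len(queries) - 1
    docs: List[dict] = []
    for index, run in enumerate(queries):
        docs = run()
        # Stop at the first query whose result count exceeds its threshold,
        # or at the last query, which has no threshold.
        if index == last or len(docs) > thresholds[index]:
            return index, docs
    return last, docs  # not reached: the loop always returns on the last query
```

In a single-request setup, each callable would wrap a call to its own request handler with its own parameters; later queries are never executed once an earlier one clears its threshold.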

Any suggestions, thoughts, or pointers on where to begin looking would be
a great help!

Thanks,
Karthick

Re: Merging results from two queries

Posted by John Chee <Jo...@mylife.com>.
Karthick,

I mean performing query1 and query2 together, as a single query. Something
like

query1: (superpower:flight OR name:batman)
query2: (superpower:speed AND name:flash)

could be transformed to:

(superpower:flight^2 OR name:batman^2) OR (superpower:speed AND name:flash)

This would give you all results from query1 and query2, but results
matching query1 (or at least parts of it) would tend to score higher than
results matching only query2.

I don't know if it would be possible to do something like:

(query1)^N OR (query2)^(N-1) OR ... OR (queryN-1)^2 OR (queryN)

but that might be a general solution to your problem.
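
For what it's worth, building that combined string is simple on the client side. The sketch below is only string assembly in Python; `build_boosted_query` is a made-up helper name, and it boosts each parenthesized group as a whole rather than each term as in the two-query example above. Whether the resulting ranking behaves as hoped still needs testing against real data.

```python
from typing import Sequence


def build_boosted_query(queries: Sequence[str]) -> str:
    """Join queries with OR, boosting earlier queries more strongly.

    For N queries the first gets ^N, the second ^(N-1), and so on;
    the last query gets no explicit boost.
    """
    if not queries:
        raise ValueError("at least one query is required")

    n = len(queries)
    clauses = []
    for position, query in enumerate(queries):
        boost = n - position
        clause = f"({query})"
        if boost > 1:
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


combined = build_boosted_query([
    "superpower:flight OR name:batman",
    "superpower:speed AND name:flash",
])
print(combined)
```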

On Mon, Apr 2, 2012 at 6:51 PM, Karthick Duraisamy Soundararaj
<ka...@gmail.com> wrote:
> @Erick
> By threshold, all I mean is the count of the documents returned, and I am
> not going to play with the score. So if I have to commit my code to svn,
> what's the best way to go about it? I know I have to discuss my design
> here, which would take at least a couple of days. But are there special
> instructions that I need to follow to stay on a path from which I could
> commit my code?
>
> @John
> Yes, that's definitely a solution, but I don't want to make two different
> HTTP requests. I want to make one request and have everything I mentioned
> happen.

Re: Merging results from two queries

Posted by Karthick Duraisamy Soundararaj <ka...@gmail.com>.
@Erick
By threshold, all I mean is the count of the documents returned, and I am
not going to play with the score. So if I have to commit my code to svn,
what's the best way to go about it? I know I have to discuss my design
here, which would take at least a couple of days. But are there special
instructions that I need to follow to stay on a path from which I could
commit my code?

@John
Yes, that's definitely a solution, but I don't want to make two different
HTTP requests. I want to make one request and have everything I mentioned
happen.



On Mon, Apr 2, 2012 at 7:28 PM, Erick Erickson <er...@gmail.com> wrote:

> Part of it depends on what you mean by "threshold". If it's
> just the number of matches, then fine. But if you're talking about score
> here, be very, very careful. Scores are not an absolute measure
> of anything; they only tell you that "for _this_ query, the docs
> should be ordered this way".
>
> So I'd advise against any "query chain" based on scores
> as the threshold, if that's what you mean by "threshold".
>
> Best
> Erick

Re: Merging results from two queries

Posted by Erick Erickson <er...@gmail.com>.
Part of it depends on what you mean by "threshold". If it's
just the number of matches, then fine. But if you're talking about score
here, be very, very careful. Scores are not an absolute measure
of anything; they only tell you that "for _this_ query, the docs
should be ordered this way".

So I'd advise against any "query chain" based on scores
as the threshold, if that's what you mean by "threshold".
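
To make the count-based reading concrete: in Solr's JSON response format the total number of matches is reported as `numFound` inside `response`. A minimal Python sketch, assuming the response has already been parsed into a dict; the function names and the sample response are made up for illustration.

```python
from typing import Any, Mapping


def match_count(solr_response: Mapping[str, Any]) -> int:
    """Total number of matching documents, as reported by Solr."""
    return int(solr_response["response"]["numFound"])


def exceeds_threshold(solr_response: Mapping[str, Any], threshold: int) -> bool:
    """Count-based check only; document scores are deliberately ignored."""
    return match_count(solr_response) > threshold


# Illustrative response, not real data.
sample = {
    "response": {
        "numFound": 3,
        "start": 0,
        "docs": [{"id": "a"}, {"id": "b"}, {"id": "c"}],
    }
}
print(exceeds_threshold(sample, 2))
```

Note that `numFound` is the total match count, which can be larger than the number of documents actually returned in `docs` when `rows` limits the page size.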

Best
Erick


Re: Merging results from two queries

Posted by John Chee <ch...@gmail.com>.
Karthick,

The solution that I use for this problem is to perform query1 and
query2 and boost results matching query1. Solr then takes care of all
the deduplication (not necessarily merging) automatically. Would this
work for your situation?

I stole this idea from this slide deck:

"Make sure all relevant documents match... Make sure the best matching
documents score highest..." --
http://www.lucidimagination.com/files/relevancy-ranking-meetup-presentation-14-dec-10.pptx
(page 19)
