Posted to solr-user@lucene.apache.org by Dan Davis <da...@gmail.com> on 2015/01/09 19:35:28 UTC
Best way to implement Spotlight of certain results
I have a requirement to spotlight certain results if the query text exactly
matches the title or see reference (indexed by me as alttitle_t).
What that means is that these matching results are shown above the
top-10/20 list with different CSS and fields. It's like "I'm Feeling Lucky" on
Google :)
I have considered three ways of implementing this:
1. Assume that edismax qf/pf will boost these results to be first when
there is an exact match on these important fields. The downside is
that my relevancy tuning is constrained: I must keep title and alttitle_t
as the top search fields in my configuration (see XML snippet below), and I may
have to overweight them to meet the "always first" criterion. A lesser
downside is that every search must return the spotlight summary
field (for display) and the image to display. These could
be fetched from a database by id, but it is convenient to get them
from Solr.
2. Issue two searches for every user search, using a second set of
parameters for the spotlight (change the search type and fields so that it
only matches exactly against a dedicated string field, spottitle_s). The
spotlight search can then have its own configuration. The downside here is that
I am using Django and pysolr for the front-end, and pysolr is both
synchronous and tied, by convention, to the requestHandler named "select",
so the two searches would run one after the other.
Of course, running in parallel is not a fix-all: a search still takes
some time, even when run in parallel.
3. Automate the population of elevate.xml so that all 959 of these queries
are listed there. This is probably best, but forces me to restart/reload
whenever this component's configuration changes. The elevation can be done
through a query.
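For option 3, generating elevate.xml can be scripted. The following is a minimal sketch rather than a tested deployment: it assumes the spotlight mapping is available as a Python dict of query text to document ids (the `spotlights` data and the `build_elevate_xml` helper are hypothetical), and it writes the `<elevate>` / `<query text="...">` / `<doc id="..."/>` layout that the QueryElevationComponent reads.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(spotlights):
    """Return elevate.xml content for a {query text: [doc ids]} mapping."""
    root = ET.Element("elevate")
    # Sort the query texts so the generated file is stable between runs.
    for text in sorted(spotlights):
        query = ET.SubElement(root, "query", {"text": text})
        for doc_id in spotlights[text]:
            ET.SubElement(query, "doc", {"id": str(doc_id)})
    return ET.tostring(root, encoding="unicode")


# Hypothetical data: in practice this would be derived from the
# title / alttitle_t source for each spotlighted document.
spotlights = {"childhood cancer": ["doc-123"]}
elevate_xml = build_elevate_xml(spotlights)
```

The generated text would still have to be written where the component expects it and picked up by Solr, which is the restart/reload cost mentioned in option 3.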
What I'd love to do is configure the "select" requestHandler to run both
searches and return both sets of results. Is there any way to do that,
i.e. apply the same q= parameter to two configured ways of running a search?
Something like sub-queries?
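Short of a server-side answer, option 2 can be approximated on the client. The sketch below shows one way to issue both searches concurrently from Python. It is an illustration under assumptions, not pysolr's API: `run_search` stands in for whatever performs the HTTP request (it is hypothetical and injected so the example runs without a Solr server), and both parameter sets, including the quoted match against `spottitle_s`, are guesses at what the two configurations might look like.

```python
from concurrent.futures import ThreadPoolExecutor


def build_param_sets(user_query):
    """Build the parameters for the main search and the spotlight search."""
    main = {"q": user_query, "defType": "edismax", "rows": 10}
    # Quoted match against the string field; backslashes and double
    # quotes in the user's text are escaped first.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    spotlight = {"q": 'spottitle_s:"%s"' % escaped, "rows": 1}
    return main, spotlight


def search_both(user_query, run_search):
    """Run both searches concurrently and return (main, spotlight) results."""
    main, spotlight = build_param_sets(user_query)
    with ThreadPoolExecutor(max_workers=2) as pool:
        main_future = pool.submit(run_search, main)
        spot_future = pool.submit(run_search, spotlight)
        return main_future.result(), spot_future.result()


def fake_search(params):
    """Stand-in for a real HTTP call to Solr; echoes the query it was given."""
    return {"q": params["q"]}


main_result, spot_result = search_both("childhood cancer", fake_search)
```

With a real `run_search`, the total latency would be roughly that of the slower of the two requests rather than their sum, which is the most that running in parallel can buy.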
I suspect that approach 1 will get me through my demo and a brief
evaluation period, but that either approach 2 or 3 will be the winner.
Here's a snippet from my current qf/pf configuration:
<str name="qf">
title^100
alttitle_t^100
...
text
</str>
<str name="pf">
title^1000
alttitle_t^1000
...
text^10
</str>
Thanks,
Dan Davis
Re: Best way to implement Spotlight of certain results
Posted by Dan Davis <da...@gmail.com>.
Maybe I can use grouping, but my understanding of the feature is not up to
figuring that out :)
I tried something like
http://localhost:8983/solr/collection/select?q=childhood+cancer&group=on&group.query=childhood+cancer
Because group.limit defaults to 1, I get a single result and no others.
If I add group.field=title, then I get every result, each in a group of one
member...
Erick's re-ranking I do understand: I can re-rank the top N to make sure
the spotlighted result is always first, avoiding the potential problem of
having to overweight the title field. In practice, I may never need
to use the re-ranking, but it's there if I need it. This is enough,
because it gives me talking points.
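To make the re-ranking idea concrete, here is a hedged sketch of the request parameters, built in Python. The `rq` / `{!rerank ...}` syntax is the one described on the Query Re-Ranking page; the particular values (`reRankDocs=100`, `reRankWeight=1000`), the fields used in the re-rank query, and the `build_rerank_params` helper are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_rerank_params(user_query, rerank_docs=100, rerank_weight=1000):
    """Build edismax parameters plus a re-rank query on the title fields."""
    # Escape backslashes and double quotes before wrapping in a phrase.
    escaped = user_query.replace("\\", "\\\\").replace('"', '\\"')
    phrase = '"%s"' % escaped
    return {
        "q": user_query,
        "defType": "edismax",
        # Re-score the top rerank_docs hits using the query held in $rqq.
        "rq": "{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%d}"
              % (rerank_docs, rerank_weight),
        "rqq": "title:%s OR alttitle_t:%s" % (phrase, phrase),
    }


params = build_rerank_params("childhood cancer")
query_string = urlencode(params)
```

Note that a phrase match on title is looser than a whole-field exact match; a dedicated string field such as spottitle_s would be the stricter choice for the re-rank query.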
Re: Best way to implement Spotlight of certain results
Posted by "Michał B. ." <m....@gmail.com>.
Maybe I misunderstand you, but I think you could use grouping to
achieve this effect. If you prepare two group queries, one with the exact
match and the other, let's say, the default, then you will be able to extract
the matches from the grouping results, e.g. (using the default Solr example collection):
http://localhost:8983/solr/collection1/select?q=*:*&group=true&group.query=manu%3A%22Ap+Computer+Inc.%22&group.query=name:Apple%2060%20GB%20iPod%20with%20Video%20Playback%20Black&group.limit=10
This query will return two groups: one with the exact match, the second with
the rest, the standard results.
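A sketch of how a client might pull the two groups apart. It assumes the JSON response shape for `group.query`, in which each query string becomes a key under `grouped` holding a `doclist`; the sample response below is hand-written for illustration, not output from a real server, and `split_groups` is a hypothetical helper.

```python
def split_groups(response, spotlight_query, main_query):
    """Return (spotlight docs, main docs) from a grouped Solr response."""
    grouped = response.get("grouped", {})

    def docs_for(query):
        # A missing group yields an empty list rather than an error.
        return grouped.get(query, {}).get("doclist", {}).get("docs", [])

    return docs_for(spotlight_query), docs_for(main_query)


# Hand-written sample mimicking the assumed response structure.
sample_response = {
    "grouped": {
        'title:"childhood cancer"': {
            "matches": 2,
            "doclist": {"numFound": 1, "start": 0, "docs": [{"id": "doc-1"}]},
        },
        "childhood cancer": {
            "matches": 2,
            "doclist": {"numFound": 2, "start": 0,
                        "docs": [{"id": "doc-1"}, {"id": "doc-2"}]},
        },
    }
}

spotlight_docs, main_docs = split_groups(
    sample_response, 'title:"childhood cancer"', "childhood cancer")
```

Since the spotlighted document can also appear in the default group, the display code may want to drop it from the main list by id.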
Regards,
Michal
--
Michał Bieńkowski
Re: Best way to implement Spotlight of certain results
Posted by Erick Erickson <er...@gmail.com>.
Hmm, I wonder if the RerankingQueryParser might help here?
See: https://cwiki.apache.org/confluence/display/solr/Query+Re-Ranking
Best,
Erick