Posted to solr-user@lucene.apache.org by Dan Davis <da...@gmail.com> on 2015/01/09 19:35:28 UTC

Best way to implement Spotlight of certain results

I have a requirement to spotlight certain results when the query text exactly
matches the title or a "see" reference (which I index as alttitle_t).
That means the matching results are shown above the top-10/20 list, with
different CSS and different fields. It's like "I'm Feeling Lucky" on
Google :)

I have considered three ways of implementing this:

   1. Assume that edismax qf/pf will boost these results to first place when
   there is an exact match on these important fields. The downside is that
   my relevancy tuning is constrained: I must keep title and alttitle_t as
   the top search fields in my configuration (see the XML snippet below),
   and I may have to overweight them to meet the "always first" criterion.
   A lesser downside is that every search must return the spotlight summary
   field and the image (for display). These could be fetched from a
   database by id, but it is convenient to get them from Solr.
   2. Issue two searches for every user search, the second with its own set
   of parameters (a different search type, matching only exactly against a
   specific string field, spottitle_s). The spotlight search can then have
   its own configuration. The downside is that I am using Django and pysolr
   for the front-end, and pysolr is both synchronous and tied, by
   convention, to the requestHandler named "select". Of course, running the
   searches in parallel is not a cure-all either: a search takes some time
   even when run in parallel.
   3. Automate the population of elevate.xml so that all 959 of these
   queries are in it. This is probably best, but it forces me to
   restart/reload whenever this component's configuration changes. The
   elevation can be done through a query.
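
As a rough illustration of approach 3, elevate.xml could be generated from a
mapping of query text to document ids. This is only a sketch: the mapping,
the ids, and the function name build_elevate_xml are made up for the
example, and it assumes the usual elevate/query/doc layout of the file.

```python
import xml.etree.ElementTree as ET


def build_elevate_xml(spotlights):
    """Build elevate.xml content from a {query text: [doc ids]} mapping."""
    root = ET.Element("elevate")
    # Sort so that regenerating the file gives a stable, diff-friendly order.
    for query_text, doc_ids in sorted(spotlights.items()):
        query = ET.SubElement(root, "query", {"text": query_text})
        for doc_id in doc_ids:
            ET.SubElement(query, "doc", {"id": doc_id})
    return ET.tostring(root, encoding="unicode")


# Hypothetical data; in practice it would be derived from the same source
# that feeds title and alttitle_t.
spotlights = {"childhood cancer": ["doc-123"]}
elevate_xml = build_elevate_xml(spotlights)
```

The generated text would still have to be written into the core's
configuration and the core reloaded, which is the drawback noted in item 3.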

What I'd love to do is to configure the "select" requestHandler to run both
searches and return both sets of results to me. Is there any way to do that,
i.e. to apply the same q= parameter to two configured ways of running a
search? Something like sub-queries?
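
Short of a single handler doing both, the client-side version of approach 2
might look roughly like the sketch below. It assumes spottitle_s holds the
exact spotlight titles and uses Solr's term query parser for the exact
match. run_search is a hypothetical stand-in for whatever call actually
talks to Solr (for example, a thin wrapper around the client library), and
the rows values are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def spotlight_params(user_query):
    # Exact match of the whole query text against the string field.
    return {"q": "{!term f=spottitle_s}" + user_query, "rows": 1}


def main_params(user_query):
    # The regular relevancy-ranked search.
    return {"q": user_query, "defType": "edismax", "rows": 10}


def search_both(user_query, run_search):
    """Run the spotlight and main searches concurrently.

    run_search is any callable that takes a params dict and returns results.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        spot = pool.submit(run_search, spotlight_params(user_query))
        main = pool.submit(run_search, main_params(user_query))
        return spot.result(), main.result()
```

Threads only hide the latency of the second request; the page still waits
for the slower of the two searches.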

I suspect that approach 1 will get me through my demo and a brief
evaluation period, but that either approach 2 or 3 will be the winner.

Here's a snippet from my current qf/pf configuration:
      <str name="qf">
        title^100
        alttitle_t^100
        ...
        text
      </str>
      <str name="pf">
        title^1000
        alttitle_t^1000
        ...
        text^10
      </str>

Thanks,

Dan Davis

Re: Best way to implement Spotlight of certain results

Posted by Dan Davis <da...@gmail.com>.
Maybe I can use grouping, but my understanding of the feature is not up to
figuring that out :)

I tried something like

http://localhost:8983/solr/collection/select?q=childhood+cancer&group=on&group.query=childhood+cancer
Because group.limit defaults to 1, I get a single result and no others.
If I add group.field=title, then I get every result, each in a group of one
member...

Erick's re-ranking I do understand: I can re-rank the top N to make sure
the spotlighted result is always first, which avoids the potential problem
of having to overweight the title field. In practice, I may never need
the re-ranking, but it's there if I need it. This is enough, because it
gives me talking points.
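
For reference, the request parameters for such a re-rank could be put
together roughly as follows. This is a sketch under assumptions: it re-ranks
on an exact match in the spottitle_s string field mentioned earlier, and the
reRankDocs and reRankWeight values are arbitrary placeholders, not tuned
numbers.

```python
from urllib.parse import urlencode


def rerank_params(user_query, rerank_docs=100, weight=1000):
    """Build parameters that re-rank the top N docs by an exact title match."""
    # Escape backslashes and quotes so the text can sit inside double quotes.
    phrase = user_query.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "q": user_query,
        "defType": "edismax",
        # The re-rank query is passed by reference through the rqq parameter.
        "rq": "{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%d}"
        % (rerank_docs, weight),
        "rqq": 'spottitle_s:"%s"' % phrase,
    }


query_string = urlencode(rerank_params("childhood cancer"))
```

Only the top reRankDocs documents of the main query are re-scored, so the
spotlighted document has to make it into that window in the first place.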



Re: Best way to implement Spotlight of certain results

Posted by "Michał B. ." <m....@gmail.com>.
Maybe I have misunderstood you, but I think you could use grouping to
achieve this effect. If you prepare two group queries, one for the exact
match and the other, let's say, the default, then you will be able to
extract the matches from the grouping results, e.g. (using the default Solr
example collection):

http://localhost:8983/solr/collection1/select?q=*:*&group=true&group.query=manu%3A%22Ap+Computer+Inc.%22&group.query=name:Apple%2060%20GB%20iPod%20with%20Video%20Playback%20Black&group.limit=10

This query will return two groups: one with the exact match, and a second
with the rest of the standard results.
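
On the client side, pulling the two groups apart could look something like
this sketch. It assumes a JSON response in which each group.query result is
keyed by its query string under "grouped" and carries a "doclist"; the
sample response and the ids in it are invented for illustration.

```python
def split_grouped(response, exact_query, default_query):
    """Return (spotlight docs, regular docs) from a group.query response."""
    grouped = response.get("grouped", {})

    def docs(key):
        # A missing group simply yields an empty list.
        return grouped.get(key, {}).get("doclist", {}).get("docs", [])

    return docs(exact_query), docs(default_query)


# Hypothetical response, trimmed to the parts the function reads.
sample = {
    "grouped": {
        'spottitle_s:"childhood cancer"': {
            "matches": 2,
            "doclist": {"numFound": 1, "start": 0, "docs": [{"id": "a"}]},
        },
        "childhood cancer": {
            "matches": 2,
            "doclist": {
                "numFound": 2,
                "start": 0,
                "docs": [{"id": "a"}, {"id": "b"}],
            },
        },
    }
}
spotlight, regular = split_grouped(
    sample, 'spottitle_s:"childhood cancer"', "childhood cancer"
)
```

A spotlighted document can appear in both groups, so the display code may
want to drop it from the regular list.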

Regards,
Michal


-- 
Michał Bieńkowski

Re: Best way to implement Spotlight of certain results

Posted by Erick Erickson <er...@gmail.com>.
Hmm, I wonder if the RerankingQueryParser might help here?
See: https://cwiki.apache.org/confluence/display/solr/Query+Re-Ranking

Best,
Erick
