Posted to solr-user@lucene.apache.org by John Bickerstaff <jo...@johnbickerstaff.com> on 2016/04/14 20:34:33 UTC

Referencing incoming search terms in searchHandler XML

I have the following (essentially hard-coded) line in the Solr Admin Query
UI

=====
bq: contentType:(searchTerm1 searchTerm2 searchTerm3)^1000
=====

The "searchTerm" entries represent whatever the user typed into the search
box.  This can be one or more words.  Usually less than 5.

I want to put the search parameters I've built in the Admin UI into a
requestHandler.

I think that means I need a line like this in the searchHandler in
solrconfig.xml

=====
<str name="bq">contentType:(magic_reference_to_incoming_search)^1000</str>
=====

Am I oversimplifying?

How can I accurately reference the incoming search terms as a "variable" or
parameter in the requestHandler XML?

Is it as simple as $q?  Something more complex?

Is there any choice besides the somewhat arcane local params?  If not, what
is the simplest, most straightforward way to reference incoming query terms
using local params?

Thanks...

Re: Referencing incoming search terms in searchHandler XML

Posted by Walter Underwood <wu...@wunderwood.org>.
> On Apr 14, 2016, at 12:18 PM, John Bickerstaff <jo...@johnbickerstaff.com> wrote:
> 
> If a user types in "foobarbaz figo" I want all documents with "figo" in the
> contentType field boosted above every other document in the results.


This is a very common requirement that seems like a good idea, but has very bad corner cases. I always take this back to the customer and convert it to something that works for all queries.

Think about this query:

   vitamin a figo

Now, every document with the word “a” is ranked in front of documents with “vitamin a”. That is probably not what the customer wanted.

Instead, have a requirement that when two documents are equal matches for the query, the “figo” document is first.

Or, create an SRP (search results page) with two sections: five figo matches with a “More …” link, then five general matches. But you might want to avoid dupes between the two.

If your customer absolutely insists on having every single figo doc above non-figo docs, well, they deserve what they get.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


Re: Referencing incoming search terms in searchHandler XML

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
Thanks - so this:

bq=contentType:(original query text here)^1000

is exactly what I want to do to every incoming query via an entry in a
custom requestHandler.  Thus my question about how to reference the
original query text in the requestHandler xml...

I believe that if I want to do that, I'm going to have to use the
local params syntax, yes?

Ideally, it would be as simple as:

<str name="bq">contentType:($q)^1000</str>

... and the requestHandler would recognize $q as a magic variable that
holds the current search text (what came in on q on the URL)



But I'm guessing I can't access the query itself in the requestHandler
without being inside the braces of a local params block, like this:

&bq={! ..... .....  .....  v=$q}^1000
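For reference, local params can dereference another request parameter with the v=$param form, so something along these lines should be possible in the handler defaults. This is an untested sketch, not a recommendation: it assumes the "field" query parser is acceptable for contentType, and that parser analyzes the whole of q against that one field, so a multi-word query is unlikely to match a single-token value like "figo". The boost weight is left out on purpose, since its placement in this form is not settled anywhere in this thread.

```xml
<lst name="defaults">
  <str name="defType">edismax</str>
  <!-- v=$q pulls in whatever arrived on the q request parameter. -->
  <!-- Assumption: the "field" parser fits the contentType field type. -->
  <str name="bq">{!field f=contentType v=$q}</str>
</lst>
```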

However - as you pointed out, there are other alternatives...

I believe I've found an easier way - at least for this case - which is:

&qf=text contentType^1000

I think this ensures that the "standard" search on the catchall field of
"text" will happen with no boosting and an additional search on contentType
will occur with the boost listed.
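
A minimal sketch of how that qf setting could live in solrconfig.xml so that it applies to every incoming query. The handler name "/figosearch" is a made-up placeholder; the field names "text" and "contentType" are the ones used in this thread.

```xml
<!-- Hypothetical handler name; adjust to your own setup. -->
<requestHandler name="/figosearch" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the extended dismax parser so that qf is honored. -->
    <str name="defType">edismax</str>
    <!-- Search the catch-all field unboosted, plus contentType with a large boost. -->
    <str name="qf">text contentType^1000</str>
  </lst>
</requestHandler>
```

Because these sit under "defaults", a request can still override qf on the URL; placing them in an "invariants" list instead would prevent that.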

Also - thanks for the caveat on the boosting function - I'll set
expectations with my users.  I can't limit my search to only docs with
contentType = figo (if I could, I could guarantee that they show up), so
the boost seems the best compromise - unless I want to parse search
results in code, which I'd rather avoid whenever possible.

On Thu, Apr 14, 2016 at 1:41 PM, Erick Erickson <er...@gmail.com>
wrote:

> Right, edismax is where I'd start. NOTE: there are about a zillion
> options here so you may find yourself lost in a bit of a maze for
> a while, but it's usually faster than coding it yourself ;).
>
> In this case, take a look at the "bq" parameter to edismax and
> make it something like bq=contentType:(original query text here)^1000
>
> In short, it's likely that someone has had this problem before and
> there's a solution, said solution may not be easy to find though ;(
>
> And also note that boosting is not definitive. By that I mean that
> boosting just influences the score; it does _not_ explicitly order the
> results. So the docs with "figo" in the contentType field will tend to
> the top, but won't be absolutely guaranteed to be there.
>
>
>
> Best,
> Erick

Re: Referencing incoming search terms in searchHandler XML

Posted by Erick Erickson <er...@gmail.com>.
Right, edismax is where I'd start. NOTE: there are about a zillion
options here so you may find yourself lost in a bit of a maze for
a while, but it's usually faster than coding it yourself ;).

In this case, take a look at the "bq" parameter to edismax and
make it something like bq=contentType:(original query text here)^1000

In short, it's likely that someone has had this problem before and
there's a solution, said solution may not be easy to find though ;(

And also note that boosting is not definitive. By that I mean that
boosting just influences the score; it does _not_ explicitly order the
results. So the docs with "figo" in the contentType field will tend to
the top, but won't be absolutely guaranteed to be there.



Best,
Erick

On Thu, Apr 14, 2016 at 12:18 PM, John Bickerstaff
<jo...@johnbickerstaff.com> wrote:
> OK - that's interesting.  Perhaps I'm thinking too much like a developer
> and just want to be able to reach into context and grab anything any time I
> want...  Thanks for the input...
>
> =====
>
> To clarify, I want to boost the document's score if the user enters a term
> found in the contentType field.
>
> As an example, the term "figo" is one of a few that are stored in the
> contentType field.  It's not a multivalued field - one entry per document.
>
> If a user types in "foobarbaz figo" I want all documents with "figo" in the
> contentType field boosted above every other document in the results.  The
> order of docs can be determined by the other scores - my user's rule is
> simply that any with "figo" in contentType should appear above any which
> do NOT have "figo" in that field.
>
> I can't know when the users will type any of the "magic" contentType terms
> into the search, so I think I have to run the search every time against the
> contentType field.
>
> So - that's my underlying use case - and as I say, I'm beginning to think
> the edismax setting of qf= text contentType^1000 answers my need really
> well -- and is easier.  A quick test looks like I'm getting the results I
> expect...

Re: Referencing incoming search terms in searchHandler XML

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
OK - that's interesting.  Perhaps I'm thinking too much like a developer
and just want to be able to reach into context and grab anything any time I
want...  Thanks for the input...

=====

To clarify, I want to boost the document's score if the user enters a term
found in the contentType field.

As an example, the term "figo" is one of a few that are stored in the
contentType field.  It's not a multivalued field - one entry per document.

If a user types in "foobarbaz figo" I want all documents with "figo" in the
contentType field boosted above every other document in the results.  The
order of docs can be determined by the other scores - my user's rule is
simply that any with "figo" in contentType should appear above any which
do NOT have "figo" in that field.

I can't know when the users will type any of the "magic" contentType terms
into the search, so I think I have to run the search every time against the
contentType field.

So - that's my underlying use case - and as I say, I'm beginning to think
the edismax setting of qf= text contentType^1000 answers my need really
well -- and is easier.  A quick test looks like I'm getting the results I
expect...





On Thu, Apr 14, 2016 at 1:02 PM, Erick Erickson <er...@gmail.com>
wrote:

> You really don't do that in solrconfig.xml.
>
> This seems like an XY problem. You're trying to solve some
> particular use case, and accessing the terms in solrconfig.xml
> is just the means you've settled on. You've already found the
> ability to configure edismax as your defType and apply boosts
> to particular fields...
>
> Best,
> Erick

Re: Referencing incoming search terms in searchHandler XML

Posted by Erick Erickson <er...@gmail.com>.
You really don't do that in solrconfig.xml.

This seems like an XY problem. You're trying to solve some
particular use case, and accessing the terms in solrconfig.xml
is just the means you've settled on. You've already found the
ability to configure edismax as your defType and apply boosts
to particular fields...

Best,
Erick

On Thu, Apr 14, 2016 at 11:53 AM, John Bickerstaff
<jo...@johnbickerstaff.com> wrote:
> Maybe I'm overdoing it...
>
> It seems to me that qf= text contentType^1000 would do this for me more
> easily - as it appears to assume the incoming search terms...
>
> However, I'd still like to know the simplest way to reference the search
> terms in the XML - or possibly get a URL that points the way.
>
> Thanks.

Re: Referencing incoming search terms in searchHandler XML

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
Maybe I'm overdoing it...

It seems to me that qf= text contentType^1000 would do this for me more
easily - as it appears to assume the incoming search terms...

However, I'd still like to know the simplest way to reference the search
terms in the XML - or possibly get a URL that points the way.

Thanks.

On Thu, Apr 14, 2016 at 12:34 PM, John Bickerstaff <john@johnbickerstaff.com> wrote:

> I have the following (essentially hard-coded) line in the Solr Admin Query
> UI
>
> =====
> bq: contentType:(searchTerm1 searchTerm2 searchTerm3)^1000
> =====
>
> The "searchTerm" entries represent whatever the user typed into the search
> box.  This can be one or more words.  Usually less than 5.
>
> I want to put the search parameters I've built in the Admin UI into a
> requestHandler.
>
> I think that means I need a line like this in the searchHandler in
> solrconfig.xml
>
> =====
> <str name="bq">contentType:(magic_reference_to_incoming_search)^1000</str>
> =====
>
> Am I oversimplifying?
>
> How can I accurately reference the incoming search terms as a "variable"
> or parameter in the requestHandler XML?
>
> Is it as simple as $q?  Something more complex?
>
> Is there any choice besides the somewhat arcane local params?  If not,
> what is the simplest, most straightforward way to reference incoming query
> terms using local params?
>
> Thanks...
>