Posted to solr-user@lucene.apache.org by Anil Cherian <ch...@gmail.com> on 2009/11/17 20:29:51 UTC

Fwd: solr index-time boost... help required please

Hi,
I am sending this mail again now that I have joined the solr-user group. Kindly
find time to help.
Thanks and Rgds,
Anil
---------- Forwarded message ----------
From: Anil Cherian <ch...@gmail.com>
Date: Fri, Nov 13, 2009 at 3:48 PM
Subject: solr index-time boost... help required please
To: solr-user@lucene.apache.org, solr-dev@lucene.apache.org


Hi,

I have been trying out some boost functionality in Solr. I learned how to do
query-time boosting on a date, i.e. I wanted the records with the latest
approval_dt to come first in the results.
For that I copied the dismax handler and created a new version called
dismaxboost in solrconfig.xml:

 <requestHandler name="dismaxboost" class="solr.SearchHandler" >
    <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="echoParams">explicit</str>
     <float name="tie">0.01</float>
     <str name="qf">
        text^0.5
     </str>
     <str name="pf">
        text^0.2
     </str>
     <str name="bf">
         recip(rord(approval_dt),1,1000,1000)^10.3 ord(popularity)^0.5
         recip(rord(price),1,1000,1000)^0.3
     </str>
     <str name="fl">*</str>
     <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
     </str>
     <int name="ps">100</int>
     <str name="q.alt">*:*</str>
     <!-- example highlighter config, enable per-query with hl=true -->
     <str name="hl.fl">text features name</str>
     <!-- for this field, we want no fragmenting, just highlighting -->
     <str name="f.name.hl.fragsize">0</str>
     <!-- instructs Solr to return the field itself if no query terms are
          found -->
     <str name="f.name.hl.alternateField">name</str>
     <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
    </lst>
  </requestHandler>
Then I used the following query in Solr:
select?qt=dismaxboost&q=horticulture&version=2.2&start=0&rows=10&indent=on&bf=recip(rord(approval_dt),1,1000,1000)^110.0&fl=approval_dt
I got the results with the latest approval_dt dates first. I believe this was
query-time boosting.

Now I am trying *index-time* boosting to improve response time. So I created
an algorithm that does the following:
1. Sort the records I get from the database on approval_dt ascending, and
increase the boost value of the <field> element for approval_dt by 0.1 each
time I encounter a higher approval_dt. If a record has no approval_dt, it gets
no boost value. I set omitNorms="false" in schema.xml for the approval_dt
field. Now when I apply the same query nothing special happens, i.e. I don't
even see the latest dates first.
I have some doubts:
1. Do we always need to use the dismax query handler for boosting?
2. If we boost a doc or field in the XML, do we still need the bf
parameter with a function at query time to put the index-time boost into
effect?
3. Can you also frame a query for me that shows the latest approval_dt
first using the index-time boost approach?
4. Does the bf function in solrconfig.xml play any role when we plan to use
index-time boost? My understanding is that bf is used only for query-time boost.
5. Is it necessary to use bq in the case of index-time boost?

Could you please find some time to reply at your earliest convenience, as I
have been stuck on the index-time feature for some time and badly need to proceed.
Your help is much appreciated.

Re: Fwd: solr index-time boost... help required please

Posted by Chris Hostetter <ho...@fucit.org>.
: Now I am trying *index-time* boosting to improve response time. So i created
: an algorithm where I do the following:-
: 1. sort the records i get from database on approval_dt asc and increase the
: boost value of the <field> element for approval_dt by 0.1 as i encounter
: higher approval_dt records. If there is no approval_dt for a record, no
: boost value for it. I made omitNorms=false in schema.xml for approval_dt
: field. Now when I apply the same query nothing special happens ie I don't
: even see the latest dates first.

index time boosting of a field just affects the fieldNorms for the specific 
field you apply the boost to -- if you don't search on that field (with a 
score based query type), the boost doesn't affect things.  so if you 
applied an index time boost to some field named "approval_dt" then that 
boost isn't going to matter unless you query against the approval_dt field 
-- but if you use something like a range query, the boost still won't 
matter because range queries don't affect the score.
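
for reference, a field-level boost in an XML update message looks roughly 
like the sketch below.  the "id" field and all of the values are made up 
for illustration; only approval_dt comes from the setup described above.  
the boost is folded into the norm of approval_dt alone, which is why it 
can only influence scored queries against that one field.

```xml
<add>
  <doc>
    <!-- hypothetical unique key field and value, for illustration only -->
    <field name="id">doc-1</field>
    <!-- field-level boost: folded into the norm of approval_dt only -->
    <field name="approval_dt" boost="1.5">2009-11-13T00:00:00Z</field>
  </doc>
</add>
```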

more than likely what you want to do is use a *document* boost instead of 
a field boost .. that way the boost factor gets applied to every field you 
have that includes norms, so no matter what field you query on the 
boost will get applied.
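
a minimal sketch of a document-level boost, assuming the same XML update 
format; the "id" field, the field values and the 1.5 factor are 
placeholders, and "text" is the field your dismaxboost handler lists in 
qf.  the boost attribute moves from the <field> element up to the <doc> 
element.

```xml
<add>
  <!-- document-level boost: applied to every field of this doc that keeps norms -->
  <doc boost="1.5">
    <!-- hypothetical unique key field and value, for illustration only -->
    <field name="id">doc-1</field>
    <field name="approval_dt">2009-11-13T00:00:00Z</field>
    <!-- placeholder content; a scored query on "text" would see the boost -->
    <field name="text">horticulture approval record</field>
  </doc>
</add>
```

fields declared with omitNorms="true" have no norm to carry the boost, so 
they would not be affected by it.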

: 2. If we boost a doc or field in the xml should we again use the bf
: parameter with a function to put the boost into effect while querying when
: trying index-time boost also?

index time boosts and query boosts are completely orthogonal, you can use 
both together, but they don't require (or know about) each other at all

: 3. Also can you frame a query for me to see the latest approval_dt coming
: first using the index-time boost approach.

not with the setup you've described ... date based queries really won't 
ever look at the norms for the date field (unless you did a term query for 
a very specific date value)

: 4. Does bf function play any role in solrconfig.xml when we plan to use
: index-time boost. My understanding is bf is used only for query-time boost.

you are correct.

: 5. Is it necessary to use bq in case of index time boost.

same answer as #2.


-Hoss