Posted to solr-user@lucene.apache.org by Solr User <so...@gmail.com> on 2010/11/15 18:46:48 UTC

Dismax - Boosting

Hi,

Currently we are using StandardRequestHandler and the configuration in
SolrConfig.xml is as below:

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <!-- default values for query parameters -->
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <!--
       <int name="rows">10</int>
       <str name="fl">*</str>
       <str name="version">2.1</str>
        -->
     </lst>
  </requestHandler>


We would like to switch to DisMax request handler and the configuration in
SolrConfig.xml is:

  <requestHandler name="dismax" class="solr.SearchHandler" >
    <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="echoParams">explicit</str>
     <float name="tie">0.01</float>
     <str name="qf">
        text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
     </str>
     <str name="pf">
        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
     </str>
     <str name="bf">
        popularity^0.5 recip(price,1,1000,1000)^0.3
     </str>
     <str name="fl">
        id,name,price,score
     </str>
     <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
     </str>
     <int name="ps">100</int>
     <str name="q.alt">*:*</str>
     <!-- example highlighter config, enable per-query with hl=true -->
     <str name="hl.fl">text features name</str>
     <!-- for this field, we want no fragmenting, just highlighting -->
     <str name="f.name.hl.fragsize">0</str>
     <!-- instructs Solr to return the field itself if no query terms are
          found -->
     <str name="f.name.hl.alternateField">name</str>
     <str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
    </lst>
  </requestHandler>

Questions:

1. Do we need to change the above DisMax handler configuration to fit our
requirements, or can we leave it as it is? If it needs changes, which ones?
2. Do we need to make DisMax the default request handler? Do I need to add
the attribute default="true" to the tag?
3. I read in the documentation that the default search handler and DisMax are
the same, except that to use DisMaxQueryParser you add defType=dismax to the
query string. Is there anything else we need to do?

We are basically moving to the dismax handler and trying to understand what
changes we need to make to SolrConfig.xml. The changes needed in schema.xml
I already understood from a different thread on this forum.

Thanks,
Solr User

Re: Dismax - Boosting

Posted by Ahmet Arslan <io...@yahoo.com>.
> In the past we used /spell, and if there was no match we used to get a
> list of suggestions and then make another call with the first suggestion
> to get search results. After that we showed the user both the suggestions
> for the spelling mistake and the results of the first suggestion.
>
> I think the plugin at the URL that you provided will help with that.

Yes, it does exactly what you describe.

> Is there a way to get the spelling suggestions as well as the data for the
> first suggestion directly from Solr at the same time?

You can't do that in one step with out-of-the-box solr. You need a plugin for that.
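
For reference, a minimal sketch of a dismax handler that returns spelling suggestions in the same response as the results for the query as typed. It assumes a search component named "spellcheck" is defined as shown elsewhere in this thread; the shortened qf list and the spellcheck.count value are illustrative only. Results for the first suggestion are not included, so the client (or a plugin) still has to issue the second request.

```xml
<requestHandler name="dismax" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- shortened, illustrative field list -->
    <str name="qf">title^9.0 subtitle^3.0 author^2.0</str>
    <!-- run the spell checker on every request to this handler -->
    <str name="spellcheck">true</str>
    <!-- illustrative value: up to 10 suggestions per misspelled term -->
    <str name="spellcheck.count">10</str>
    <!-- also return the query rewritten with the top suggestions -->
    <str name="spellcheck.collate">true</str>
  </lst>
  <!-- the spellcheck component runs after the normal search components -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With a setup like this, the collated query in the spellcheck section of the response can be used as the q value of the follow-up request.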



      

Re: Dismax - Boosting

Posted by Solr User <so...@gmail.com>.
Hi Ahmet,

In the past we used /spell, and if there was no match we used to get a
list of suggestions and then make another call with the first suggestion
to get search results. After that we showed the user both the suggestions
for the spelling mistake and the results of the first suggestion.

I think the plugin at the URL that you provided will help with that.

Is there a way to get the spelling suggestions as well as the data for the
first suggestion directly from Solr at the same time?

For example:

If the search keyword is mooon (typed by mistake instead of moon),

then we need all the suggestions, like:

Did you mean:  moon, mooooo, mooing, moonen, soon, mood, moose, moore,
spoon, moons?

and also the search results for the first suggestion, moon.

Thanks,
Solr User


Re: Dismax - Boosting

Posted by Ahmet Arslan <io...@yahoo.com>.
> Below is my previous configuration, which used to work correctly.
>
> <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
>  <str name="queryAnalyzerFieldType">textSpell</str>
>  <lst name="spellchecker">
>   <str name="name">default</str>
>   <str name="field">searchFields</str>
>   <str name="spellcheckIndexDir">/solr/qa/tradedata/spellchecker</str>
>   <str name="buildOnCommit">true</str>
>  </lst>
> </searchComponent>
>
> We used to search only in one field, "searchFields", but with dismax we
> are searching in different fields like
>
> title^9.0 subtitle^3.0 author^2.0 desc shortdesc imprint category isbn13
> isbn10 format series season bisacsub award.
>
> Do we need to modify the above configuration to include all the above
> fields? Please give me an example.

Searching and spell checking are independent. For example, you can search on 10 fields and create suggestions from 2 fields. The spell checker accepts one field in its configuration, so you need to populate that field with copyField from the fields you want spell checking to draw on. The type of this field should be textSpell in your case. You can keep the above config.
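
A minimal schema.xml sketch of that idea, assuming the spell-check field keeps the name searchFields from the config above and that a textSpell field type is already defined. The stored and multiValued settings and the choice of three source fields are assumptions for illustration, not a recommendation.

```xml
<!-- single field that feeds the spell checker; multiValued because
     several source fields are copied into it (assumed settings) -->
<field name="searchFields" type="textSpell" indexed="true" stored="false"
       multiValued="true"/>

<!-- copy only the fields whose terms should be offered as suggestions -->
<copyField source="title" dest="searchFields"/>
<copyField source="subtitle" dest="searchFields"/>
<copyField source="author" dest="searchFields"/>
```

The dismax qf list stays as it is. Only the contents of this one field decide which words can come back as suggestions, and the documents have to be reindexed before the copies take effect.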

> 
> In the past we used to query twice: first to get the suggestions, and
> then again using the first suggestion to show the data.
>
> Is there a way that we can do it in one step?

Are you talking about queries that return 0 numFound, and re-executing the search as described here? http://sematext.com/products/dym-researcher/index.html

Not out-of-the-box.


      

Re: Dismax - Boosting

Posted by Solr User <so...@gmail.com>.
Hi Ahmet,

Below is my previous configuration, which used to work correctly.

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
 <str name="queryAnalyzerFieldType">textSpell</str>
 <lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">searchFields</str>
  <str name="spellcheckIndexDir">/solr/qa/tradedata/spellchecker</str>
  <str name="buildOnCommit">true</str>
 </lst>
</searchComponent>

We used to search only in one field, "searchFields", but with dismax we
are searching in different fields like

title^9.0 subtitle^3.0 author^2.0 desc shortdesc imprint category isbn13
isbn10 format series season bisacsub award.

Do we need to modify the above configuration to include all the above
fields? Please give me an example.

In the past we used to query twice: first to get the suggestions, and then
again using the first suggestion to show the data.

Is there a way that we can do it in one step?

Thanks,

Murali




On Wed, Nov 17, 2010 at 7:00 PM, Ahmet Arslan <io...@yahoo.com> wrote:

>
> > 2. How to use spell checker request handler along with
> > dismax?
>
> Just append this at the end of dismax request handler definition:
>
> <arr name="last-components">
>   <str>spellcheck</str>
> </arr>
>
> </requestHandler>
>
>
>
>

Re: Dismax - Boosting

Posted by Erick Erickson <er...@gmail.com>.
The changes you made have no effect on the fields you named in your
query (author, format, etc.). You have to ask to facet on your new
fields...

And if you did send a different query, did you reindex after your config
changes?

It would be better if you made a habit of showing the results of
your query with &debugQuery=on to help us diagnose problems, otherwise
we're just guessing...
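
As a rough sketch of what asking for the new fields could look like, the facet parameters can be set as handler defaults instead of on every URL. The three fields shown are just a sample of the *_facet fields from the schema quoted below, and the handler name is assumed to match the dismax handler discussed in this thread.

```xml
<requestHandler name="dismax" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <!-- facet on the untokenized string copies, not the text fields -->
    <str name="facet.field">author_facet</str>
    <str name="facet.field">format_facet</str>
    <str name="facet.field">category_facet</str>
  </lst>
</requestHandler>
```

The same effect can be had per request by sending facet.field=author_facet (and so on) in place of facet.field=author.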

Best
Erick

On Thu, Nov 18, 2010 at 1:05 PM, Solr User <so...@gmail.com> wrote:

> > > > On Mon, Nov 15, 2010 at 5:38 PM, Ahmet Arslan <io...@yahoo.com> wrote:
> > > >
> > > > > > 1. Do we need to change the above DisMax handler configuration as
> > > > > > per our requirements? Or Leave it as it is? What changes?
> > > > >
> > > > > Yes, you need to edit it, at least the field names. Does your schema
> > > > > have a field named sku?
> > > > >
> > > > > > 2. Do we need make DisMax as a default request handler?  Do I need
> > > > > > to add attribute default="true" to the tag?
> > > > >
> > > > > If you are always going to use it, why not? Change it by adding
> > > > > default="true". By doing so you won't need to add the qt parameter
> > > > > to every request. But don't forget to delete the other
> > > > > default="true". There can be only one default="true" :)
> > > > >
> > > > > > 3. I read in the documentation that Default Search Handler and
> > > > > > DisMax are the same except that to use DisMaxQueryParser add
> > > > > > defType=dismax in the query string. Is there anything else do we
> > > > > > need to do?
> > > > >
> > > > > The above dismax config contains a default parameter list, so you
> > > > > don't need to add &defType=dismax&qf=title^1.0 text^1.5 ... etc. to
> > > > > the query string.
> > > > >
> > > > > > We are basically moving on to dismax handler and trying to
> > > > > > understand what changes we need to make to SolrConfig.xml.
> > > > >
> > > > > As you can see in the default solrconfig.xml, you can register
> > > > > multiple instances of solr.SearchHandler with different default
> > > > > parameter lists and names. The default="true" one is executed by
> > > > > default.
> > > > >
> > > > > And this can be helpful in deciding about the dismax params (qf, pf,
> > > > > ps, mm, etc.):
> > > > > http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/

Re: Dismax - Boosting

Posted by Solr User <so...@gmail.com>.
Ahmet,

I modified the schema as follows: (Added more fields for faceting)


<field name="title" type="text" indexed="true" stored="true"
omitNorms="true" />

<field name="author" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="authortype" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="isbn13" type="text" indexed="true" stored="true" />

<field name="isbn10" type="text" indexed="true" stored="true" />

<field name="material" type="text" indexed="true" stored="true" />

<field name="pubdate" type="text" indexed="true" stored="true" />

<field name="pubyear" type="text" indexed="true" stored="true" />

<field name="reldate" type="text" indexed="false" stored="true" />

<field name="format" type="text" indexed="true" stored="true" />

<field name="pages" type="text" indexed="false" stored="true" />

<field name="desc" type="text" indexed="true" stored="true" />

<field name="series" type="text" indexed="true" stored="true" />

<field name="season" type="text" indexed="true" stored="true" />

<field name="imprint" type="text" indexed="true" stored="true" />

<field name="bisacsub" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="bisacstatus" type="text" indexed="false" stored="true" />

<field name="category" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="award" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />

<field name="age" type="text" indexed="true" stored="true" />

<field name="reading" type="text" indexed="true" stored="true" />

<field name="grade" type="text" indexed="true" stored="true" />

<field name="path" type="text" indexed="false" stored="true" />

<field name="shortdesc" type="text" indexed="true" stored="true" />

<field name="subtitle" type="text" indexed="true" stored="true"
omitNorms="true"/>

<field name="price" type="float" indexed="true" stored="true"/>

<field name="author_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="pubyear_facet" type="string" indexed="true" stored="true"
multiValued="true" omitNorms="true"/>

<field name="format_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="series_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="season_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="imprint_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="category_facet" type="string" indexed="true" stored="true"
multiValued="true" omitNorms="true"/>

<field name="award_facet" type="string" indexed="true" stored="true"
multiValued="true" omitNorms="true"/>

<field name="age_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="reading_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="grade_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

<field name="price_facet" type="string" indexed="true" stored="true"
omitNorms="true"/>

Also added Copy Fields as below:


<copyField source="author" dest="author_facet"/>

<copyField source="pubyear" dest="pubyear_facet"/>

<copyField source="format" dest="format_facet"/>

<copyField source="series" dest="series_facet"/>

<copyField source="season" dest="season_facet"/>

<copyField source="imprint" dest="imprint_facet"/>

<copyField source="category" dest="category_facet"/>

<copyField source="award" dest="award_facet"/>

<copyField source="age" dest="age_facet"/>

<copyField source="reading" dest="reading_facet"/>

<copyField source="grade" dest="grade_facet"/>

<copyField source="price" dest="price_facet"/>

With the above changes I am not getting any facet data in the results.

Why is the facet data not being returned, and what mistake did I make in the
schema?

Thanks,
Solr User

On Wed, Nov 17, 2010 at 6:42 PM, Ahmet Arslan <io...@yahoo.com> wrote:

>
>
> Wow, you facet on many fields:
>
> author,pubyear,format,series,season,imprint,category,award,age,reading,grade,price
>
> The fields you facet on should be of an untokenized type: string, int, tint,
> date, etc.
>
> The fields you want full-text search on, e.g. the ones you specify in the qf
> and pf parameters, should be of type text.
> (title subtitle author desc shortdesc imprint category isbn13 isbn10 format
> series season bisacsub award)
>
> If a field is needed for both, for example category, you need two copies of
> it: one string, one text, so that you can both full-text search and facet on
> it. Use copyField for this.
>
> <copyField source="category" dest="category_string"/>
>
> Example document:
> category: electronic devices
>
>
> query electronic will return it, and facets on category_string will be
> displayed as :
>
> electronic devices (1)
>
> not :
>
> electronic (1)
> devices (1)
>
>
>
> --- On Wed, 11/17/10, Solr User <so...@gmail.com> wrote:
>
> > From: Solr User <so...@gmail.com>
> > Subject: Re: Dismax - Boosting
> > To: solr-user@lucene.apache.org
> > Date: Wednesday, November 17, 2010, 11:31 PM
> > Ahmet,
> >
> > Thanks for the reply and it was very helpful.
> >
> > The query that I used before changing to dismax was:
> >
> > /solr/tradecore/spell/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true
> >
> > The above query used to return the facet data, the documents, and any
> > suggestions for spelling mistakes correctly.
> >
> > The configuration after switching to dismax is as below:
> >
> > Schema.xml:
> >
> >    <field name="title" type="text" indexed="true" stored="true" omitNorms="true" />
> >    <field name="author" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="authortype" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="isbn13" type="text" indexed="true" stored="true" />
> >    <field name="isbn10" type="text" indexed="true" stored="true" />
> >    <field name="material" type="text" indexed="true" stored="true" />
> >    <field name="pubdate" type="text" indexed="true" stored="true" />
> >    <field name="pubyear" type="text" indexed="true" stored="true" />
> >    <field name="reldate" type="text" indexed="false" stored="true" />
> >    <field name="format" type="text" indexed="true" stored="true" />
> >    <field name="pages" type="text" indexed="false" stored="true" />
> >    <field name="desc" type="text" indexed="true" stored="true" />
> >    <field name="series" type="text" indexed="true" stored="true" />
> >    <field name="season" type="text" indexed="true" stored="true" />
> >    <field name="imprint" type="text" indexed="true" stored="true" />
> >    <field name="bisacsub" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="bisacstatus" type="text" indexed="false" stored="true" />
> >    <field name="category" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="award" type="text" indexed="true" stored="true" multiValued="true" omitNorms="true" />
> >    <field name="age" type="text" indexed="true" stored="true" />
> >    <field name="reading" type="text" indexed="true" stored="true" />
> >    <field name="grade" type="text" indexed="true" stored="true" />
> >    <field name="path" type="text" indexed="false" stored="true" />
> >    <field name="shortdesc" type="text" indexed="true" stored="true" />
> >    <field name="subtitle" type="text" indexed="true" stored="true" omitNorms="true"/>
> >    <field name="price" type="float" indexed="true" stored="true"/>
> >
> >
> > SolrConfig.xml:
> >
> >   <requestHandler name="dismax" class="solr.SearchHandler" default="true">
> >     <lst name="defaults">
> >      <str name="defType">dismax</str>
> >      <str name="echoParams">explicit</str>
> >      <!-- <float name="tie">0.01</float> -->
> >      <str name="qf">
> >         title^9.0 subtitle^3.0 author^1.0 desc shortdesc imprint category
> >         isbn13 isbn10 format series season bisacsub award
> >      </str>
> >      <!--
> >      <str name="pf">
> >         text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
> >      </str>
> >      <str name="bf">
> >         popularity^0.5 recip(price,1,1000,1000)^0.3
> >      </str>
> >      -->
> >      <str name="fl">
> >         *
> >      </str>
> >      <!--
> >      <str name="mm">
> >         2<-1 5<-2 6<90%
> >      </str>
> >      <int name="ps">100</int>
> >      <str name="q.alt">*:*</str>
> >      -->
> >      <!-- example highlighter config, enable per-query with hl=true -->
> >      <!--
> >      <str name="hl.fl">text features name</str>
> >      -->
> >      <!-- for this field, we want no fragmenting, just highlighting -->
> >      <!--
> >      <str name="f.name.hl.fragsize">0</str>
> >      -->
> >      <!-- instructs Solr to return the field itself if no query terms are
> >           found -->
> >      <!--
> >      <str name="f.name.hl.alternateField">name</str>
> >      <str name="f.text.hl.fragmenter">regex</str>
> >      -->
> >      <!-- defined below -->
> >     </lst>
> >   </requestHandler>
> >
> > The query that I used after changing to dismax is:
> >
> >
> > solr/tradecore/select/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true
> >
> >
> > The following are the issues that I am having after switching to dismax:
> >
> > 1. Facet data is not coming back correctly; a lot of extra data is
> > returned. Why, and how do I fix it?
> > 2. How do I use the spell checker request handler along with dismax?
> >
> > Thanks,
> > Murali
> >
>
>
>
>

Re: Dismax - Boosting

Posted by Ahmet Arslan <io...@yahoo.com>.

Wow, you facet on many fields:
author,pubyear,format,series,season,imprint,category,award,age,reading,grade,price

The fields you facet on should be of an untokenized type: string, int, tint, date etc. Faceting on a tokenized text field counts every indexed term separately, which is why you see so much extra data.

The fields you want to full-text search, e.g. the ones you specify in the qf and pf parameters, should be of type text.
(title subtitle author desc shortdesc imprint category isbn13 isbn10 format series season bisacsub award)

If a field is needed for both, for example category, you need two copies of it: one string and one text, so that you can both full-text search and facet on it.
Use copyField for this.

<copyField source="category" dest="category_string"/>

Example document:
category: electronic devices


A query for electronic will return it, and the facets on category_string will be displayed as:

electronic devices (1)

not:

electronic (1)
devices (1)
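
As a rough sketch of the two-field setup described above (not taken from the poster's actual files): it assumes the stock "string" field type (solr.StrField) from the example schema is available, that category_string is only an illustrative name, and that multiValued="true" is wanted because the existing category field is multi-valued.

```xml
<!-- schema.xml (sketch): "category_string" is a hypothetical field name -->

<!-- tokenized copy, searched through qf -->
<field name="category" type="text" indexed="true" stored="true"
       multiValued="true" omitNorms="true" />

<!-- untokenized copy, used only for faceting -->
<field name="category_string" type="string" indexed="true" stored="false"
       multiValued="true" />

<!-- copyField passes the raw input value along, before any analysis -->
<copyField source="category" dest="category_string"/>
```

The request would then facet with facet.field=category_string instead of facet.field=category. Existing documents need to be reindexed before the new field is populated.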



--- On Wed, 11/17/10, Solr User <so...@gmail.com> wrote:

> From: Solr User <so...@gmail.com>
> Subject: Re: Dismax - Boosting
> To: solr-user@lucene.apache.org
> Date: Wednesday, November 17, 2010, 11:31 PM
> 
> The following are the issues that I am having after switching to dismax:
> 
> 1. Facet data is not coming back correctly; a lot of extra data is
> returned. Why, and how do I fix it?
> 2. How do I use the spell checker request handler along with dismax?
> 
> Thanks,
> Murali

Re: Dismax - Boosting

Posted by Ahmet Arslan <io...@yahoo.com>.
> 2. How to use spell checker request handler along with
> dismax?

Just append this at the end of the dismax request handler definition, right before the closing tag:

<arr name="last-components">
   <str>spellcheck</str>
</arr>

</requestHandler>
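
Put together with the handler posted earlier in the thread, the result could look roughly like the sketch below. It assumes a searchComponent named "spellcheck" is already defined elsewhere in solrconfig.xml (as it would be if the /spell handler used to return suggestions); the qf list is shortened here for illustration.

```xml
<!-- solrconfig.xml (sketch): qf shortened, other defaults omitted -->
<requestHandler name="dismax" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <str name="qf">title^9.0 subtitle^3.0 author^1.0 desc shortdesc</str>
  </lst>
  <!-- last-components run after the standard components (query, facet, ...);
       assumes a searchComponent named "spellcheck" exists in this file -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

With this in place, the existing /select request with spellcheck=true should return suggestions next to the results and facets, provided the spellcheck dictionary has been built.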


      

Re: Dismax - Boosting

Posted by Solr User <so...@gmail.com>.
Ahmet,

Thanks for the reply and it was very helpful.

The query that I used before changing to dismax was:

/solr/tradecore/spell/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true

The above query used to return the facet data, the documents, and any
spelling suggestions correctly.

The configuration after switching to dismax is as follows:

Schema.xml:

   <field name="title" type="text" indexed="true" stored="true"
omitNorms="true" />
   <field name="author" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />
   <field name="authortype" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />
   <field name="isbn13" type="text" indexed="true" stored="true" />
   <field name="isbn10" type="text" indexed="true" stored="true" />
   <field name="material" type="text" indexed="true" stored="true" />
   <field name="pubdate" type="text" indexed="true" stored="true" />
   <field name="pubyear" type="text" indexed="true" stored="true" />
   <field name="reldate" type="text" indexed="false" stored="true" />
   <field name="format" type="text" indexed="true" stored="true" />
   <field name="pages" type="text" indexed="false" stored="true" />
   <field name="desc" type="text" indexed="true" stored="true" />
   <field name="series" type="text" indexed="true" stored="true" />
   <field name="season" type="text" indexed="true" stored="true" />
   <field name="imprint" type="text" indexed="true" stored="true" />
   <field name="bisacsub" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />
   <field name="bisacstatus" type="text" indexed="false" stored="true" />
   <field name="category" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />
   <field name="award" type="text" indexed="true" stored="true"
multiValued="true" omitNorms="true" />
   <field name="age" type="text" indexed="true" stored="true" />
   <field name="reading" type="text" indexed="true" stored="true" />
   <field name="grade" type="text" indexed="true" stored="true" />
   <field name="path" type="text" indexed="false" stored="true" />
   <field name="shortdesc" type="text" indexed="true" stored="true" />
   <field name="subtitle" type="text" indexed="true" stored="true"
omitNorms="true"/>
   <field name="price"  type="float" indexed="true" stored="true"/>

SolrConfig.xml:

  <requestHandler name="dismax" class="solr.SearchHandler" default="true">
    <lst name="defaults">
     <str name="defType">dismax</str>
     <str name="echoParams">explicit</str>
     <!-- <float name="tie">0.01</float> -->
     <str name="qf">
        title^9.0 subtitle^3.0 author^1.0 desc shortdesc imprint category
isbn13 isbn10 format series season bisacsub award
     </str>
     <!--
<str name="pf">
        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
     </str>
     <str name="bf">
        popularity^0.5 recip(price,1,1000,1000)^0.3
     </str>
-->
     <str name="fl">
        *
     </str>
<!--
     <str name="mm">
        2&lt;-1 5&lt;-2 6&lt;90%
     </str>
     <int name="ps">100</int>
     <str name="q.alt">*:*</str>
-->
     <!-- example highlighter config, enable per-query with hl=true -->
<!--
     <str name="hl.fl">text features name</str>
-->
     <!-- for this field, we want no fragmenting, just highlighting -->
<!--
     <str name="f.name.hl.fragsize">0</str>
-->
     <!-- instructs Solr to return the field itself if no query terms are
          found -->
<!--
     <str name="f.name.hl.alternateField">name</str>
     <str name="f.text.hl.fragmenter">regex</str>
-->
     <!-- defined below -->
    </lst>
  </requestHandler>

The query that I used after changing to dismax is:

solr/tradecore/select/?q=curious&wt=json&rows=9&facet=true&facet.limit=-1&facet.mincount=1&facet.field=author&facet.field=pubyear&facet.field=format&facet.field=series&facet.field=season&facet.field=imprint&facet.field=category&facet.field=award&facet.field=age&facet.field=reading&facet.field=grade&facet.field=price&spellcheck=true


The following are the issues that I am having after switching to dismax:

1. Facet data is not coming back correctly; a lot of extra data is returned.
Why, and how do I fix it?
2. How do I use the spell checker request handler along with dismax?

Thanks,
Murali

On Mon, Nov 15, 2010 at 5:38 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > 1. Do we need to change the above DisMax handler
> > configuration as per our
> > requirements? Or Leave it as it is? What changes?
>
> Yes, you need to edit it. At least the field names. Does your schema have a
> field named sku?
>
> > 2. Do we need make DisMax as a default request
> > handler?  Do I need to add
> > attribute default="true" to the tag?
>
> If you are going to always use it, why not: change it by adding
> default="true". By doing so you don't need to add the qt parameter in every
> request. But don't forget to delete the other default="true". There can be
> only one default="true" :)
>
> > 3. I read in the documentation that Default Search Handler
> > and DisMax are the same except that to use DisMaxQueryParser add
> > defType=dismax in the query string. Is there anything else do we need to
> > do?
>
> The dismax config above contains a default parameter list, so you don't need
> to add &defType=dismax&qf=title^1.0 text^1.5 ... etc. to the query string.
>
>
> > We are basically moving on to dismax handler and trying to
> > understand what
> > changes we need to make to SolrConfig.xml.
>
> As you can see in the default solrconfig.xml, you can register multiple
> instances of solr.SearchHandler, each with a different name and default
> parameter list. The one with default="true" is executed by default.
>
> And this can be helpful when deciding on dismax params: qf, pf, ps, mm etc.
> http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/
>
>
>
>

Re: Dismax - Boosting

Posted by Ahmet Arslan <io...@yahoo.com>.
> 1. Do we need to change the above DisMax handler
> configuration as per our
> requirements? Or Leave it as it is? What changes?

Yes, you need to edit it. At least the field names. Does your schema have a field named sku?

> 2. Do we need make DisMax as a default request
> handler?  Do I need to add
> attribute default="true" to the tag?

If you are going to always use it, why not: change it by adding default="true". By doing so you don't need to add the qt parameter in every request. But don't forget to delete the other default="true". There can be only one default="true" :)

> 3. I read in the documentation that Default Search Handler
> and DisMax are the same except that to use DisMaxQueryParser add
> defType=dismax in the query string. Is there anything else do we need to
> do?

The dismax config above contains a default parameter list, so you don't need to add &defType=dismax&qf=title^1.0 text^1.5 ... etc. to the query string.


> We are basically moving on to dismax handler and trying to
> understand what
> changes we need to make to SolrConfig.xml. 

As you can see in the default solrconfig.xml, you can register multiple instances of solr.SearchHandler, each with a different name and default parameter list. The one with default="true" is executed by default.
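
A minimal sketch of two handlers registered side by side, assuming the handler names used in this thread; the parameter lists are shortened for illustration and are not a recommended configuration.

```xml
<!-- solrconfig.xml (sketch): parameter lists shortened for illustration -->

<!-- used when the request carries no qt parameter -->
<requestHandler name="dismax" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^9.0 subtitle^3.0 author^1.0</str>
  </lst>
</requestHandler>

<!-- no default attribute here; selected explicitly with qt=standard -->
<requestHandler name="standard" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
```

Values under "defaults" only apply when the request does not supply the same parameter, so a single query can still override qf (or any other default) on the query string.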

And this can be helpful when deciding on dismax params: qf, pf, ps, mm etc.
http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/