Posted to solr-user@lucene.apache.org by Robert Stewart <bs...@gmail.com> on 2012/03/05 22:22:44 UTC

wildcard queries with edismax and lucene query parsers

How is scoring affected by wildcard queries?  It seems that when I use
a wildcard query, every document in the response gets the same constant
score (1.0).  That occurs with both the edismax and lucene query
parsers.  I am trying to implement an auto-suggest feature, so I need
to use a wildcard to return all results that match the prefix entered
by a user.  But I want the results sorted according to the score
defined by the "qf" parameter in my search handler.

?defType=edismax&q=grow*&fl=title,score

<result name="response" numFound="11" start="0" maxScore="1.0">
<doc>
<float name="score">1.0</float>
<arr name="title">
<str>S&P 1000 Growth</str>
</arr>
</doc>
<doc>
<float name="score">1.0</float>
<arr name="title">
<str>S&P 1000 Pure Growth</str>
</arr>
</doc>


?defType=lucene&q=grow*&fl=title,score

<result name="response" numFound="11" start="0" maxScore="1.0">
<doc>
<float name="score">1.0</float>
<arr name="title">
<str>S&P 1000 Growth</str>
</arr>
</doc>
<doc>
<float name="score">1.0</float>
<arr name="title">
<str>S&P 1000 Pure Growth</str>
</arr>
</doc>

If I use a query with no wildcard, scoring appears correct:

?defType=edismax&q=growth&fl=title,score

<result name="response" numFound="11" start="0" maxScore="0.7500377">
<doc>
<float name="score">0.7500377</float>
<arr name="title">
<str>S&P 1000 Growth</str>
</arr>
</doc>
<doc>
<float name="score">0.7500377</float>
<arr name="title">
<str>S&P 500 Growth</str>
</arr>
</doc>
<doc>
<float name="score">0.656283</float>
<arr name="title">
<str>S&P 1000 Pure Growth</str>
</arr>
</doc>

I am using Solr version 3.2 with a request handler defined like this:

<requestHandler name="/idxsuggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="defType">edismax</str>
    <str name="q">*:*</str>
    <str name="qf">
      ticker^10.0 indexCode^10.0 indexKey^10.0 title^5.0 indexName^5.0
    </str>
    <str name="fl">indexId,indexName,indexCode,indexKey,title,ticker,urlTitle</str>
  </lst>
  <lst name="appends">
    <!-- Filter out documents that are not published yet and that are not yet expired -->
    <str name="fq">+contentType:IndexProfile</str>
  </lst>
</requestHandler>

Re: wildcard queries with edismax and lucene query parsers

Posted by Robert Stewart <bs...@gmail.com>.
Ahmet,

That is a great idea.  I will try it.

Thank you.

On Thu, Mar 8, 2012 at 9:34 AM, Ahmet Arslan <io...@yahoo.com> wrote:
> WildcardQueries are wrapped into ConstantScoreQuery.
>
> I would create a copy field of these fields using the following field type.
>
> Then you can search on these copyFields (qf). With this approach you don't need to use the star operator: defType=edismax&q=grow&fl=title,score
>
> <fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1">
>                <analyzer type="index">
>                        <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
>                        <tokenizer class="solr.WhitespaceTokenizerFactory" />
>                        <filter class="solr.LowerCaseFilterFactory" />
>                        <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt" ignoreCase="true" expand="true" />
>                        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" />
>                </analyzer>
>                <analyzer type="query">
>                        <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
>                        <tokenizer class="solr.WhitespaceTokenizerFactory" />
>                        <filter class="solr.LowerCaseFilterFactory" />
>                </analyzer>
>        </fieldType>

Re: wildcard queries with edismax and lucene query parsers

Posted by Ahmet Arslan <io...@yahoo.com>.
Wildcard queries are wrapped in a ConstantScoreQuery, so every match gets the same score.

I would create copy fields of these fields using the field type below.

Then you can search on these copyFields (qf). With this approach you don't need to use the star operator:

defType=edismax&q=grow&fl=title,score

<fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
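
To make the copy-field step concrete, here is a minimal sketch of the schema.xml and solrconfig.xml entries it implies. The *_prefix field names are hypothetical (they do not come from the schema discussed in this thread), and multiValued="true" is an assumption based on title coming back as an array in the responses in the original post. With minGramSize="1" and maxGramSize="20", a token such as "growth" is indexed as g, gr, gro, grow, growt, growth, so the plain term "grow" should match as an ordinary term query and pick up the qf boosts rather than a constant score.

```xml
<!-- schema.xml: hypothetical prefix fields, one per field listed in qf.
     stored="false" because they are only searched, never returned. -->
<field name="ticker_prefix"    type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="indexCode_prefix" type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="indexKey_prefix"  type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="title_prefix"     type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="indexName_prefix" type="prefix_token" indexed="true" stored="false" multiValued="true"/>

<!-- schema.xml: copy each source field into its prefix counterpart at index time -->
<copyField source="ticker"    dest="ticker_prefix"/>
<copyField source="indexCode" dest="indexCode_prefix"/>
<copyField source="indexKey"  dest="indexKey_prefix"/>
<copyField source="title"     dest="title_prefix"/>
<copyField source="indexName" dest="indexName_prefix"/>

<!-- solrconfig.xml: inside the "defaults" list of the /idxsuggest handler,
     point qf at the prefix fields and keep the original boosts -->
<str name="qf">
  ticker_prefix^10.0 indexCode_prefix^10.0 indexKey_prefix^10.0
  title_prefix^5.0 indexName_prefix^5.0
</str>
```

The documents need to be reindexed after adding the fields, and the fl list can stay as it is because the original fields are still the ones returned.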


--- On Thu, 3/8/12, Robert Stewart <bs...@gmail.com> wrote:

> From: Robert Stewart <bs...@gmail.com>
> Subject: Re: wildcard queries with edismax and lucene query parsers
> To: solr-user@lucene.apache.org
> Date: Thursday, March 8, 2012, 4:21 PM
> Any help on this?  I am really stuck on a client project.  I need to
> know how scoring works with wildcard queries under SOLR 3.2.
> 
> Thanks
> Bob
> 

Re: wildcard queries with edismax and lucene query parsers

Posted by Robert Stewart <bs...@gmail.com>.
Any help on this?  I am really stuck on a client project.  I need to
know how scoring works with wildcard queries under SOLR 3.2.

Thanks
Bob
