Posted to solr-user@lucene.apache.org by Robert Stewart <bs...@gmail.com> on 2012/03/05 22:22:44 UTC
wildcard queries with edismax and lucene query parsers
How is scoring affected by wildcard queries? When I use a
wildcard query, every document in the response gets the same constant
score (1.0). This happens with both the edismax and lucene query parsers.
I am trying to implement an auto-suggest feature, so I need a wildcard
to return all results that match the prefix entered by the user,
but I want the results sorted by score according to the "qf"
parameter in my search handler.
?defType=edismax&q=grow*&fl=title,score
<result name="response" numFound="11" start="0" maxScore="1.0">
  <doc>
    <float name="score">1.0</float>
    <arr name="title">
      <str>S&P 1000 Growth</str>
    </arr>
  </doc>
  <doc>
    <float name="score">1.0</float>
    <arr name="title">
      <str>S&P 1000 Pure Growth</str>
    </arr>
  </doc>
?defType=lucene&q=grow*&fl=title,score
<result name="response" numFound="11" start="0" maxScore="1.0">
  <doc>
    <float name="score">1.0</float>
    <arr name="title">
      <str>S&P 1000 Growth</str>
    </arr>
  </doc>
  <doc>
    <float name="score">1.0</float>
    <arr name="title">
      <str>S&P 1000 Pure Growth</str>
    </arr>
  </doc>
If I use a query with no wildcard, scoring appears correct:
?defType=edismax&q=growth&fl=title,score
<result name="response" numFound="11" start="0" maxScore="0.7500377">
  <doc>
    <float name="score">0.7500377</float>
    <arr name="title">
      <str>S&P 1000 Growth</str>
    </arr>
  </doc>
  <doc>
    <float name="score">0.7500377</float>
    <arr name="title">
      <str>S&P 500 Growth</str>
    </arr>
  </doc>
  <doc>
    <float name="score">0.656283</float>
    <arr name="title">
      <str>S&P 1000 Pure Growth</str>
    </arr>
  </doc>
I am using Solr 3.2 with a request handler defined like this:
<requestHandler name="/idxsuggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="defType">edismax</str>
    <str name="q">*:*</str>
    <str name="qf">
      ticker^10.0 indexCode^10.0 indexKey^10.0 title^5.0
      indexName^5.0
    </str>
    <str name="fl">indexId,indexName,indexCode,indexKey,title,ticker,urlTitle</str>
  </lst>
  <lst name="appends">
    <!-- Filter out documents that are not published yet and
         that are not yet expired -->
    <str name="fq">+contentType:IndexProfile</str>
  </lst>
</requestHandler>
Re: wildcard queries with edismax and lucene query parsers
Posted by Robert Stewart <bs...@gmail.com>.
Ahmet,
That is a great idea. I will try it.
Thank you.
On Thu, Mar 8, 2012 at 9:34 AM, Ahmet Arslan <io...@yahoo.com> wrote:
> WildcardQueries are wrapped into ConstantScoreQuery.
Re: wildcard queries with edismax and lucene query parsers
Posted by Ahmet Arslan <io...@yahoo.com>.
Wildcard queries are rewritten as constant-score queries (ConstantScoreQuery), which is why every match gets the same score.
I would create copy fields of these fields using the following field type.
Then you can search on these copy fields (list them in qf). With this approach you don't need the star operator: defType=edismax&q=grow&fl=title,score
<fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_index.txt" ignoreCase="true" expand="true" />
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" />
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
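A minimal sketch of the schema.xml side of this suggestion, assuming the five fields from the original qf list exist as ordinary fields in the schema. The *_prefix field names are hypothetical and chosen only for illustration. multiValued="true" is an assumption, made because title comes back as a multi-valued array in the responses quoted in the original question.

```xml
<!-- schema.xml fragment (sketch). The *_prefix names are hypothetical. -->
<!-- Destination fields use the prefix_token type defined above.
     stored="false" because they are only searched, never returned. -->
<field name="ticker_prefix"    type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="indexCode_prefix" type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="indexKey_prefix"  type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="title_prefix"     type="prefix_token" indexed="true" stored="false" multiValued="true"/>
<field name="indexName_prefix" type="prefix_token" indexed="true" stored="false" multiValued="true"/>

<!-- copyField copies the raw source value at index time; the destination
     field then applies its own analyzer (edge n-grams) to that value. -->
<copyField source="ticker"    dest="ticker_prefix"/>
<copyField source="indexCode" dest="indexCode_prefix"/>
<copyField source="indexKey"  dest="indexKey_prefix"/>
<copyField source="title"     dest="title_prefix"/>
<copyField source="indexName" dest="indexName_prefix"/>
```

With fields like these in place, the handler's qf would list the copy fields instead, for example ticker_prefix^10.0 indexCode_prefix^10.0 indexKey_prefix^10.0 title_prefix^5.0 indexName_prefix^5.0, and the request becomes q=grow with no trailing star. At index time the edge n-gram filter turns a token such as "growth" into g, gr, gro, grow, growt, growth, so the plain term "grow" matches as an ordinary term query and is scored normally. Because copyField is applied at index time, existing documents need to be reindexed. Note also that with maxGramSize="20" a typed prefix longer than 20 characters will not match anything.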
--- On Thu, 3/8/12, Robert Stewart <bs...@gmail.com> wrote:
> From: Robert Stewart <bs...@gmail.com>
> Subject: Re: wildcard queries with edismax and lucene query parsers
> To: solr-user@lucene.apache.org
> Date: Thursday, March 8, 2012, 4:21 PM
> Any help on this? I am really
> stuck on a client project. I need to
> know how scoring works with wildcard queries under SOLR
> 3.2.
>
> Thanks
> Bob
Re: wildcard queries with edismax and lucene query parsers
Posted by Robert Stewart <bs...@gmail.com>.
Any help on this? I am really stuck on a client project. I need to
know how scoring works with wildcard queries under SOLR 3.2.
Thanks
Bob