Posted to solr-user@lucene.apache.org by Marek Tichy <ma...@gn.apc.org> on 2011/04/14 20:41:39 UTC

Dismax parameters weirdness - boosting/mm causes no match ?

Hi,

I'd really appreciate it if somebody could help me.
I'm on Solr 3.1 and have a (multivalued) field called author_split, which
splits the original author names, e.g. "Mills, Joy". When I use the Solr web
interface to enter

    author_split:mills

the ExtendedDismaxQParser produces the query

    +author_split:mills

which returns 12 results. When I try

   +author_split:joy

it returns 15 results (Fielding, Joy is responsible for the extra 3).
And now the mystery:

When I use my web app, entering the query mills produces (parsedquery_toString):

    +(author_split:mills^80.0)~1.0 ()

and that gives NO RESULTS!

However, the other search

    +(author_split:joy^80.0)~1.0 ()

returns the expected 15 results.

Many thanks for any reply, I'm losing hope!

Marek Tichy


PS: Below is the list of parameters I'm sending along with the query, and
the fieldtype definition

    * mm (String, 1 characters) 1
    * facet (String, 2 characters) on
    * facet.mincount (String, 1 characters) 1
    * tie (String, 1 characters) 1
    * qf (String, 17 characters) author_split^80.0
    * json.nl (String, 3 characters) map
    * wt (String, 4 characters) json
    * version (String, 3 characters) 1.2
    * rows (String, 2 characters) 10
    * pf (String, 0 characters)
    * debugQuery (String, 4 characters) true
    * start (String, 1 characters) 0
    * q (String, 5 characters) mills
    * facet.field (Array, 2 elements) | (Callback) publisher::author();
    * qt (String, 11 characters) partitioned


<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="0"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="0"
            preserveOriginal="0"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1"
            generateNumberParts="0"
            catenateWords="0"
            catenateNumbers="0"
            catenateAll="0"
            splitOnCaseChange="0"
            preserveOriginal="0"/>
  </analyzer>
</fieldType>

<field name="author_split" type="text_ws" indexed="true"
       multiValued="true"/>
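
As a rough illustration of what the text_ws chain above does to an author name, here is a small Python sketch. It is only an approximation written for this thread, not Solr's implementation: the function name is made up, the ASCII-folding charFilter is skipped, and the word-delimiter step is reduced to "keep runs of letters".

```python
import re

def approximate_text_ws(value):
    """Approximate the text_ws analyzer: split, lowercase, keep word parts."""
    tokens = []
    for chunk in value.split():          # stands in for WhitespaceTokenizerFactory
        chunk = chunk.lower()            # stands in for LowerCaseFilterFactory
        # Stands in for WordDelimiterFilterFactory with generateWordParts=1 and
        # generateNumberParts=0: keep alphabetic runs, drop punctuation and digits.
        tokens.extend(re.findall(r"[^\W\d_]+", chunk))
    return tokens

print(approximate_text_ws("Mills, Joy"))     # ['mills', 'joy']
print(approximate_text_ws("Fielding, Joy"))  # ['fielding', 'joy']
```

Under these assumptions both "mills" and "joy" end up as separate terms in author_split, which matches the counts seen through the web interface.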







Re: Dismax parameters weirdness - boosting/mm causes no match ?

Posted by Erick Erickson <er...@gmail.com>.
Been there, done that, got the tshirt <G>...

Glad it's resolved.
Erick

On Thu, Apr 14, 2011 at 5:33 PM, Marek Tichy <ma...@gn.apc.org> wrote:

> Hi Erick,
> many many thanks. Headbang^80.0.
> I was just coming to the same conclusion, that I'm not reaching the
> handler I think I am, when I got your message. As almost always, the
> mystery has a simple explanation. Not only was I not reaching the
> handler I thought I was; in fact I was not even reaching the server I
> thought I was.
> With the web interface, yes; with my app, no. My app was still connected
> to the server I had used previously, which now has an outdated index.
> One line, 6 hours of debugging.
>
> Many thanks for your prompt help!
>
> Marek Tichy

Re: Dismax parameters weirdness - boosting/mm causes no match ?

Posted by Marek Tichy <ma...@gn.apc.org>.
Hi Erick,
many many thanks. Headbang^80.0.
I was just coming to the same conclusion, that I'm not reaching the
handler I think I am, when I got your message. As almost always, the
mystery has a simple explanation. Not only was I not reaching the
handler I thought I was; in fact I was not even reaching the server I
thought I was.
With the web interface, yes; with my app, no. My app was still connected
to the server I had used previously, which now has an outdated index.
One line, 6 hours of debugging.

Many thanks for your prompt help!

Marek Tichy
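
A cheap guard against this kind of mix-up is to compare the endpoint the application is configured with against the one used in the browser. The sketch below is only an illustration: the function name and both URLs are hypothetical, and it compares URLs only, without contacting either server.

```python
from urllib.parse import urlsplit

def same_solr_endpoint(url_a, url_b):
    """Return True when both URLs share scheme, host, port and path."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    key_a = (a.scheme, a.hostname, a.port, a.path.rstrip("/"))
    key_b = (b.scheme, b.hostname, b.port, b.path.rstrip("/"))
    return key_a == key_b

# Hypothetical values: the URL used in the admin UI vs. the one in the app config.
admin_ui = "http://new-host:8983/solr/select"
app_config = "http://old-host:8983/solr/select/"
print(same_solr_endpoint(admin_ui, app_config))  # False
```

Logging the configured base URL next to every debugQuery dump would surface the same information with even less effort.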







Re: Dismax parameters weirdness - boosting/mm causes no match ?

Posted by Erick Erickson <er...@gmail.com>.
OK, something doesn't make sense here, but you knew that. I'm guessing
that something you're leaving out when cutting/pasting will give us a clue.
Here's what I'd like to try.

1> Show us the entire results from &debugQuery=on, as well as
the URL. Please do this for both the query that works and the one
that doesn't.

2> Take a look in Solr admin and see if what you think is in your
author_split field really is there.

Because this

(author_split:joy^80.0)~1.0 ()

usually indicates that you're not getting to the request handler you think
you are.

As I say, nothing jumps out at me, which just means I'm not looking at the
right parts....

Best
Erick
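
For step 1 above, the two request URLs can be generated side by side so that only the search term differs. This is a minimal sketch: the base URL and the helper name are assumptions, and the parameter values are the ones listed in the original post.

```python
from urllib.parse import urlencode

def debug_url(base, q, **params):
    """Build a select URL with debugQuery=on plus any extra parameters."""
    query = {"q": q, "debugQuery": "on", **params}
    return f"{base}?{urlencode(query)}"

base = "http://localhost:8983/solr/select"  # assumed location of the select handler

# One URL for the query that works and one for the query that does not.
for term in ("joy", "mills"):
    print(debug_url(base, term, qt="partitioned", qf="author_split^80.0", mm="1"))
```

Pasting both URLs into a browser and diffing the debug sections makes it easier to spot whether the two requests are really served by the same handler and index.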

On Thu, Apr 14, 2011 at 4:01 PM, Marek Tichy <ma...@gn.apc.org> wrote:

> Hi Erick,
> thanks for the reply.
> I'm sending qt=partitioned, the request handler is defined as follows:

Re: Dismax parameters weirdness - boosting/mm causes no match ?

Posted by Marek Tichy <ma...@gn.apc.org>.
Hi Erick,
thanks for the reply.
I'm sending qt=partitioned, the request handler is defined as follows:

<requestHandler name="partitioned" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.1</float>
    <str name="qf">
      author_split^80.0 title^20.0 publisher_split^8.0
    </str>
    <str name="pf">
    </str>
    <int name="ps">15</int>
    <!-- Abort any searches longer than 4 seconds -->
    <!-- <int name="timeAllowed">4000</int> -->

    <!--str name="mm">
      2&lt;-35%
    </str-->
    <str name="q.alt">*:*</str>

    <!-- example highlighter config, enable per-query with hl=true -->
    <str name="hl">true</str>
    <str name="hl.fl">synopsis,author,title,publisher,keywords</str>
    <int name="hl.snippets">3</int>
    <str name="hl.mergeContiguous">true</str>
    <!-- instructs Solr to return the field itself if no query terms are
         found -->
    <str name="f.body.hl.alternateField">synopsis</str>
    <str name="f.body.hl.maxAlternateFieldLength">256</str>
    <!-- JS: I wasn't getting good results here... I'm turning off for now
         because I was getting periods (.) by themselves at the beginning of
         snippets and don't feel like debugging anymore.  Without the regex is
         faster too -->
    <!--<str name="f.body.hl.fragmenter">regex</str>--> <!-- defined below -->

    <!-- By default, don't spell check -->
    <str name="spellcheck">false</str>
    <!-- Defaults for the spell checker when used -->
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <!-- The number of suggestions to return -->
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
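
Values in a handler's "defaults" list only apply when the request does not supply the same parameter. The sketch below models that with plain dictionaries. It is an illustration rather than Solr code: the helper name is made up, and the values are copied from the handler above and from the parameter list in the original post.

```python
# Selected defaults from the "partitioned" handler definition.
handler_defaults = {
    "defType": "edismax",
    "tie": "0.1",
    "qf": "author_split^80.0 title^20.0 publisher_split^8.0",
    "ps": "15",
    "q.alt": "*:*",
}

# Selected parameters the web app sends, as listed in the original post.
request_params = {
    "q": "mills",
    "qt": "partitioned",
    "qf": "author_split^80.0",
    "tie": "1",
    "mm": "1",
}

def effective_params(defaults, request):
    """Request parameters win; defaults fill in whatever the request omits."""
    merged = dict(defaults)
    merged.update(request)
    return merged

eff = effective_params(handler_defaults, request_params)
print(eff["qf"])   # author_split^80.0
print(eff["tie"])  # 1
print(eff["ps"])   # 15
```

This is consistent with the parsed query +(author_split:mills^80.0)~1.0 (), where only author_split appears and the tie shows up as 1.0 rather than the handler's 0.1.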




Re: Dismax parameters weirdness - boosting/mm causes no match ?

Posted by Erick Erickson <er...@gmail.com>.
Let's see the raw query you send from your webapp.

Based on this:
+(author_split:joy^80.0)~1.0 ()

I'm going to guess you send your query something like
blahblahblah&defType=edismax&blahblahblah

rather than
blahblahblah&qt=edismax&blahblahblah

The former expects you to define all your parameters on the URL
(e.g. qf, mm, etc.). The latter looks for a request handler named
"edismax"
in your solrconfig.xml.

Which is really confusing....

This may be irrelevant, but I know I stumbled over this and so I thought I'd
mention it.

Best
Erick
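
To make the difference between the two styles easier to see, here is a small Python sketch that reports which handler and which parser a query string asks for. The function name is made up for this example, and a value of None simply means the request leaves that choice to the server's configuration.

```python
from urllib.parse import parse_qs

def describe_routing(query_string):
    """Report the request handler (qt) and query parser (defType) requested."""
    params = parse_qs(query_string)
    return {
        "handler": params.get("qt", [None])[0],
        "parser": params.get("defType", [None])[0],
    }

print(describe_routing("q=joy&defType=edismax&qf=author_split%5E80.0"))
# {'handler': None, 'parser': 'edismax'}
print(describe_routing("q=joy&qt=partitioned"))
# {'handler': 'partitioned', 'parser': None}
```

Running the raw query string from the web app through something like this shows at a glance whether it names a handler at all.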
