Posted to solr-user@lucene.apache.org by "Dyer, James" <Ja...@ingramcontent.com> on 2013/03/19 17:18:03 UTC

RE: strange behaviour of wordbreak spellchecker in solr cloud

Can you try including in your request the "shards.qt" parameter?  In your case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support for a brief discussion.
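For illustration only, here is a minimal sketch of how a client might build the request with shards.qt appended. The class and method names (ShardsQtUrl, build, example) are made up for this example and are not part of Solr; the host, core and handler names are the ones from the original post. Whether the value is best written as testhandler or /testhandler is not settled by this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Solr): builds a request URL from ordered parameters.
final class ShardsQtUrl {

    // URL-encode one parameter name or value as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Join base URL and parameters into "base?k1=v1&k2=v2", keeping insertion order.
    static String build(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base);
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    // The request from the original post with shards.qt added.
    static String example() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "paulusoles");
        p.put("indent", "true");
        p.put("rows", "10");
        p.put("shards.qt", "testhandler");
        return build("http://server1:8983/solr/test/testhandler", p);
    }
}
```

Calling ShardsQtUrl.example() yields the same URL as the curl request in the original post, with &shards.qt=testhandler at the end.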

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com] 
Sent: Monday, March 18, 2013 4:07 PM
To: solr-user@lucene.apache.org
Subject: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I am trying to use the wordbreak spellchecker in Solr 4.2 with the cloud feature. We have two servers with one shard on each of them.

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'

do not return any spellchecker results. However, if I specify distrib=false, only one of them returns spellchecker results.

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
returns no spellchecker results, while

curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
returns spellchecker results.


My testhandler and select handlers are as follows:


<requestHandler name="/testhandler" class="solr.SearchHandler" >
<lst name="defaults">
<str name="defType">edismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">host^30  content^0.5 title^1.2 </str>
<str name="pf">site^25 content^10 title^22</str>
<str name="fl">url,id,title</str>
<!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
<str name="mm">3<-1 5<-3 6<90%</str>
<int name="ps">1</int>

<str name="hl">true</str>
<str name="hl.fl">content</str>
<str name="f.content.hl.fragmenter">regex</str>
<str name="hl.fragsize">165</str>
<str name="hl.fragmentsBuilder">default</str>


<str name="spellcheck.dictionary">direct</str>
<str name="spellcheck.dictionary">wordbreak</str>
<str name="spellcheck">on</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.count">2</str>

</lst>

<arr name="last-components">
 <str>spellcheck</str>
</arr>

</requestHandler>


  <requestHandler name="/select" class="solr.SearchHandler">
    <!-- default values for query parameters can be specified, these
         will be overridden by parameters in the request
      -->
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <!-- <str name="df">text</str> -->
     </lst>
    <!-- In addition to defaults, "appends" params can be specified
         to identify values which should be appended to the list of
         multi-val params from the query (or the existing "defaults").
      -->
    <!-- In this example, the param "fq=instock:true" would be appended to
         any query time fq params the user may specify, as a mechanism for
         partitioning the index, independent of any user selected filtering
         that may also be desired (perhaps as a result of faceted searching).

         NOTE: there is *absolutely* nothing a client can do to prevent these
         "appends" values from being used, so don't use this mechanism
         unless you are sure you always want it.
      -->
    <!--
       <lst name="appends">
         <str name="fq">inStock:true</str>
       </lst>
      -->
    <!-- "invariants" are a way of letting the Solr maintainer lock down
         the options available to Solr clients.  Any params values
         specified here are used regardless of what values may be specified
         in either the query, the "defaults", or the "appends" params.

         In this example, the facet.field and facet.query params would
         be fixed, limiting the facets clients can use.  Faceting is
         not turned on by default - but if the client does specify
         facet=true in the request, these are the only facets they
         will be able to see counts for; regardless of what other
         facet.field or facet.query params they may specify.

         NOTE: there is *absolutely* nothing a client can do to prevent these
         "invariants" values from being used, so don't use this mechanism
         unless you are sure you always want it.
      -->
    <!--
       <lst name="invariants">
         <str name="facet.field">cat</str>
         <str name="facet.field">manu_exact</str>
         <str name="facet.query">price:[* TO 500]</str>
         <str name="facet.query">price:[500 TO *]</str>
       </lst>
      -->
    <!-- If the default list of SearchComponents is not desired, that
         list can either be overridden completely, or components can be
         prepended or appended to the default list.  (see below)
      -->
    <!--
       <arr name="components">
         <str>nameOfCustomComponent1</str>
         <str>nameOfCustomComponent2</str>
       </arr>
      -->
       <arr name="last-components">
         <str>spellcheck</str>
       </arr> 
    </requestHandler>



Is this a bug, or does something else have to be done?


Thanks.
Alex.


Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by Mark Miller <ma...@gmail.com>.
On Mar 19, 2013, at 1:30 PM, "Dyer, James" <Ja...@ingramcontent.com> wrote:

> Mark,
> 
> I wasn't sure if Alex is actually testing /select, or if the problem is just coming up in /testhandler.  Just wanted to verify that before we get into bug reports.

Distributed search will use /select if you don't use shards.qt - so if you also have the component in /select, it's an alternative approach to shards.qt. I don't know what his problem is, I'm just saying I don't think shards.qt looks like the smoking gun out of the gate.
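To make the point concrete, here is a simplified model of the rule just described: shard sub-requests go to /select unless shards.qt names another handler. This is a sketch of the behaviour as stated in this message, not Solr's actual code, and ShardHandlerChoice and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (not Solr code) of which handler a shard sub-request targets.
final class ShardHandlerChoice {

    // Per the explanation above: without shards.qt, sub-requests go to /select.
    static String handlerFor(Map<String, String> requestParams) {
        String qt = requestParams.get("shards.qt");
        return (qt == null || qt.isEmpty()) ? "/select" : qt;
    }

    // A request that does not set shards.qt at all.
    static String withoutShardsQt() {
        return handlerFor(new HashMap<String, String>());
    }

    // A request that sets shards.qt to the given handler name.
    static String withShardsQt(String handler) {
        Map<String, String> p = new HashMap<>();
        p.put("shards.qt", handler);
        return handlerFor(p);
    }
}
```

Under this model, a handler that has the spellcheck component only in /testhandler would lose it on the shards unless shards.qt is set, whereas a setup with the component in both handlers should not depend on shards.qt.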

- Mark

> 
> DistributedSpellCheckComponentTest does have 1 little Word Break test scenario in it, so we know WordBreakSolrSpellChecker at least works some of the time in a Distributed environment :) .  Ideally, we should probably use a random test for stuff like this as adding a bunch of test scenarios would make this already-slower-than-molasses test even slower.  On the other hand, we want to test as many possibilities as we can.  Based on DSCCT and it being so superficial, I really can't vouch too much for my spell check enhancements working as well with shards as they do with a single index.
> 
> James Dyer
> Ingram Content Group
> (615) 213-4311
> 
> 
> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com] 
> Sent: Tuesday, March 19, 2013 11:49 AM
> To: solr-user@lucene.apache.org
> Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud
> 
> My first thought too, but then I saw that he had the spell component in both his custom testhandler and the /select handler, so I'd expect that to work as well.
> 
> - Mark
> 


Fwd: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
Thanks James.

I am investigating the process function now. So far I have the following observation.
It has the following code at the beginning:

    SolrParams params = rb.req.getParams();
    if (!params.getBool(COMPONENT_NAME, false) || spellCheckers.isEmpty()) {
      return;
    }
    boolean shardRequest = "true".equals(params.get(ShardParams.IS_SHARD));
    String q = params.get(SPELLCHECK_Q);

I decided to output params, params.getBool(COMPONENT_NAME, false), spellCheckers.isEmpty() and q for the cases group=true and group=false.

For the case when group=true (and no spellchecker results) we have
----------------------------------------------------------------------------------------------------

INFO: SOLR PARAMS={echoParams=explicit,rows=10,spellcheck=false,group.distributed.first=true,tie=0.01,f.content.hl.fragmenter=regex,distrib=false,hl=false,version=2,NOW=1363993692221,shard.url=192.168.1.4:8983/solr/test_shard2_replica1/,fl=id,score,spellchek=true,group.field=site,spellcheck.count=2,hl.fragsize=165,mm=3<-1 5<-3 6<90%,group.ngroups=true,qf=host^30  content^0.5 title^1.2 hl.fragmentsBuilder=default,hl.fl=content,wt=javabin,spellcheck.collate=true,defType=edismax,spellcheck.onlyMorePopular=false,rows=10,pf=site^25 content^10 title^22,start=0,q=paulusoles,spellcheck.dictionary=[Ljava.lang.String;@70533ff6,group=true,isShard=true,ps=1}


INFO: params.getBool(COMPONENT_NAME,false) = false
INFO: spellCheckers.isEmpty()=false

So execution returns at the if statement and never reaches the code below it.

-------------------------------------------------------------------------------------------------------
For the case when group=false (spellchecker results are present) we have

 
INFO: SOLR PARAMS={echoParams=explicit,rows=10,spellcheck=true,tie=0.01,f.content.hl.fragmenter=regex,distrib=false,hl=false,version=2,NOW=1363994042221,shard.url=192.168.1.4:8983/solr/test_shard2_replica1/,fl=id,score,spellchek=true,group.field=site,spellcheck.count=5,fsv=true,hl.fragsize=165,mm=3<-1 5<-3 6<90%,group.ngroups=true,qf=host^30  content^0.5 title^1.2 ,hl.fragmentsBuilder=default,hl.fl=content,wt=javabin,spellcheck.collate=true,defType=edismax,spellcheck.onlyMorePopular=false,rows=10,pf=site^25 content^10 title^22,start=0,q=paulusoles,spellcheck.dictionary=[Ljava.lang.String;@5a2c25b4,group=false,isShard=true,ps=1}


INFO: params.getBool(COMPONENT_NAME,false) =true


INFO: spellCheckers.isEmpty()=false

INFO: SPELLCHECK=null


As you can see in the SOLR PARAMS output, when group=true the request carries spellcheck=false, and I think this is the reason why there are no spellchecker results.
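The effect of that guard can be reproduced with a small stand-alone model. This is only a sketch: SpellcheckGuard is a hypothetical class, the parameter maps contain just a few of the logged values, and getBool here treats only "true" and "on" as enabled, which is an assumption of the sketch rather than a statement about Solr's own parsing.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the guard at the top of process(); not the Solr implementation.
final class SpellcheckGuard {

    static final String COMPONENT_NAME = "spellcheck";

    // Rough analogue of params.getBool(name, default): treats "true" and "on" as true.
    static boolean getBool(Map<String, String> params, String name, boolean def) {
        String v = params.get(name);
        if (v == null) {
            return def;
        }
        return v.equalsIgnoreCase("true") || v.equalsIgnoreCase("on");
    }

    // Mirrors the quoted condition: work is done only if enabled and checkers exist.
    static boolean wouldRun(Map<String, String> params, boolean spellCheckersEmpty) {
        return getBool(params, COMPONENT_NAME, false) && !spellCheckersEmpty;
    }

    // Shard request as logged with group=true: spellcheck=false arrives at the shard.
    static boolean groupedShardRequestRuns() {
        Map<String, String> p = new HashMap<>();
        p.put("spellcheck", "false");
        p.put("group", "true");
        p.put("isShard", "true");
        return wouldRun(p, false);
    }

    // Shard request as logged with group=false: spellcheck=true arrives at the shard.
    static boolean ungroupedShardRequestRuns() {
        Map<String, String> p = new HashMap<>();
        p.put("spellcheck", "true");
        p.put("group", "false");
        p.put("isShard", "true");
        return wouldRun(p, false);
    }
}
```

With the logged values, the grouped shard request is skipped by the guard and the ungrouped one is not, which matches the two log excerpts.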

Any ideas how these params are constructed and sent to the process function?

Thanks.
Alex.



 

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Fri, Mar 22, 2013 3:00 pm
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Alex,

You may want to move over to the dev list now that you're working on
code.  Or if you would rather not subscribe to the dev list, add yourself as a
watcher to SOLR-3758 and comment further there.  This will help us keep track of
progress on the issue.

The short answer is that in a distributed set-up SpellCheckComponent (and
others) work in 2 phases.  In the first phase, each shard is sent the request
almost as if it were a complete (non-distributed) index in its own right.
The difference is that an additional parameter is added to the request
indicating that this is the first phase of a distributed request.
SpellCheckComponent uses this knowledge to include additional information in
the response that normally wouldn't go out to an end client.  The first phase
calls the Component's process() method, just as would be done if this were a
non-distributed call.

In the second phase, the initiating shard collects the response from all of the 
shards' process() methods and combines them.  This is where finishStage() is 
called.  So while process() runs in parallel on all of the shards, finishStage() 
runs only on the initiating shard, after the various shards have returned their 
responses.
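As a toy model of the two phases just described (and nothing more than that), the sketch below runs a per-shard process() step and then merges the shard responses in a finishStage() step on the initiating node. The class TwoPhaseSketch, its data structures and its merge rule are invented for illustration; the real SpellCheckComponent is considerably more involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the two phases described above; not the real SpellCheckComponent.
final class TwoPhaseSketch {

    // Phase 1: each shard answers from its own index (here: a map from term to suggestions).
    static List<String> process(Map<String, List<String>> shardIndex, String term) {
        List<String> found = shardIndex.get(term);
        return found == null ? new ArrayList<String>() : found;
    }

    // Phase 2: the initiating node merges all shard responses, dropping duplicates.
    static List<String> finishStage(List<List<String>> shardResponses) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> r : shardResponses) {
            merged.addAll(r);
        }
        return new ArrayList<>(merged);
    }

    // Two shards as in the thread: only the second one has a suggestion for the term.
    static List<String> example() {
        Map<String, List<String>> shard1 = new LinkedHashMap<>();
        Map<String, List<String>> shard2 = new LinkedHashMap<>();
        List<String> s = new ArrayList<>();
        s.add("paul u soles");
        shard2.put("paulusoles", s);

        List<List<String>> responses = new ArrayList<>();
        responses.add(process(shard1, "paulusoles"));
        responses.add(process(shard2, "paulusoles"));
        return finishStage(responses);
    }
}
```

In this model the merged result contains the single suggestion from the second shard, which is what one would hope to see from the distributed request as well.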

The code you found in SearchHandler is what coordinates all of these activities.  
It is very complicated code, but honestly you probably will not need to 
understand it to fix this.

What you probably will find is that each shard's process() returns the correct 
result, just as you get with your hand-done testing.  But somehow finishStage() 
does not properly combine the responses when grouping is involved.  It might be 
that the responses come back just a little differently and finishStage() cannot 
cope, or something along those lines.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Friday, March 22, 2013 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Thanks.

I can fix this, but going over the code, it is not easy to figure out where
the whole request and response come from.

I followed SpellCheckComponent#finishStage and found that
SearchHandler#handleRequestBody calls this function.
However, which part calls handleRequestBody, and how its arguments are
constructed, is not clear.


Thanks.
Alex.



-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Fri, Mar 22, 2013 2:08 pm
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Alex,

I added your comments to SOLR-3758 (https://issues.apache.org/jira/browse/SOLR-3758),
which seems to me to be the very same issue.

If you need this to work now and if you cannot devise a fix yourself, then
perhaps a workaround is if the query returns with 0 results, re-issue the query
with "&rows=0&group=false" (you would omit all other optional components also).
This will give you back just a spell check result.  I realize this is not
optimal because it requires the overhead of issuing 2 queries but if you do it
only in instances the user gets nothing (or very little) back maybe it would be
tolerable?  Then once a viable fix is devised you can remove the extra code from
your application.
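A rough application-side sketch of this workaround follows. Everything in it is hypothetical: the Searcher interface stands in for whatever client call actually sends the query, Result is a minimal view of the response, and switching off highlighting with hl=false is just one example of omitting an optional component.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Application-side sketch of the workaround; all names here are hypothetical.
final class SpellcheckFallback {

    // Minimal view of a search response: match count plus an optional collation.
    static final class Result {
        final long matches;
        final String collation; // null when no spellcheck result came back
        Result(long matches, String collation) {
            this.matches = matches;
            this.collation = collation;
        }
    }

    // Stand-in for whatever client call actually sends the query to Solr.
    interface Searcher {
        Result search(Map<String, String> params);
    }

    // Run the normal query; if it matches nothing, ask again for spellcheck only.
    static Result searchWithFallback(Searcher searcher, Map<String, String> params) {
        Result first = searcher.search(params);
        if (first.matches > 0) {
            return first;
        }
        Map<String, String> retry = new LinkedHashMap<>(params);
        retry.put("rows", "0");
        retry.put("group", "false");
        retry.put("hl", "false"); // assumption: highlighting is one optional component to omit
        Result second = searcher.search(retry);
        return new Result(first.matches, second.collation);
    }

    // Fake searcher reproducing the reported behaviour: no suggestions while grouping is on.
    static String demo() {
        Searcher fake = new Searcher() {
            public Result search(Map<String, String> p) {
                boolean grouped = !"false".equals(p.get("group"));
                return new Result(0, grouped ? null : "(paul u soles)");
            }
        };
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "paulusoles");
        params.put("group", "true");
        return searchWithFallback(fake, params).collation;
    }
}
```

The second query is issued only when the first returns no matches, so the extra round trip is paid only in the case where the user would otherwise see nothing.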

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Friday, March 22, 2013 12:53 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,


Further investigation shows the following pattern for both the direct and
wordbreak spellcheckers.

Assume that in all cases there are spellchecker results when distrib=false.

In distributed mode (distrib=true):
  case when matches=0
    1. group=true: no spellcheck results
    2. group=false: there are spellcheck results

  case when matches>0
    1. group=true: there are spellcheck results
    2. group=false: there are spellcheck results
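
The pattern above can be written down as a small table-driven check, which is roughly the shape a failing test would take. The sketch only encodes the reported observations; ObservedMatrix is a hypothetical class, and the "expected" side rests on the stated assumption that spellcheck results exist in every case when distrib=false.

```java
// Encodes the observations listed above as a predicate; it describes reported behaviour, not Solr's code.
final class ObservedMatrix {

    // True when spellcheck results were observed for the given combination.
    static boolean spellcheckObserved(boolean distrib, boolean group, long matches) {
        if (!distrib) {
            return true; // assumption stated in the post: results exist when distrib=false
        }
        return !(group && matches == 0); // the single failing combination
    }

    // What one would expect instead: grouping should not affect spellcheck output.
    static boolean spellcheckExpected(boolean distrib, boolean group, long matches) {
        return true;
    }

    // Number of distributed combinations where observed differs from expected.
    static int failingCases() {
        int failures = 0;
        boolean[] flags = {true, false};
        long[] matchCounts = {0L, 1L};
        for (boolean group : flags) {
            for (long m : matchCounts) {
                if (spellcheckObserved(true, group, m) != spellcheckExpected(true, group, m)) {
                    failures++;
                }
            }
        }
        return failures;
    }
}
```

Exactly one of the four distributed combinations deviates: group=true with matches=0.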


Do these constitute a failing test case?

Thanks.
Alex.





-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 6:50 pm
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud



Hello,

I am debugging SpellCheckComponent#finishStage.

From the responses I see that not only wordbreak but also the direct spellchecker
fails to return some results in distributed mode.
The request handler I was using had

<str name="group">true</str>


So I decided to turn off grouping, and now I see spellcheck results in distributed
mode.


curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
has no spellcheck results, but

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
returns results.

So the conclusion is that grouping causes the distributed spellchecker to fail.

Could you please point me to the class that may be responsible for this issue?

Thanks.
Alex.





-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 11:23 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


The shard responses get combined in SpellCheckComponent#finishStage.  I highly
recommend you file a JIRA bug report for this at https://issues.apache.org/jira/browse/SOLR.
If you write a failing unit test, it would make it much more likely that
others would help you with a fix.  Of course, if you solve the issue entirely, a
patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So, please let me know which class is
responsible for combining spellcheck results from all shards. I will try to
debug the code.

Thanks in advance.
Alex.







-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

I am not sure what this refers to.

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
 <!-- a spellchecker that use an alternate comparator

         comparatorClass be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests:


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

Without the distrib param:

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.

-----Original Message-----

From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You are likely hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  We also
need an example of a query that succeeds non-distributed (with the exact query
url and output you get) vs the same query url and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker also was not working in
cloud mode. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>
to the /select requestHandler, it worked, but the wordbreak spellchecker still did not.
I have added shards.qt=testhandler to the curl request, but it did not solve the issue.

Thanks.
Alex.







-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just
coming up in /testhandler.  Just wanted to verify that before we get into bug
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
Distributed environment :) .  Ideally, we should probably use a random test for
stuff like this as adding a bunch of test scenarios would make this
already-slower-than-molasses test even slower.  On the other hand, we want to
test as many possibilities as we can.  Based on DSCCT and it being so
superficial, I really can't vouch too much for my spell check enhancements
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com> wrote:

> Can you try including in your request the "shards.qt" parameter?  In your
> case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support
> for a brief discussion.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com]
> Sent: Monday, March 18, 2013 4:07 PM
> To: solr-user@lucene.apache.org
> Subject: strange behaviour of wordbreak spellchecker in solr cloud
>
> Hello,
>
> I try to use wordbreak spellchecker in solr-4.2 with cloud feature. We have
> two server with one shard in each of them.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
>
> does not return any results in spellchecker. However, if I specify
> distrib=false only one of these has spellchecker results.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
>
> no spellcheler results
>
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> returns spellcheker results.
>
>
> My testhandler and select handlers are as follows
>
>
> <requestHandler name="/testhandler" class="solr.SearchHandler" >
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">host^30  content^0.5 title^1.2 </str>
> <str name="pf">site^25 content^10 title^22</str>
> <str name="fl">url,id,title</str>
> <!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
> <str name="mm">3<-1 5<-3 6<90%</str>
> <int name="ps">1</int>
>
> <str name="hl">true</str>
> <str name="hl.fl">content</str>
> <str name="f.content.hl.fragmenter">regex</str>
> <str name="hl.fragsize">165</str>
> <str name="hl.fragmentsBuilder">default</str>
>
>
> <str name="spellcheck.dictionary">direct</str>
> <str name="spellcheck.dictionary">wordbreak</str>
> <str name="spellcheck">on</str>
> <str name="spellcheck.collate">true</str>
> <str name="spellcheck.onlyMorePopular">false</str>
> <str name="spellcheck.count">2</str>
>
> </lst>
>
> <arr name="last-components">
> <str>spellcheck</str>
> </arr>
>
> </requestHandler>
>
>
>  <requestHandler name="/select" class="solr.SearchHandler">
>    <!-- default values for query parameters can be specified, these
>         will be overridden by parameters in the request
>      -->
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <!-- <str name="df">text</str> -->
>     </lst>
>    <!-- In addition to defaults, "appends" params can be specified
>         to identify values which should be appended to the list of
>         multi-val params from the query (or the existing "defaults").
>      -->
>    <!-- In this example, the param "fq=instock:true" would be appended to
>         any query time fq params the user may specify, as a mechanism for
>         partitioning the index, independent of any user selected filtering
>         that may also be desired (perhaps as a result of faceted searching).
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "appends" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="appends">
>         <str name="fq">inStock:true</str>
>       </lst>
>      -->
>    <!-- "invariants" are a way of letting the Solr maintainer lock down
>         the options available to Solr clients.  Any params values
>         specified here are used regardless of what values may be specified
>         in either the query, the "defaults", or the "appends" params.
>
>         In this example, the facet.field and facet.query params would
>         be fixed, limiting the facets clients can use.  Faceting is
>         not turned on by default - but if the client does specify
>         facet=true in the request, these are the only facets they
>         will be able to see counts for; regardless of what other
>         facet.field or facet.query params they may specify.
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "invariants" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="invariants">
>         <str name="facet.field">cat</str>
>         <str name="facet.field">manu_exact</str>
>         <str name="facet.query">price:[* TO 500]</str>
>         <str name="facet.query">price:[500 TO *]</str>
>       </lst>
>      -->
>    <!-- If the default list of SearchComponents is not desired, that
>         list can either be overridden completely, or components can be
>         prepended or appended to the default list.  (see below)
>      -->
>    <!--
>       <arr name="components">
>         <str>nameOfCustomComponent1</str>
>         <str>nameOfCustomComponent2</str>
>       </arr>
>      -->
>       <arr name="last-components">
>         <str>spellcheck</str>
>       </arr>
>    </requestHandler>
>
>
>
> Is this a bug, or does something else have to be done?
>
>
> Thanks.
> Alex.
>

RE: strange behaviour of wordbreak spellchecker in solr cloud

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
Alex,

You may want to move over to the dev list now that you're working on code.  Or, if you would rather not subscribe to the dev list, add yourself as a watcher to SOLR-3758 and comment further there.  This will help us keep track of progress on the issue.

The short answer is that in a distributed setup, SpellCheckComponent (and others) work in 2 phases.  In the first phase, each shard is sent the request almost as if it were a complete (non-distributed) index in its own right.  The difference is that an additional parameter is added to the request indicating that this is the first phase of a distributed request.  SpellCheckComponent uses this knowledge to include additional information in the response that normally wouldn't go out to an end client.  The first phase calls the component's process() method, just as would be done if this were a non-distributed call.

In the second phase, the initiating shard collects the response from all of the shards' process() methods and combines them.  This is where finishStage() is called.  So while process() runs in parallel on all of the shards, finishStage() runs only on the initiating shard, after the various shards have returned their responses.
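
To make the two phases concrete, here is a toy model in Python.  It is only a sketch under stated assumptions: the function names, the "shard_extras" key and the dictionary layout are all made up for illustration and are not Solr's actual classes or response format.  It only shows the shape of the flow: process() runs once per shard, and finish_stage() runs once on the initiating shard to merge what came back.

```python
# Toy model only: hypothetical names, not Solr's actual classes or data structures.

def process(shard_index, query, is_shard_request=False):
    """Phase 1: runs on each shard, much like a non-distributed request."""
    suggestions = list(shard_index.get(query, []))
    response = {"suggestions": suggestions}
    if is_shard_request:
        # Extra detail that an end client would not normally receive.
        response["shard_extras"] = {"candidate_count": len(suggestions)}
    return response


def finish_stage(shard_responses):
    """Phase 2: runs only on the initiating shard and merges shard responses."""
    merged = []
    for response in shard_responses:
        for suggestion in response["suggestions"]:
            if suggestion not in merged:
                merged.append(suggestion)
    return {"suggestions": merged}


# Two shards: the first has nothing for the term, the second has one suggestion.
shards = [{}, {"paulusoles": ["paul u soles"]}]
shard_responses = [process(s, "paulusoles", is_shard_request=True) for s in shards]
merged_response = finish_stage(shard_responses)
print(merged_response)
```

In this model a suggestion found on only one shard still reaches the client, which is the behaviour one would expect from the real merge step as well.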

The code you found in SearchHandler is what coordinates all of these activities.  It is very complicated code, but honestly you probably will not need to understand it to fix this.

What you probably will find is that each shard's process() returns the correct result, just as you get with your hand-done testing.  But somehow finishStage() does not properly combine the responses when grouping is involved.  It might be that the responses come back just a little differently and finishStage() cannot cope, or something along those lines.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Friday, March 22, 2013 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Thanks.

I can fix this, but going over the code, it is not easy to figure out where the whole request and response come from.

I traced SpellCheckComponent#finishStage and found that SearchHandler#handleRequestBody calls it. However, it is not clear which part calls handleRequestBody or how its arguments are constructed.


Thanks.
Alex.



-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Fri, Mar 22, 2013 2:08 pm
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Alex,

I added your comments to SOLR-3758 (https://issues.apache.org/jira/browse/SOLR-3758),
which seems to me to be the very same issue.

If you need this to work now and if you cannot devise a fix yourself, then
perhaps a workaround is if the query returns with 0 results, re-issue the query
with "&rows=0&group=false" (you would omit all other optional components also).
This will give you back just a spell check result.  I realize this is not
optimal because it requires the overhead of issuing 2 queries but if you do it
only in instances the user gets nothing (or very little) back maybe it would be
tolerable?  Then once a viable fix is devised you can remove the extra code from
your application.
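
A minimal sketch of that workaround on the application side, in Python.  The helper names and the shape of the parsed response (a dict with a "matches" count and a "spellcheck" entry) are assumptions made for illustration; only the "rows=0" and "group=false" parameters come from the suggestion above.  The HTTP call is replaced by a stand-in so the sketch runs without a Solr server.

```python
from urllib.parse import urlencode


def fallback_params(params):
    """Copy the original parameters, asking only for a spell check result."""
    fallback = dict(params)
    fallback["rows"] = "0"
    fallback["group"] = "false"
    return fallback


def search_with_spellcheck_fallback(fetch, base_url, params):
    """Issue the query; if nothing matched, re-issue it for spellcheck only.

    `fetch` is a hypothetical callable: it takes a URL and returns a dict with
    a "matches" count and an optional "spellcheck" entry.
    """
    first = fetch(base_url + "?" + urlencode(params))
    if first["matches"] > 0:
        return first
    second = fetch(base_url + "?" + urlencode(fallback_params(params)))
    merged = dict(first)
    merged["spellcheck"] = second.get("spellcheck")
    return merged


# Stand-in for the HTTP call, so the sketch runs without a Solr server.
requested_urls = []


def fake_fetch(url):
    requested_urls.append(url)
    if "group=false" in url:
        return {"matches": 0, "spellcheck": ["paul u soles"]}
    return {"matches": 0, "spellcheck": None}


result = search_with_spellcheck_fallback(
    fake_fetch,
    "http://server1:8983/solr/test/testhandler",
    {"q": "paulusoles", "rows": "10", "shards.qt": "testhandler"},
)
print(result["spellcheck"])
```

The second request is only sent when the first one matched nothing, so the extra round trip is limited to the queries that need a suggestion.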

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Friday, March 22, 2013 12:53 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,


Further investigation shows the following pattern for both the DirectSolrSpellChecker and
wordbreak spellcheckers.

Assume that in all cases there are spellchecker results when distrib=false.

In distributed mode (distrib=true):
  case when matches=0
    1. group=true, no spellcheck results
    2. group=false, there are spellcheck results

  case when matches>0
    1. group=true, there are spellcheck results
    2. group=false, there are spellcheck results
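
The four observations above can be written down as a small table, which is roughly the shape a parameterized test would take.  This Python sketch only encodes the reported results; the variable names are made up for illustration, and it does not call Solr.

```python
# Observations reported above for distrib=true.
# Key: (matches > 0, group); value: whether spellcheck results came back.
observed = {
    (False, True): False,
    (False, False): True,
    (True, True): True,
    (True, False): True,
}

# Expected: suggestions in every case, since distrib=false returns them.
failing_cases = [case for case, has_results in observed.items() if not has_results]
print(failing_cases)
```

Only the combination of zero matches with group=true fails, so a unit test would need to cover exactly that case.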


Do these constitute a failing test case?

Thanks.
Alex.





-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 6:50 pm
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud



Hello,

I am debugging SpellCheckComponent#finishStage.

From the responses I see that not only wordbreak, but also the direct spellchecker
does not return some results in distributed mode.
The request handler I was using had

<str name="group">true</str>


So, I decided to turn off grouping, and now I see spellcheck results in distributed
mode.


curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
has no spellcheck results
but

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
returns results.

So, the conclusion is that grouping causes the distributed spellchecker to fail.

Could you please point me to the class that may be responsible for this issue?

Thanks.
Alex.





-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 11:23 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


The shard responses get combined in SpellCheckComponent#finishStage.  I highly
recommend you file a JIRA bug report for this at https://issues.apache.org/jira/browse/SOLR .
If you write a failing unit test, it would make it much more likely that
others would help you with a fix.  Of course, if you solve the issue entirely, a
patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So, please let me know which class is
responsible for combining spellcheck results from all shards. I will try to
debug the code.

Thanks in advance.
Alex.







-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

Not sure what this is?

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal
levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck
suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2
-->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be
considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler
below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
 <!-- a spellchecker that use an alternate comparator

         comparatorClass be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

No distrib param

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You may well be hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  We also
need an example of a query that succeeds non-distributed (with the exact query
url and output you get) vs the same query url and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker also was not working in
cloud mode. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>
to the /select requestHandler, it worked, but the wordbreak spellchecker still does not. I have added
shards.qt=testhandler to the curl request, but it did not solve the issue.

Thanks.
Alex.







-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just
coming up in /testhandler.  Just wanted to verify that before we get into bug
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
distributed environment :) .  Ideally, we should probably use a random test for
stuff like this as adding a bunch of test scenarios would make this
already-slower-than-molasses test even slower.  On the other hand, we want to
test as many possibilities as we can.  Based on DSCCT and it being so
superficial, I really can't vouch too much for my spell check enhancements
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com>
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your
> case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support
> for a brief discussion.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com]
> Sent: Monday, March 18, 2013 4:07 PM
> To: solr-user@lucene.apache.org
> Subject: strange behaviour of wordbreak spellchecker in solr cloud
>
> Hello,
>
> I am trying to use the wordbreak spellchecker in solr-4.2 with the cloud
> feature. We have two servers with one shard on each of them.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
>
> do not return any spellchecker results. However, if I specify
> distrib=false, only one of them returns spellchecker results.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
>
> returns no spellchecker results, while
>
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> returns spellchecker results.
>
>
> My testhandler and select handlers are as follows
>
>
> <requestHandler name="/testhandler" class="solr.SearchHandler" >
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">host^30  content^0.5 title^1.2 </str>
> <str name="pf">site^25 content^10 title^22</str>
> <str name="fl">url,id,title</str>
> <!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
> <str name="mm">3<-1 5<-3 6<90%</str>
> <int name="ps">1</int>
>
> <str name="hl">true</str>
> <str name="hl.fl">content</str>
> <str name="f.content.hl.fragmenter">regex</str>
> <str name="hl.fragsize">165</str>
> <str name="hl.fragmentsBuilder">default</str>
>
>
> <str name="spellcheck.dictionary">direct</str>
> <str name="spellcheck.dictionary">wordbreak</str>
> <str name="spellcheck">on</str>
> <str name="spellcheck.collate">true</str>
> <str name="spellcheck.onlyMorePopular">false</str>
> <str name="spellcheck.count">2</str>
>
> </lst>
>
> <arr name="last-components">
> <str>spellcheck</str>
> </arr>
>
> </requestHandler>
>
>
>  <requestHandler name="/select" class="solr.SearchHandler">
>    <!-- default values for query parameters can be specified, these
>         will be overridden by parameters in the request
>      -->
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <!-- <str name="df">text</str> -->
>     </lst>
>    <!-- In addition to defaults, "appends" params can be specified
>         to identify values which should be appended to the list of
>         multi-val params from the query (or the existing "defaults").
>      -->
>    <!-- In this example, the param "fq=instock:true" would be appended to
>         any query time fq params the user may specify, as a mechanism for
>         partitioning the index, independent of any user selected filtering
>         that may also be desired (perhaps as a result of faceted searching).
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "appends" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="appends">
>         <str name="fq">inStock:true</str>
>       </lst>
>      -->
>    <!-- "invariants" are a way of letting the Solr maintainer lock down
>         the options available to Solr clients.  Any params values
>         specified here are used regardless of what values may be specified
>         in either the query, the "defaults", or the "appends" params.
>
>         In this example, the facet.field and facet.query params would
>         be fixed, limiting the facets clients can use.  Faceting is
>         not turned on by default - but if the client does specify
>         facet=true in the request, these are the only facets they
>         will be able to see counts for; regardless of what other
>         facet.field or facet.query params they may specify.
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "invariants" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="invariants">
>         <str name="facet.field">cat</str>
>         <str name="facet.field">manu_exact</str>
>         <str name="facet.query">price:[* TO 500]</str>
>         <str name="facet.query">price:[500 TO *]</str>
>       </lst>
>      -->
>    <!-- If the default list of SearchComponents is not desired, that
>         list can either be overridden completely, or components can be
>         prepended or appended to the default list.  (see below)
>      -->
>    <!--
>       <arr name="components">
>         <str>nameOfCustomComponent1</str>
>         <str>nameOfCustomComponent2</str>
>       </arr>
>      -->
>       <arr name="last-components">
>         <str>spellcheck</str>
>       </arr>
>    </requestHandler>
>
>
>
> Is this a bug, or does something else have to be done?
>
>
> Thanks.
> Alex.
>

Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
Thanks.

I can fix this, but going over the code, it is not easy to figure out where the whole request and response come from.

I traced SpellCheckComponent#finishStage and found that SearchHandler#handleRequestBody calls it. However, it is not clear which part calls handleRequestBody or how its arguments are constructed.


Thanks.
Alex.

 

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Fri, Mar 22, 2013 2:08 pm
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Alex,

I added your comments to SOLR-3758 (https://issues.apache.org/jira/browse/SOLR-3758),
which seems to me to be the very same issue.

If you need this to work now and if you cannot devise a fix yourself, then 
perhaps a workaround is if the query returns with 0 results, re-issue the query 
with "&rows=0&group=false" (you would omit all other optional components also).  
This will give you back just a spell check result.  I realize this is not 
optimal because it requires the overhead of issuing 2 queries but if you do it 
only in instances the user gets nothing (or very little) back maybe it would be 
tolerable?  Then once a viable fix is devised you can remove the extra code from 
your application.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Friday, March 22, 2013 12:53 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,


Further investigation shows the following pattern for both the DirectSolrSpellChecker and
wordbreak spellcheckers.

Assume that in all cases there are spellchecker results when distrib=false.

In distributed mode (distrib=true):
  case when matches=0
    1. group=true, no spellcheck results
    2. group=false, there are spellcheck results

  case when matches>0
    1. group=true, there are spellcheck results
    2. group=false, there are spellcheck results


Do these constitute a failing test case?

Thanks.
Alex.





-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 6:50 pm
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud



Hello,

I am debugging SpellCheckComponent#finishStage.

From the responses I see that not only wordbreak, but also the direct spellchecker
does not return some results in distributed mode.
The request handler I was using had

<str name="group">true</str>


So, I decided to turn off grouping, and now I see spellcheck results in distributed
mode.


curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
has no spellcheck results
but

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
returns results.

So, the conclusion is that grouping causes the distributed spellchecker to fail.

Could you please point me to the class that may be responsible for this issue?

Thanks.
Alex.





-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 11:23 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


The shard responses get combined in SpellCheckComponent#finishStage.  I highly
recommend you file a JIRA bug report for this at https://issues.apache.org/jira/browse/SOLR .
If you write a failing unit test, it would make it much more likely that
others would help you with a fix.  Of course, if you solve the issue entirely, a
patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So, please let me know which class is
responsible for combining spellcheck results from all shards. I will try to
debug the code.

Thanks in advance.
Alex.







-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

Not sure what this is?

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal
levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck
suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2
-->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be
considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler
below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
 <!-- a spellchecker that use an alternate comparator

         comparatorClass be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests:


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

No distrib param

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>
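
To compare the shard responses above without reading the XML by eye, something like the following might help. It is only a sketch: it assumes the response layout shown in this thread (a "spellcheck" list containing a "suggestions" list), and the function name spellcheck_suggestions is made up for illustration.

```python
import xml.etree.ElementTree as ET


def spellcheck_suggestions(response_xml):
    """Return {misspelled_term: [suggestion, ...]} from a Solr XML response.

    Assumes the layout seen in this thread:
    <response><lst name="spellcheck"><lst name="suggestions">
      <lst name="TERM"><arr name="suggestion"><str>...</str></arr></lst>
    """
    root = ET.fromstring(response_xml)
    result = {}
    # Only <lst> children of "suggestions" are per-term entries; the
    # <str name="collation"> sibling is skipped by this path.
    path = "./lst[@name='spellcheck']/lst[@name='suggestions']/lst"
    for term in root.findall(path):
        words = [s.text for s in term.findall("./arr[@name='suggestion']/str")]
        result[term.get("name")] = words
    return result
```

An empty dict then corresponds to the `<lst name="suggestions"/>` case, which makes the distrib=false and distributed outputs easy to diff.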

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You are likely hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  We also
need an example of a query that succeeds non-distributed (with the exact query
url and output you get) vs the same query url and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker also was not working in
cloud mode. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>

to the /select requestHandler, it worked, but the wordbreak spellchecker still does not. I have added
shards.qt=testhandler to the curl request, but it did not solve the issue.
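
For reference, the requests being compared in this thread differ only in the shards.qt and distrib parameters, so they can be generated rather than typed by hand. A rough sketch, assuming the host, core and handler names used in the curl commands here; the helper name solr_url is hypothetical:

```python
from urllib.parse import urlencode


def solr_url(host, core, handler, query, shards_qt=None, distrib=None, rows=10):
    """Build a Solr request URL like the curl examples in this thread.

    shards_qt and distrib are only added when given, so the same helper
    produces both the plain and the shards.qt / distrib=false variants.
    """
    params = [("q", query), ("indent", "true"), ("rows", rows)]
    if shards_qt is not None:
        params.append(("shards.qt", shards_qt))
    if distrib is not None:
        params.append(("distrib", "true" if distrib else "false"))
    return "http://%s/solr/%s/%s?%s" % (host, core, handler, urlencode(params))
```

For example, solr_url("server1:8983", "test", "testhandler", "paulusoles", shards_qt="testhandler") gives the distributed request, and adding distrib=False gives the single-shard one.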

Thanks.
Alex.







-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just
coming up in /testhandler.  Just wanted to verify that before we get into bug
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
distributed environment :) .  Ideally, we should probably use a random test for
stuff like this as adding a bunch of test scenarios would make this
already-slower-than-molasses test even slower.  On the other hand, we want to
test as many possibilities as we can.  Based on DSCCT and it being so
superficial, I really can't vouch too much for my spell check enhancements
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com>
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your
> case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support
> for a brief discussion.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com]
> Sent: Monday, March 18, 2013 4:07 PM
> To: solr-user@lucene.apache.org
> Subject: strange behaviour of wordbreak spellchecker in solr cloud
>
> Hello,
>
> I am trying to use the wordbreak spellchecker in solr-4.2 with the cloud feature. We have
> two servers with one shard on each of them.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
>
> do not return any results in the spellchecker. However, if I specify
> distrib=false, only one of these has spellchecker results.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
>
> returns no spellchecker results, while
>
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> returns spellchecker results.
>
>
> My testhandler and select handlers are as follows
>
>
> <requestHandler name="/testhandler" class="solr.SearchHandler" >
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">host^30  content^0.5 title^1.2 </str>
> <str name="pf">site^25 content^10 title^22</str>
> <str name="fl">url,id,title</str>
> <!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
> <str name="mm">3<-1 5<-3 6<90%</str>
> <int name="ps">1</int>
>
> <str name="hl">true</str>
> <str name="hl.fl">content</str>
> <str name="f.content.hl.fragmenter">regex</str>
> <str name="hl.fragsize">165</str>
> <str name="hl.fragmentsBuilder">default</str>
>
>
> <str name="spellcheck.dictionary">direct</str>
> <str name="spellcheck.dictionary">wordbreak</str>
> <str name="spellcheck">on</str>
> <str name="spellcheck.collate">true</str>
> <str name="spellcheck.onlyMorePopular">false</str>
> <str name="spellcheck.count">2</str>
>
> </lst>
>
> <arr name="last-components">
> <str>spellcheck</str>
> </arr>
>
> </requestHandler>
>
>
>  <requestHandler name="/select" class="solr.SearchHandler">
>    <!-- default values for query parameters can be specified, these
>         will be overridden by parameters in the request
>      -->
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <!-- <str name="df">text</str> -->
>     </lst>
>    <!-- In addition to defaults, "appends" params can be specified
>         to identify values which should be appended to the list of
>         multi-val params from the query (or the existing "defaults").
>      -->
>    <!-- In this example, the param "fq=instock:true" would be appended to
>         any query time fq params the user may specify, as a mechanism for
>         partitioning the index, independent of any user selected filtering
>         that may also be desired (perhaps as a result of faceted searching).
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "appends" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="appends">
>         <str name="fq">inStock:true</str>
>       </lst>
>      -->
>    <!-- "invariants" are a way of letting the Solr maintainer lock down
>         the options available to Solr clients.  Any params values
>         specified here are used regardless of what values may be specified
>         in either the query, the "defaults", or the "appends" params.
>
>         In this example, the facet.field and facet.query params would
>         be fixed, limiting the facets clients can use.  Faceting is
>         not turned on by default - but if the client does specify
>         facet=true in the request, these are the only facets they
>         will be able to see counts for; regardless of what other
>         facet.field or facet.query params they may specify.
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "invariants" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="invariants">
>         <str name="facet.field">cat</str>
>         <str name="facet.field">manu_exact</str>
>         <str name="facet.query">price:[* TO 500]</str>
>         <str name="facet.query">price:[500 TO *]</str>
>       </lst>
>      -->
>    <!-- If the default list of SearchComponents is not desired, that
>         list can either be overridden completely, or components can be
>         prepended or appended to the default list.  (see below)
>      -->
>    <!--
>       <arr name="components">
>         <str>nameOfCustomComponent1</str>
>         <str>nameOfCustomComponent2</str>
>       </arr>
>      -->
>       <arr name="last-components">
>         <str>spellcheck</str>
>       </arr>
>    </requestHandler>
>
>
>
> Is this a bug, or does something else have to be done?
>
>
> Thanks.
> Alex.
>

RE: strange behaviour of wordbreak spellchecker in solr cloud

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
Alex,

I added your comments to SOLR-3758 (https://issues.apache.org/jira/browse/SOLR-3758), which seems to me to be the very same issue.

If you need this to work now and cannot devise a fix yourself, then a possible workaround is this: if the query returns 0 results, re-issue it with "&rows=0&group=false" (you would also omit all other optional components).  This will give you back just a spell check result.  I realize this is not optimal because it requires the overhead of issuing 2 queries, but if you do it only when the user gets nothing (or very little) back, maybe it would be tolerable?  Then, once a viable fix is devised, you can remove the extra code from your application.
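
On the application side, the two-query workaround could look roughly like the sketch below. Everything here is an assumption for illustration: fetch is a caller-supplied function that takes a URL and returns a dict with "matches" and "suggestions" keys, and the names build_url and search_with_spellcheck_fallback are made up. Omitting the other optional components is left to the caller's handler configuration.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Append URL-encoded query parameters to a handler URL."""
    return base + "?" + urlencode(params)


def search_with_spellcheck_fallback(fetch, base, query, rows=10):
    """Run the normal query; if it has no matches and no suggestions,
    re-issue it with rows=0&group=false just to get spell check output.

    fetch(url) is assumed to return {"matches": int, "suggestions": dict}.
    """
    first = fetch(build_url(base, {"q": query, "rows": rows}))
    if first["matches"] > 0 or first["suggestions"]:
        return first  # nothing to work around
    # Second request: no documents, grouping off, spell check only.
    retry = fetch(build_url(base, {"q": query, "rows": 0, "group": "false"}))
    merged = dict(first)
    merged["suggestions"] = retry["suggestions"]
    return merged
```

The second request is only sent in the zero-result case, so the extra overhead is limited to queries that would otherwise give the user nothing.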

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Friday, March 22, 2013 12:53 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,


Further investigation shows the following pattern for both the DirectIndex and wordbreak spellcheckers.

Assume that in all cases there are spellchecker results when distrib=false.

In distributed mode (distrib=true):
  case when matches=0
    1. group=true, no spellcheck results
    2. group=false, there are spellcheck results

  case when matches>0
    1. group=true, there are spellcheck results
    2. group=false, there are spellcheck results


Do these constitute a failing test case?

Thanks.
Alex.





-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 6:50 pm
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud



Hello,

I am debugging the SpellCheckComponent#finishStage.

From the responses I see that not only wordbreak, but also directSpellchecker
does not return some results in distributed mode.
The request handler I was using had

<str name="group">true</str>


So, I decided to turn off grouping, and now I see spellcheck results in distributed
mode.


curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
has no spellcheck results
but

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
returns results.

So, the conclusion is that grouping causes the distributed spellchecker to fail.

Could you please point me to the class that may be responsible for this issue?

Thanks.
Alex.





-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 11:23 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


The shard responses get combined in SpellCheckComponent#finishStage.  I highly
recommend you file a JIRA bug report for this (https://issues.apache.org/jira/browse/SOLR).
If you write a failing unit test, it would make it much more likely that
others would help you with a fix.  Of course, if you solve the issue entirely, a
patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So, please let me know which class is
responsible for combining spellcheck results from all shards. I will try to
debug the code.

Thanks in advance.
Alex.








Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
Hello,


Further investigation shows the following pattern, for both DirectIndex and wordbreak spellchekers.

Assume that in all cases there are spellcheck results when distrib=false.

In distributed mode (distrib=true):
  case when matches=0
    1. group=true, no spellcheck results
    2. group=false, there are spellcheck results

  case when matches>0
    1. group=true, there are spellcheck results
    2. group=false, there are spellcheck results
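
For anyone who wants to reproduce the matrix above, here is a rough sketch of a script that walks the four group/distrib combinations. The host, core, handler and query are the ones used elsewhere in this thread; build_url is only a helper name made up for illustration, and the curl call is left commented out so nothing is sent by accident. To hit the matches>0 case, pass a query that actually matches documents in your index as the first argument.

```shell
#!/bin/sh
# Sketch only: host, core and handler come from the requests in this thread;
# build_url is a hypothetical helper, not part of Solr.

build_url() {
  # $1 = host, $2 = query, $3 = group (true|false), $4 = distrib (true|false)
  printf '%s:8983/solr/test/testhandler?q=%s&indent=true&rows=10&shards.qt=testhandler&group=%s&distrib=%s\n' \
    "$1" "$2" "$3" "$4"
}

# paulusoles is the query from this thread that gives matches=0
query=${1:-paulusoles}

for group in true false; do
  for distrib in true false; do
    url=$(build_url server1 "$query" "$group" "$distrib")
    echo "group=$group distrib=$distrib"
    echo "  $url"
    # Uncomment to send the request and count suggestion arrays in the reply:
    # curl -s "$url" | grep -c '<arr name="suggestion">'
  done
done
```

Counting suggestion arrays is only a crude signal, but it should be enough to see which combinations come back empty.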


Do these constitute a failing test case?

Thanks.
Alex.

 

 

-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 6:50 pm
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud



Hello,

I am debugging the SpellCheckComponent#finishStage. 
 
From the responses I see that not only the wordbreak spellchecker but also the direct spellchecker 
fails to return some results in distributed mode. 
The request handler I was using had 

<str name="group">true</str>


So I decided to turn off grouping, and now I see spellcheck results in distributed 
mode.


curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
has no spellcheck results 
but

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
returns results.

So, the conclusion is that grouping causes the distributed spellchecker to fail.

Could you please point me to the class that may be responsible for this issue?

Thanks.
Alex.
 




-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 11:23 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


The shard responses get combined in SpellCheckComponent#finishStage.  I highly 
recommend you file a JIRA bug report for this (https://issues.apache.org/jira/browse/SOLR).  
If you write a failing unit test, it would make it much more likely that 
others would help you with a fix.  Of course, if you solve the issue entirely, a 
patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So, please let me know which class is 
responsible for combining spellcheck results from all shards. I will try to 
debug the code.

Thanks in advance.
Alex.







-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

Not sure what this is?

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal
levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck
suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2
-->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be
considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler
below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
    <!-- a spellchecker that uses an alternate comparator

         comparatorClass can be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
       </lst>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests:


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

No distrib param

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You may well be hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  Also
need an example of a query that succeeds non-distributed (with the exact query
url and output you get) vs the same query url and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker was also not working in
cloud mode. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>

to the /select requestHandler, the direct spellchecker worked, but the wordbreak
spellchecker still did not. I have added shards.qt=testhandler to the curl request,
but it did not solve the issue.

Thanks.
Alex.







-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just
coming up in /testhandler.  Just wanted to verify that before we get into bug
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
distributed environment :) .  Ideally, we should probably use a random test for
stuff like this as adding a bunch of test scenarios would make this
already-slower-than-molasses test even slower.  On the other hand, we want to
test as many possibilities as we can.  Based on DSCCT and it being so
superficial, I really can't vouch too much for my spell check enhancements
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com>
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your
> case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support
> for a brief discussion.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com]
> Sent: Monday, March 18, 2013 4:07 PM
> To: solr-user@lucene.apache.org
> Subject: strange behaviour of wordbreak spellchecker in solr cloud
>
> Hello,
>
> I am trying to use the wordbreak spellchecker in solr-4.2 with the cloud feature. We have
> two servers with one shard on each of them.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
>
> do not return any spellchecker results. However, if I specify
> distrib=false, only one of these has spellchecker results.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
>
> no spellchecker results
>
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> returns spellchecker results.
>
>
> My testhandler and select handlers are as follows
>
>
> <requestHandler name="/testhandler" class="solr.SearchHandler" >
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">host^30  content^0.5 title^1.2 </str>
> <str name="pf">site^25 content^10 title^22</str>
> <str name="fl">url,id,title</str>
> <!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
> <str name="mm">3<-1 5<-3 6<90%</str>
> <int name="ps">1</int>
>
> <str name="hl">true</str>
> <str name="hl.fl">content</str>
> <str name="f.content.hl.fragmenter">regex</str>
> <str name="hl.fragsize">165</str>
> <str name="hl.fragmentsBuilder">default</str>
>
>
> <str name="spellcheck.dictionary">direct</str>
> <str name="spellcheck.dictionary">wordbreak</str>
> <str name="spellcheck">on</str>
> <str name="spellcheck.collate">true</str>
> <str name="spellcheck.onlyMorePopular">false</str>
> <str name="spellcheck.count">2</str>
>
> </lst>
>
> <arr name="last-components">
> <str>spellcheck</str>
> </arr>
>
> </requestHandler>
>
>
>  <requestHandler name="/select" class="solr.SearchHandler">
>    <!-- default values for query parameters can be specified, these
>         will be overridden by parameters in the request
>      -->
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <!-- <str name="df">text</str> -->
>     </lst>
>    <!-- In addition to defaults, "appends" params can be specified
>         to identify values which should be appended to the list of
>         multi-val params from the query (or the existing "defaults").
>      -->
>    <!-- In this example, the param "fq=instock:true" would be appended to
>         any query time fq params the user may specify, as a mechanism for
>         partitioning the index, independent of any user selected filtering
>         that may also be desired (perhaps as a result of faceted searching).
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "appends" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="appends">
>         <str name="fq">inStock:true</str>
>       </lst>
>      -->
>    <!-- "invariants" are a way of letting the Solr maintainer lock down
>         the options available to Solr clients.  Any params values
>         specified here are used regardless of what values may be specified
>         in either the query, the "defaults", or the "appends" params.
>
>         In this example, the facet.field and facet.query params would
>         be fixed, limiting the facets clients can use.  Faceting is
>         not turned on by default - but if the client does specify
>         facet=true in the request, these are the only facets they
>         will be able to see counts for; regardless of what other
>         facet.field or facet.query params they may specify.
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "invariants" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="invariants">
>         <str name="facet.field">cat</str>
>         <str name="facet.field">manu_exact</str>
>         <str name="facet.query">price:[* TO 500]</str>
>         <str name="facet.query">price:[500 TO *]</str>
>       </lst>
>      -->
>    <!-- If the default list of SearchComponents is not desired, that
>         list can either be overridden completely, or components can be
>         prepended or appended to the default list.  (see below)
>      -->
>    <!--
>       <arr name="components">
>         <str>nameOfCustomComponent1</str>
>         <str>nameOfCustomComponent2</str>
>       </arr>
>      -->
>       <arr name="last-components">
>         <str>spellcheck</str>
>       </arr>
>    </requestHandler>
>
>
>
> Is this a bug, or does something else have to be done?
>
>
> Thanks.
> Alex.
>













 

 

Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
Hello,

I am debugging the SpellCheckComponent#finishStage. 
 
From the responses I see that not only the wordbreak spellchecker but also the direct spellchecker fails to return some results in distributed mode. 
The request handler I was using had 

<str name="group">true</str>


So I decided to turn off grouping, and now I see spellcheck results in distributed mode.


curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
has no spellcheck results 
but

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
returns results.

So, the conclusion is that grouping causes the distributed spellchecker to fail.
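
A quick way to compare the two requests is to count the suggestion arrays in each response. The sketch below assumes the XML response format shown in this thread; count_suggestions is a made-up helper name, and the curl lines are commented out so the snippet does not need a running server.

```shell
#!/bin/sh
# Hypothetical helper: reads a Solr XML response on stdin and prints how many
# lines contain an <arr name="suggestion"> element (0 means no suggestions).
count_suggestions() {
  # grep -c exits non-zero when the count is 0, so mask that status
  grep -c '<arr name="suggestion">' || true
}

# Request taken from this thread; the handler has group=true in its defaults
base='server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'

# Uncomment to compare grouping on (handler default) against grouping off:
# curl -s "$base"             | count_suggestions
# curl -s "$base&group=false" | count_suggestions
```

Given the behaviour described above, I would expect the first call to print 0 and the second a non-zero count.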

Could you please point me to the class that may be responsible for this issue?

Thanks.
Alex.
 




-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Mar 21, 2013 11:23 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


The shard responses get combined in SpellCheckComponent#finishStage.  I highly 
recommend you file a JIRA bug report for this (https://issues.apache.org/jira/browse/SOLR).  
If you write a failing unit test, it would make it much more likely that 
others would help you with a fix.  Of course, if you solve the issue entirely, a 
patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So, please let me know which class is 
responsible for combining spellcheck results from all shards. I will try to 
debug the code.

Thanks in advance.
Alex.







-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

Not sure what this is?

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal
levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck
suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2
-->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be
considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler
below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
    <!-- a spellchecker that uses an alternate comparator

         comparatorClass can be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
       </lst>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests:


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

No distrib param

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You may well be hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  Also
need an example of a query that succeeds non-distributed (with the exact query
url and output you get) vs the same query url and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker was also not working in
cloud mode. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>

to the /select requestHandler, the direct spellchecker worked, but the wordbreak
spellchecker still did not. I have added shards.qt=testhandler to the curl request,
but it did not solve the issue.

Thanks.
Alex.







-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just
coming up in /testhandler.  Just wanted to verify that before we get into bug
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
distributed environment :) .  Ideally, we should probably use a random test for
stuff like this as adding a bunch of test scenarios would make this
already-slower-than-molasses test even slower.  On the other hand, we want to
test as many possibilities as we can.  Based on DSCCT and it being so
superficial, I really can't vouch too much for my spell check enhancements
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com>
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your
> case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support
> for a brief discussion.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com]
> Sent: Monday, March 18, 2013 4:07 PM
> To: solr-user@lucene.apache.org
> Subject: strange behaviour of wordbreak spellchecker in solr cloud
>
> Hello,
>
> I am trying to use the wordbreak spellchecker in solr-4.2 with the cloud feature. We have
> two servers with one shard on each of them.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
>
> do not return any spellchecker results. However, if I specify
> distrib=false, only one of these has spellchecker results.
>
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
>
> no spellchecker results
>
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> returns spellchecker results.
>
>
> My testhandler and select handlers are as follows
>
>
> <requestHandler name="/testhandler" class="solr.SearchHandler" >
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">host^30  content^0.5 title^1.2 </str>
> <str name="pf">site^25 content^10 title^22</str>
> <str name="fl">url,id,title</str>
> <!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
> <str name="mm">3<-1 5<-3 6<90%</str>
> <int name="ps">1</int>
>
> <str name="hl">true</str>
> <str name="hl.fl">content</str>
> <str name="f.content.hl.fragmenter">regex</str>
> <str name="hl.fragsize">165</str>
> <str name="hl.fragmentsBuilder">default</str>
>
>
> <str name="spellcheck.dictionary">direct</str>
> <str name="spellcheck.dictionary">wordbreak</str>
> <str name="spellcheck">on</str>
> <str name="spellcheck.collate">true</str>
> <str name="spellcheck.onlyMorePopular">false</str>
> <str name="spellcheck.count">2</str>
>
> </lst>
>
> <arr name="last-components">
> <str>spellcheck</str>
> </arr>
>
> </requestHandler>
>
>
>  <requestHandler name="/select" class="solr.SearchHandler">
>    <!-- default values for query parameters can be specified, these
>         will be overridden by parameters in the request
>      -->
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <!-- <str name="df">text</str> -->
>     </lst>
>    <!-- In addition to defaults, "appends" params can be specified
>         to identify values which should be appended to the list of
>         multi-val params from the query (or the existing "defaults").
>      -->
>    <!-- In this example, the param "fq=instock:true" would be appended to
>         any query time fq params the user may specify, as a mechanism for
>         partitioning the index, independent of any user selected filtering
>         that may also be desired (perhaps as a result of faceted searching).
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "appends" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="appends">
>         <str name="fq">inStock:true</str>
>       </lst>
>      -->
>    <!-- "invariants" are a way of letting the Solr maintainer lock down
>         the options available to Solr clients.  Any params values
>         specified here are used regardless of what values may be specified
>         in either the query, the "defaults", or the "appends" params.
>
>         In this example, the facet.field and facet.query params would
>         be fixed, limiting the facets clients can use.  Faceting is
>         not turned on by default - but if the client does specify
>         facet=true in the request, these are the only facets they
>         will be able to see counts for; regardless of what other
>         facet.field or facet.query params they may specify.
>
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "invariants" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="invariants">
>         <str name="facet.field">cat</str>
>         <str name="facet.field">manu_exact</str>
>         <str name="facet.query">price:[* TO 500]</str>
>         <str name="facet.query">price:[500 TO *]</str>
>       </lst>
>      -->
>    <!-- If the default list of SearchComponents is not desired, that
>         list can either be overridden completely, or components can be
>         prepended or appended to the default list.  (see below)
>      -->
>    <!--
>       <arr name="components">
>         <str>nameOfCustomComponent1</str>
>         <str>nameOfCustomComponent2</str>
>       </arr>
>      -->
>       <arr name="last-components">
>         <str>spellcheck</str>
>       </arr>
>    </requestHandler>
>
>
>
> Is this a bug, or does something else have to be done?
>
>
> Thanks.
> Alex.
>

RE: strange behaviour of wordbreak spellchecker in solr cloud

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
The shard responses get combined in SpellCheckComponent#finishStage.  I highly recommend that you file a JIRA bug report for this at https://issues.apache.org/jira/browse/SOLR.  If you write a failing unit test, it is much more likely that others will help you with a fix.  Of course, if you solve the issue entirely, a patch would be much appreciated.

James Dyer
Ingram Content Group
(615) 213-4311
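
As a starting point for such a bug report, the lost suggestions can be made visible with a small script. This is only a sketch: the helper name count_suggestions is made up for illustration and is not part of Solr, and it assumes the indented XML response format shown elsewhere in this thread, where each corrected term carries one <arr name="suggestion"> element on its own line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): read a Solr XML response on stdin
# and print how many lines open a suggestion array. An empty
# <lst name="suggestions"/> element prints 0.
count_suggestions() {
  # grep -c prints the count itself; "|| true" only hides the non-zero
  # exit status grep returns when nothing matched.
  grep -c '<arr name="suggestion">' || true
}

# Usage against the two nodes from this thread (needs the running cluster):
#   for host in server1 server2; do
#     for extra in '&distrib=false' ''; do
#       url="http://$host:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler$extra"
#       echo "$host [$extra]: $(curl -s "$url" | count_suggestions)"
#     done
#   done
```

If one node reports a non-zero count with distrib=false while every distributed request reports 0, that pair of URLs and outputs is the kind of evidence a failing unit test can be modelled on.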


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Thursday, March 21, 2013 12:45 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

We need this feature to be fixed ASAP. So please let me know which class is responsible for combining spellcheck results from all shards. I will try to debug the code.

Thanks in advance.
Alex.

-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

Not sure what this is?

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
 <!-- a spellchecker that uses an alternate comparator

         comparatorClass can be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests:


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

Without the distrib param:

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.
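
The four requests above differ only in the host and in whether distrib=false is appended, so they can be regenerated with a short script. This is a rough sketch, not something from the original posts: build_url and the mode names "local" and "distributed" are invented for illustration, while the host names, handler path and parameters are copied from the curl commands above (with an explicit http:// scheme added).

```shell
#!/bin/sh
# Hypothetical helper: print the request URL for one node.
#   $1 = host name
#   $2 = "local" to append distrib=false; any other value leaves the
#        request distributed
build_url() {
  url="http://$1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler"
  if [ "$2" = "local" ]; then
    url="$url&distrib=false"
  fi
  printf '%s\n' "$url"
}

# Print the four URLs used in the message above, one per line.
for host in server1 server2; do
  for mode in local distributed; do
    build_url "$host" "$mode"
  done
done
```

Each printed URL can be passed to curl in single quotes, as in the requests above, to rerun the comparison after a configuration change.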

-----Original Message-----

From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You are likely hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  We also
need an example of a query that succeeds non-distributed (with the exact query
URL and the output you get) vs. the same query URL and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com]
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker was also not working
in cloud mode. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>

to the /select requestHandler, the direct spellchecker worked, but the wordbreak
spellchecker still did not. I have added shards.qt=testhandler to the curl request,
but it did not solve the issue.

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just
coming up in /testhandler.  Just wanted to verify that before we get into bug
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
distributed environment :).  Ideally, we should probably use a random test for
stuff like this, as adding a bunch of test scenarios would make this
already-slower-than-molasses test even slower.  On the other hand, we want to
test as many possibilities as we can.  Based on DSCCT and it being so
superficial, I really can't vouch too much for my spell check enhancements
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com]
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com> wrote:

> Can you try including in your request the "shards.qt" parameter?  In your case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support for a brief discussion.
>
> James Dyer
> Ingram Content Group
> (615) 213-4311
>
>

Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
Hello,

We need this feature to be fixed ASAP. So please let me know which class is responsible for combining spellcheck results from all shards. I will try to debug the code.

Thanks in advance.
Alex.

 

 

 

-----Original Message-----
From: alxsss <al...@aim.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:34 am
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud


-- distributed environment.  But to nail it down, we probably need to see both
-- the applicable <requestHandler />

Not sure what this is?

I have

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal 
levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck 
suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 
-->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be 
considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler 
below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
 <!-- a spellchecker that use an alternate comparator

         comparatorClass be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


spell filed in our schema is called spell and its type also is called spell.
Here are requests


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>       
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

No distrib param

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>    
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.

---Original Message-----

From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You may likely be hitting on a bug with WordBreakSolrSpellChecker in a 
distributed environment.  But to nail it down, we probably need to see both the 
applicable <requestHandler /> section of your config and also this section: 
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  Also 
need an example of a query that succeeds non-distributed (with the exact query 
url and output you get) vs the same query url and output in the distributed 
scenario.  Then, without access to your actual index, it might be possible to 
come up with a failing unit test.  With a failing unit test in hand, we have a 
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com] 
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. Direct spellchecker also was not working in 

cloud. After I added 

  <arr name="last-components">
     <str>spellcheck</str>
   </arr> 
to /select requestHandler it worked but the wordbreak spellchecker. I have added 

shards.qt=testhanlder to curl request but it did not solve the issue.

Thanks.
Alex.

 

 

 

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just 
coming up in /testhandler.  Just wanted to verify that before we get into bug 
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario 
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a 


Distributed environment :) .  Ideally, we should probably use a random test for 
stuff like this as adding a bunch of test scenarios would make this 
already-slower-than-molasses test even slower.  On the other hand, we want to 
test as many possibilities as we can.  Based on DSCCT and it being so 
superficial, I really can't vouch too much for my spell check enhancements 
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his 


custom testhander and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com> 
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your 
case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support 


for a brief discussion.
> 
> James Dyer
> Ingram Content Group
> (615) 213-4311
> 
> 
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com] 
> Sent: Monday, March 18, 2013 4:07 PM
> To: solr-user@lucene.apache.org
> Subject: strange behaviour of wordbreak spellchecker in solr cloud
> 
> Hello,
> 
> I try to use wordbreak spellchecker in solr-4.2 with cloud feature. We have 
two server with one shard in each of them.
> 
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10'
> 
> does not return any results in spellchecker. However, if I specify 
distrib=false only one of these has spellchecker results.
> 
> curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> 
> no spellcheler results 
> 
> curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false'
> returns spellcheker results.
> 
> 
> My testhandler and select handlers are as follows
> 
> 
> <requestHandler name="/testhandler" class="solr.SearchHandler" >
> <lst name="defaults">
> <str name="defType">edismax</str>
> <str name="echoParams">explicit</str>
> <float name="tie">0.01</float>
> <str name="qf">host^30  content^0.5 title^1.2 </str>
> <str name="pf">site^25 content^10 title^22</str>
> <str name="fl">url,id,title</str>
> <!-- <str name="mm">2<-1 5<-3 6<90%</str> -->
> <str name="mm">3<-1 5<-3 6<90%</str>
> <int name="ps">1</int>
> 
> <str name="hl">true</str>
> <str name="hl.fl">content</str>
> <str name="f.content.hl.fragmenter">regex</str>
> <str name="hl.fragsize">165</str>
> <str name="hl.fragmentsBuilder">default</str>
> 
> 
> <str name="spellcheck.dictionary">direct</str>
> <str name="spellcheck.dictionary">wordbreak</str>
> <str name="spellcheck">on</str>
> <str name="spellcheck.collate">true</str>
> <str name="spellcheck.onlyMorePopular">false</str>
> <str name="spellcheck.count">2</str>
> 
> </lst>
> 
> <arr name="last-components">
> <str>spellcheck</str>
> </arr>
> 
> </requestHandler>
> 
> 
>  <requestHandler name="/select" class="solr.SearchHandler">
>    <!-- default values for query parameters can be specified, these
>         will be overridden by parameters in the request
>      -->
>     <lst name="defaults">
>       <str name="echoParams">explicit</str>
>       <int name="rows">10</int>
>       <!-- <str name="df">text</str> -->
>     </lst>
>    <!-- In addition to defaults, "appends" params can be specified
>         to identify values which should be appended to the list of
>         multi-val params from the query (or the existing "defaults").
>      -->
>    <!-- In this example, the param "fq=instock:true" would be appended to
>         any query time fq params the user may specify, as a mechanism for
>         partitioning the index, independent of any user selected filtering
>         that may also be desired (perhaps as a result of faceted searching).
> 
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "appends" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="appends">
>         <str name="fq">inStock:true</str>
>       </lst>
>      -->
>    <!-- "invariants" are a way of letting the Solr maintainer lock down
>         the options available to Solr clients.  Any params values
>         specified here are used regardless of what values may be specified
>         in either the query, the "defaults", or the "appends" params.
> 
>         In this example, the facet.field and facet.query params would
>         be fixed, limiting the facets clients can use.  Faceting is
>         not turned on by default - but if the client does specify
>         facet=true in the request, these are the only facets they
>         will be able to see counts for; regardless of what other
>         facet.field or facet.query params they may specify.
> 
>         NOTE: there is *absolutely* nothing a client can do to prevent these
>         "invariants" values from being used, so don't use this mechanism
>         unless you are sure you always want it.
>      -->
>    <!--
>       <lst name="invariants">
>         <str name="facet.field">cat</str>
>         <str name="facet.field">manu_exact</str>
>         <str name="facet.query">price:[* TO 500]</str>
>         <str name="facet.query">price:[500 TO *]</str>
>       </lst>
>      -->
>    <!-- If the default list of SearchComponents is not desired, that
>         list can either be overridden completely, or components can be
>         prepended or appended to the default list.  (see below)
>      -->
>    <!--
>       <arr name="components">
>         <str>nameOfCustomComponent1</str>
>         <str>nameOfCustomComponent2</str>
>       </arr>
>      -->
>       <arr name="last-components">
>         <str>spellcheck</str>
>       </arr> 
>    </requestHandler>
> 
> 
> 
> Is this a bug, or does something else have to be done?
> 
> 
> Thanks.
> Alex.
> 

Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
> distributed environment.  But to nail it down, we probably need to see both
> the applicable <requestHandler />

I am not sure what this refers to.

I have the following spellcheck component:

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

    <str name="queryAnalyzerFieldType">spell</str>

    <!-- Multiple "Spell Checkers" can be declared and used by this
         component
      -->

    <!-- a spellchecker built from a field of the main index -->
    <lst name="spellchecker">
      <str name="name">direct</str>
      <str name="field">spell</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
      <str name="distanceMeasure">internal</str>
      <!-- minimum accuracy needed to be considered a valid spellcheck suggestion -->
      <float name="accuracy">0.5</float>
      <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -->
      <int name="maxEdits">2</int>
      <!-- the minimum shared prefix when enumerating terms -->
      <int name="minPrefix">1</int>
      <!-- maximum number of inspections per result. -->
      <int name="maxInspections">5</int>
      <!-- minimum length of a query term to be considered for correction -->
      <int name="minQueryLength">4</int>
      <!-- maximum threshold of documents a query term can appear to be considered for correction -->
      <float name="maxQueryFrequency">0.01</float>
      <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
      -->
    </lst>

    <!-- a spellchecker that can break or combine words.  See "/spell" handler below for usage -->
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">spell</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">10</int>
    </lst>

    <!-- a spellchecker that uses a different distance measure -->
    <!--
       <lst name="spellchecker">
         <str name="name">jarowinkler</str>
         <str name="field">spell</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="distanceMeasure">
           org.apache.lucene.search.spell.JaroWinklerDistance
         </str>
       </lst>
     -->
    <!-- a spellchecker that uses an alternate comparator

         comparatorClass can be one of:
          1. score (default)
          2. freq (Frequency first, then score)
          3. A fully qualified class name
      -->
    <!--
       <lst name="spellchecker">
         <str name="name">freq</str>
         <str name="field">lowerfilt</str>
         <str name="classname">solr.DirectSolrSpellChecker</str>
         <str name="comparatorClass">freq</str>
      -->

    <!-- A spellchecker that reads the list of words from a file -->
    <!--
       <lst name="spellchecker">
         <str name="classname">solr.FileBasedSpellChecker</str>
         <str name="name">file</str>
         <str name="sourceLocation">spellings.txt</str>
         <str name="characterEncoding">UTF-8</str>
         <str name="spellcheckIndexDir">spellcheckerFile</str>
       </lst>
      -->
  </searchComponent>


The spell field in our schema is called spell, and its type is also called spell.
Here are the requests:


 curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">32</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>




 curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&distrib=false'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">26</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="paulusoles">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">11</int>
      <arr name="suggestion">
        <str>paul u soles</str>       
      </arr>
    </lst>
    <str name="collation">(paul u soles)</str>
  </lst>
</lst>
</response>

Without the distrib param:

curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>
    <str name="distrib">false</str>
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>
</response>


curl 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">24</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="shards.qt">testhandler</str>
    <str name="q">paulusoles</str>    
    <str name="rows">10</str>
  </lst>
</lst>
<lst name="grouped">
  <lst name="site">
    <int name="matches">0</int>
    <int name="ngroups">0</int>
    <arr name="groups"/>
  </lst>
</lst>
<lst name="highlighting"/>
<lst name="spellcheck">
  <lst name="suggestions"/>
</lst>

</response>

Thanks.
Alex.
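To compare only the spellcheck part of responses like the ones above, a small filter may help. This is a sketch, not something used in the thread: the function name spellcheck_section is made up for illustration, and it assumes the request was sent with indent=true so that the closing </lst> of the spellcheck section starts at the beginning of a line.

```shell
#!/bin/sh
# Print only the spellcheck section of an indented Solr XML response read from stdin.
# Assumption: indent=true was used, so nested </lst> tags are indented and the
# first </lst> at column 0 after the opening tag closes the spellcheck section.
spellcheck_section() {
  sed -n '/^<lst name="spellcheck">/,/^<\/lst>/p'
}

# Usage (not run here, it needs a live Solr node):
#   curl -s 'server2:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&distrib=false' | spellcheck_section
```

Piping each of the four responses through this filter leaves just the suggestions, which makes the difference between the nodes easier to see.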

-----Original Message-----

From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 11:10 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


You may be hitting a bug with WordBreakSolrSpellChecker in a
distributed environment.  But to nail it down, we probably need to see both the
applicable <requestHandler /> section of your config and also this section:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  We also
need an example of a query that succeeds non-distributed (with the exact query
URL and the output you get) vs. the same query URL and output in the distributed
scenario.  Then, without access to your actual index, it might be possible to
come up with a failing unit test.  With a failing unit test in hand, we have a
good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com] 
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker was also not working in
cloud. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>
to the /select requestHandler, it worked, but the wordbreak spellchecker still does not.
I have added shards.qt=testhandler to the curl request, but it did not solve the issue.

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just 
coming up in /testhandler.  Just wanted to verify that before we get into bug 
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario 
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a
Distributed environment :) .  Ideally, we should probably use a random test for
stuff like this as adding a bunch of test scenarios would make this 
already-slower-than-molasses test even slower.  On the other hand, we want to 
test as many possibilities as we can.  Based on DSCCT and it being so 
superficial, I really can't vouch too much for my spell check enhancements 
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com> 
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support for a brief discussion.
> 
> James Dyer
> Ingram Content Group
> (615) 213-4311

RE: strange behaviour of wordbreak spellchecker in solr cloud

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
You may be hitting a bug with WordBreakSolrSpellChecker in a distributed environment.  But to nail it down, we probably need to see both the applicable <requestHandler /> section of your config and also this section: <searchComponent name="spellcheck" class="solr.SpellCheckComponent" />.  We also need an example of a query that succeeds non-distributed (with the exact query URL and the output you get) vs. the same query URL and output in the distributed scenario.  Then, without access to your actual index, it might be possible to come up with a failing unit test.  With a failing unit test in hand, we have a good shot at getting a fix.

James Dyer
Ingram Content Group
(615) 213-4311
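
One way to collect the non-distributed and distributed output for the same query is sketched below. The host names, the core "test", the handler "testhandler" and the query come from this thread; the http scheme, the function names query_url and compare_modes, and the temporary file layout are assumptions made only for illustration.

```shell
#!/bin/sh
# Build the query URL used in this thread.
# $1 = host, $2 = extra query parameters (may be empty).
# Assumption: plain http on port 8983, as in the curl commands quoted in the thread.
query_url() {
  printf 'http://%s:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10%s\n' "$1" "$2"
}

# Fetch the same query from one host, first with distrib=false and then
# distributed, and diff the two responses.
# Not invoked automatically, because it needs a running Solr node.
compare_modes() {
  tmp=$(mktemp -d) || return 1
  curl -s "$(query_url "$1" '&distrib=false')" > "$tmp/local.xml"
  curl -s "$(query_url "$1" '')" > "$tmp/distrib.xml"
  diff "$tmp/local.xml" "$tmp/distrib.xml"
  rm -rf "$tmp"
}

# Example (hypothetical invocation): compare_modes server2
```

The diff will also show harmless differences such as QTime and the echoed params; the part of interest is the spellcheck section.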


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com] 
Sent: Tuesday, March 19, 2013 12:39 PM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

Hello,

I was testing my custom testhandler. The direct spellchecker was also not working in cloud. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>
to the /select requestHandler, it worked, but the wordbreak spellchecker still does not. I have added shards.qt=testhandler to the curl request, but it did not solve the issue.

Thanks.
Alex.

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just 
coming up in /testhandler.  Just wanted to verify that before we get into bug 
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario 
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a 
Distributed environment :) .  Ideally, we should probably use a random test for 
stuff like this as adding a bunch of test scenarios would make this 
already-slower-than-molasses test even slower.  On the other hand, we want to 
test as many possibilities as we can.  Based on DSCCT and it being so 
superficial, I really can't vouch too much for my spell check enhancements 
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his 
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com> 
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support for a brief discussion.
> 
> James Dyer
> Ingram Content Group
> (615) 213-4311

Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by al...@aim.com.
Hello,

I was testing my custom testhandler. The direct spellchecker was also not working in cloud. After I added

  <arr name="last-components">
     <str>spellcheck</str>
   </arr>
to the /select requestHandler, it worked, but the wordbreak spellchecker still does not. I have added shards.qt=testhandler to the curl request, but it did not solve the issue.

Thanks.
Alex.
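
A variant that may be worth trying is to pass the handler path with its leading slash. As far as I recall, the distributed spellcheck example in the Solr documentation writes the parameter as shards.qt=/spell, so the same form is assumed here for a handler registered as "/testhandler". Nothing in this thread confirms that it changes the wordbreak behaviour. The helper name request_url and the http scheme are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: build the request URL for a host and a shards.qt value.
# $1 = host, $2 = value for shards.qt
request_url() {
  printf 'http://%s:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=%s\n' "$1" "$2"
}

# Print both variants as curl commands so they can be pasted and compared.
# Nothing is sent from this script itself.
for qt in testhandler /testhandler; do
  echo "curl '$(request_url server1 "$qt")'"
done
```

Running the two printed commands against the same node and comparing the spellcheck sections would show whether the slash makes a difference in this setup.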

 

 

 

-----Original Message-----
From: Dyer, James <Ja...@ingramcontent.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, Mar 19, 2013 10:30 am
Subject: RE: strange behaviour of wordbreak spellchecker in solr cloud


Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just 
coming up in /testhandler.  Just wanted to verify that before we get into bug 
reports.

DistributedSpellCheckComponentTest does have 1 little Word Break test scenario 
in it, so we know WordBreakSolrSpellChecker at least works some of the time in a 
Distributed environment :) .  Ideally, we should probably use a random test for 
stuff like this as adding a bunch of test scenarios would make this 
already-slower-than-molasses test even slower.  On the other hand, we want to 
test as many possibilities as we can.  Based on DSCCT and it being so 
superficial, I really can't vouch too much for my spell check enhancements 
working as well with shards as they do with a single index.

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spell component in both his 
custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark

On Mar 19, 2013, at 12:18 PM, "Dyer, James" <Ja...@ingramcontent.com> 
wrote:

> Can you try including in your request the "shards.qt" parameter?  In your case, I think you should set it to "testhandler".  See http://wiki.apache.org/solr/SpellCheckComponent?highlight=%28shards\.qt%29#Distributed_Search_Support for a brief discussion.
> 
> James Dyer
> Ingram Content Group
> (615) 213-4311

RE: strange behaviour of wordbreak spellchecker in solr cloud

Posted by "Dyer, James" <Ja...@ingramcontent.com>.
Mark,

I wasn't sure if Alex is actually testing /select, or if the problem is just coming up in /testhandler.  Just wanted to verify that before we get into bug reports.

DistributedSpellCheckComponentTest (DSCCT) does have one small word-break test scenario in it, so we know WordBreakSolrSpellChecker at least works some of the time in a distributed environment :).  Ideally, we should probably use a randomized test for this kind of thing, since adding a bunch of fixed test scenarios would make this already-slower-than-molasses test even slower.  On the other hand, we want to test as many possibilities as we can.  Given how superficial DSCCT is, I really can't vouch for my spell check enhancements working as well with shards as they do with a single index.
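
To check whether the problem is specific to how the per-shard sub-requests are routed, the original curl commands can be repeated with "shards.qt" added. The following is only a sketch: the hosts, core name and query come from Alex's message, the helper name solr_url is made up for illustration, and the leading slash in the shards.qt value is an assumption based on the handler being registered as "/testhandler" (the suggestion in this thread wrote it as "testhandler").

```shell
#!/bin/sh
# Hypothetical helper: print a test URL for a given host ($1) and handler ($2),
# adding shards.qt so the per-shard sub-requests go to the same handler.
# Host names, core name ("test") and query are copied from the original report.
solr_url() {
  printf 'http://%s:8983/solr/test/%s?q=paulusoles&indent=true&rows=10&shards.qt=/%s\n' "$1" "$2" "$2"
}

# Print the URL for each server; uncomment the curl line to send the request.
for h in server1 server2; do
  url=$(solr_url "$h" testhandler)
  echo "$url"
  # curl "$url"
done
```

If the distributed request returns spellcheck results once shards.qt is set, the handler routing was the issue; if it still returns none, that would point more toward a bug in the distributed spellcheck code.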

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Tuesday, March 19, 2013 11:49 AM
To: solr-user@lucene.apache.org
Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

My first thought too, but then I saw that he had the spellcheck component in both his custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark


Re: strange behaviour of wordbreak spellchecker in solr cloud

Posted by Mark Miller <ma...@gmail.com>.
My first thought too, but then I saw that he had the spellcheck component in both his custom testhandler and the /select handler, so I'd expect that to work as well.

- Mark
