Posted to solr-user@lucene.apache.org by Varun Gupta <va...@gmail.com> on 2009/10/13 05:06:30 UTC

SpellCheck Index not building

Hi,

I am using Solr 1.3 for spell checking and am facing a strange problem: the
spell-check index is not being generated. When fewer than 1000 documents are
indexed, the spell-check index builds, but with more documents (around 40K)
it does not build. I can see that the directory for the spell-check index is
created, and there are two files under it: segments_3 and segments.gen

I am using the following query to build the spell checking index:
/select
params={spellcheck=true&start=0&qt=contentsearch&wt=xml&rows=0&spellcheck.build=true&version=2.2}
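
For anyone reproducing this, here is a minimal Python sketch of how the build request above could be assembled as a URL. The host, port, and the helper name build_spellcheck_url are assumptions for illustration, not part of the original setup. The optional spellcheck.dictionary parameter is shown because, as far as I understand the SpellCheckComponent, spellcheck.build acts on the dictionary selected for the request; please treat that as something to verify against your Solr version.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, handler="contentsearch", dictionary=None):
    """Return a /select URL that asks Solr to rebuild a spell-check index.

    base_url is hypothetical (e.g. "http://localhost:8983/solr").
    dictionary, if given, is sent as spellcheck.dictionary; otherwise the
    handler's configured default is used.
    """
    params = [
        ("qt", handler),               # request handler from solrconfig.xml
        ("spellcheck", "true"),
        ("spellcheck.build", "true"),  # trigger the index build
        ("rows", "0"),                 # no search results needed
        ("wt", "xml"),
    ]
    if dictionary is not None:
        params.append(("spellcheck.dictionary", dictionary))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local Solr instance; adjust to your deployment.
    print(build_spellcheck_url("http://localhost:8983/solr", dictionary="a_spell"))
```

Calling it once per configured dictionary name (a_spell, jarowinkler) makes it explicit which spell checker each build request targets.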

In the logs I see:
INFO: [] webapp=/solr path=/select
params={spellcheck=true&start=0&qt=contentsearch&wt=xml&rows=0&spellcheck.build=true&version=2.2}
hits=37467 status=0 QTime=44

Please help me solve this problem.

Here is my configuration:
*schema.xml:*
    <fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100" stored="false" multiValued="true">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
   <field name="a_spell" type="textSpell" />
   <copyField source="title" dest="a_spell" />
   <copyField source="content" dest="a_spell" />

*solrconfig.xml:*
  <requestHandler name="contentsearch" class="solr.DisMaxRequestHandler" >
    <lst name="defaults">
     <str name="defType">dismax</str>

      <str name="spellcheck.onlyMorePopular">false</str>
      <str name="spellcheck.extendedResults">false</str>
      <str name="spellcheck.count">5</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.dictionary">jarowinkler</str>
    </lst>
    <arr name="last-components">
        <str>spellcheck</str>
    </arr>
  </requestHandler>

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">textSpell</str>
    <lst name="spellchecker">
      <str name="name">a_spell</str>
      <str name="field">a_spell</str>
      <str name="spellcheckIndexDir">./spellchecker_a_spell</str>
      <str name="accuracy">0.7</str>
    </lst>
    <lst name="spellchecker">
      <str name="name">jarowinkler</str>
      <str name="field">a_spell</str>
      <!-- Use a different Distance Measure -->
      <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
      <str name="spellcheckIndexDir">./spellchecker_a_spell</str>
      <str name="accuracy">0.7</str>
    </lst>
  </searchComponent>

--
Thanks
Varun Gupta

Re: SpellCheck Index not building

Posted by Varun Gupta <va...@gmail.com>.
No, there are no exceptions in the logs.

--
Thanks
Varun Gupta

On Tue, Oct 13, 2009 at 8:46 AM, Shalin Shekhar Mangar <shalinmangar@gmail.com> wrote:

> On Tue, Oct 13, 2009 at 8:36 AM, Varun Gupta <va...@gmail.com> wrote:
>
> > Hi,
> >
> > I am using Solr 1.3 for spell checking and am facing a strange problem: the
> > spell-check index is not being generated. When fewer than 1000 documents are
> > indexed, the spell-check index builds, but with more documents (around 40K)
> > it does not build. I can see that the directory for the spell-check index is
> > created, and there are two files under it: segments_3 and segments.gen
> >
> >
> It seems that you might be running out of memory with the larger index. Can
> you check the logs to see whether any exceptions were recorded?
>
> --
> Regards,
> Shalin Shekhar Mangar.
>

Re: SpellCheck Index not building

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Tue, Oct 13, 2009 at 8:36 AM, Varun Gupta <va...@gmail.com> wrote:

> Hi,
>
> I am using Solr 1.3 for spell checking and am facing a strange problem: the
> spell-check index is not being generated. When fewer than 1000 documents are
> indexed, the spell-check index builds, but with more documents (around 40K)
> it does not build. I can see that the directory for the spell-check index is
> created, and there are two files under it: segments_3 and segments.gen
>
>
It seems that you might be running out of memory with the larger index. Can
you check the logs to see whether any exceptions were recorded?
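
If the log is large, a quick scan along these lines may help. This is only a rough Python sketch: the pattern (OutOfMemoryError, Exception, SEVERE) is a guess at what a failed build might leave behind, and the log path in the usage part is hypothetical, since it depends on your servlet container.

```python
import re

# Assumed markers of trouble in a Solr/servlet-container log.
ERROR_PATTERN = re.compile(r"OutOfMemoryError|Exception|SEVERE")


def find_error_lines(lines):
    """Return (line_number, text) pairs for lines that look like errors."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ERROR_PATTERN.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical path; point this at your container's log file.
    with open("logs/catalina.out", encoding="utf-8", errors="replace") as log:
        for number, text in find_error_lines(log):
            print(f"{number}: {text}")
```

An empty result only means that none of these markers appear in that file; the error could still be in a different log.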

-- 
Regards,
Shalin Shekhar Mangar.