Posted to solr-user@lucene.apache.org by adm1n <ev...@gmail.com> on 2013/05/22 14:43:17 UTC

too many boolean clauses

I got:
SyntaxError: Cannot parse
'name:Bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbm'

Using Solr 4.2.1.
The field type definition for the name field:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
            splitOnNumerics="1" preserveOriginal="1"
            types="characters.txt"/>
    <filter class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="15"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"
            splitOnNumerics="1" preserveOriginal="1"
            types="characters.txt"/>
    <filter class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="15"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


Any ideas how to fix it?

thanks



--
View this message in context: http://lucene.472066.n3.nabble.com/too-many-boolean-clauses-tp4065288.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: too many boolean clauses

Posted by Shawn Heisey <so...@elyograg.org>.

> Now regarding maxBooleanClauses: how does increasing it affect performance
> (response times, memory usage)?

Changing maxBooleanClauses by itself makes no difference to performance.
Having thousands of clauses is what makes a query run slower and take more
memory. The setting just causes queries over the limit to fail without
running. If you need a query with more than 1024 clauses and there's no
other way to do the job, then you have to increase it.

Thanks,
Shawn

Re: too many boolean clauses

Posted by adm1n <ev...@gmail.com>.

First of all, thanks for the response!

Regarding the two tokenizers: it's OK.
Switching to NGramFilterFactory didn't help (I didn't reindex, but I don't
think that was needed, since I only changed it in the 'query' section).

Now regarding maxBooleanClauses: how does increasing it affect performance
(response times, memory usage)?


thanks




Re: too many boolean clauses

Posted by Shawn Heisey <so...@elyograg.org>.

On 5/22/2013 6:43 AM, adm1n wrote:
> SyntaxError: Cannot parse
> 'name:Bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbm'

The subject mentions one error, the message shows another. If you are
getting "too many boolean clauses", then you need to increase the
maxBooleanClauses setting in your solrconfig.xml file. The default is 1024:

    <maxBooleanClauses>1024</maxBooleanClauses>

Looking at your analyzer chain, I see two potential problems.

One is that you have two tokenizer factories, though one is specified as
a filter.  I don't know if you can use a tokenizer as a filter - you
might need NGramFilterFactory instead.

If using a tokenizer as a filter actually works, then we run into the
other possible problem: I can imagine that with the input you have
specified, the NGram expansion in your config might balloon that to more
than 1024 tokens, which would exceed the default maxBooleanClauses.
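
To make the arithmetic behind that guess concrete, here is a rough sketch
in Python. It is not Solr code. It only counts how many character n-grams
a single token of a given length produces for a minGramSize/maxGramSize
range, and it assumes (my assumption, not something confirmed in this
thread) that each gram ends up as one boolean clause. It ignores the other
filters in the chain (word delimiter, stop words, stemming, duplicate
removal), so real numbers will differ. The function name and the token
lengths are made up for illustration.

```python
DEFAULT_MAX_BOOLEAN_CLAUSES = 1024  # default mentioned above


def ngram_count(token_length, min_gram, max_gram):
    """Count the character n-grams of sizes min_gram..max_gram in one token.

    A token of length L contains L - n + 1 grams of size n, so the total
    is the sum of that term over every gram size that fits in the token.
    """
    total = 0
    for n in range(min_gram, min(max_gram, token_length) + 1):
        total += token_length - n + 1
    return total


if __name__ == "__main__":
    # Hypothetical token lengths, with minGramSize=2 and maxGramSize=15
    # as in the field type from the original post.
    for length in (20, 80, 85):
        count = ngram_count(length, 2, 15)
        over = count > DEFAULT_MAX_BOOLEAN_CLAUSES
        print(f"length={length} grams={count} over_default_limit={over}")
```

Under those assumptions, a token of about 80 characters lands right around
a thousand grams, so a single long word like the one in the error message
can plausibly reach the default limit on its own. Lowering maxGramSize on
the query side shrinks the count quickly.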

Thanks,
Shawn