Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2012/03/09 01:18:01 UTC

Re: maxClauseCount Exception

:   I am suddenly getting a maxClauseCount exception for no reason. I am
: using Solr 3.5. I have only 206 documents in my index.

Unless things have changed, the reason you are seeing this is that 
_highlighting_ a query (clause) like "type_s:[*+TO+*]" requires rewriting 
it into a giant boolean query of all the terms in that field -- so even if 
you only have 206 docs, if that field has more than 1024 distinct terms in 
your index, you're going to go over the maxClauseCount limit of 1024.

(you don't get this problem in a basic query because it doesn't need to 
enumerate all the terms; it rewrites it to a ConstantScoreQuery)

what you most likely want to do is move some of those clauses (like 
"type_s:[*+TO+*]" and "usergroup_sm:admin") out of your main "q" query and 
into "fq" filters ... so they can be cached independently, won't 
contribute to scoring (just matching) and won't be used in highlighting.
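
As a rough illustration of that suggestion, here is a minimal Python sketch that builds the request parameters with the two filter clauses moved out of "q" and into separate "fq" parameters. The field names and highlighting options are taken from the logged request below; the function name build_params is hypothetical, and the sketch only builds the query string without contacting a server.

```python
from urllib.parse import urlencode


def build_params():
    """Return request parameters with pure filters in fq instead of q.

    Only the clause that should drive scoring and highlighting stays in q.
    """
    return [
        # Main query: the only part the highlighter has to rewrite.
        ("q", "{!lucene q.op=OR df=text_t}(kind_s:doc OR kind_s:xml)"),
        # Filters: cached independently, match-only, ignored by highlighting.
        ("fq", "type_s:[* TO *]"),
        ("fq", "usergroup_sm:admin"),
        ("hl", "true"),
        ("hl.fl", "text_t"),
        ("hl.snippets", "4"),
        ("fl", "*,score"),
        ("rows", "20"),
        ("start", "0"),
    ]


params = build_params()
# urlencode accepts a sequence of pairs, so the repeated fq key is preserved.
query_string = urlencode(params)
print(query_string)
```

Each "fq" is sent as its own parameter so that Solr can cache the two filters separately.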

: params={hl=true&hl.snippets=4&hl.simple.pre=<b></b>&fl=*,score&hl.mergeContiguous=true&hl.usePhraseHighlighter=true&hl.requireFieldMatch=true&echoParams=all&hl.fl=text_t&q={!lucene+q.op%3DOR+df%3Dtext_t}+(+kind_s:doc+OR+kind_s:xml)+AND+(type_s:[*+TO+*])+AND+(usergroup_sm:admin)&rows=20&start=0&wt=javabin&version=2} hits=204 status=500 QTime=166 |#]

: [#|2012-02-22T13:40:13.131-0500|SEVERE|glassfish3.1.1|
: org.apache.solr.servlet.SolrDispatchFilter|
: _ThreadID=22;_ThreadName=Thread-2;|org.apache.lucene.search.BooleanQuery
: $TooManyClauses: maxClauseCount is set to 1024
: 	at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:136)
	...
: 	at
: org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:304)
: 	at
: org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:158)

-Hoss

Re: maxClauseCount Exception

Posted by Erick Erickson <er...@gmail.com>.
bq: So all I want to do is a simple "all docs with something in this field,
and to highlight the field"

But that doesn't really make sense to do at the Solr/Lucene level. All
you're saying is that you want that field highlighted. Wouldn't it be much
easier to just do this at the app level whenever your field had anything
returned in it?
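
A minimal sketch of what app-level highlighting could look like, assuming each result document arrives in the application as a plain Python dict. The helper name highlight_field and the default <b>...</b> markers are assumptions for illustration, not part of Solr.

```python
from html import escape


def highlight_field(doc, field, pre="<b>", post="</b>"):
    """Wrap every value of `field` in highlight markers, if the field is set.

    Returns None when the document has nothing in that field.
    """
    value = doc.get(field)
    if not value:
        return None
    # Treat single-valued and multi-valued fields the same way.
    values = value if isinstance(value, list) else [value]
    # Escape the stored text so it cannot inject markup of its own.
    return [f"{pre}{escape(str(v))}{post}" for v in values]


doc = {"id": "1", "type_s": "report"}
print(highlight_field(doc, "type_s"))
```

Because the whole field is wrapped whenever it is present, no term enumeration is needed on the Solr side.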

Best
Erick

On Sat, Mar 17, 2012 at 8:07 PM, Darren Govoni <da...@ontrenet.com> wrote:
> Thanks for the tip Hoss.
>
> I notice that it appears only sometimes; it varied because my index runs
> would sometimes have different numbers of docs, etc.
>
> So all I want to do is a simple "all docs with something in this field,
> and to highlight the field".
>
> Is the query expansion to "all possible terms in the index" really
> necessary? I could have hundreds of thousands of possible terms. Why should
> they all become explicit query elements? Seems overkill and
> underperformant.
>
> Is there another way with Lucene, or not really?
>

Re: maxClauseCount Exception

Posted by Darren Govoni <da...@ontrenet.com>.
Thanks for the tip Hoss.

I notice that it appears only sometimes; it varied because my index runs
would sometimes have different numbers of docs, etc.

So all I want to do is a simple "all docs with something in this field,
and to highlight the field". 

Is the query expansion to "all possible terms in the index" really
necessary? I could have hundreds of thousands of possible terms. Why should
they all become explicit query elements? Seems overkill and
underperformant.

Is there another way with Lucene, or not really?
