Posted to java-user@lucene.apache.org by Kevin Dutcher <kd...@gmail.com> on 2006/02/09 00:52:22 UTC

Too many required clauses for a BooleanQuery

Hey Everyone,

I'm running into the "More than 32 required/prohibited clauses in query"
exception when running a query. I thought I understood the problem but the
following two scenarios confuse me.

1st - No Error
33 required clauses plus additional clauses that are left off b/c they
are the same as the second scenario
=============================================
(categorization:10102617 AND categorization:10102621 AND
categorization:10102625 AND categorization:10102629 AND
categorization:10102633 AND categorization:10102637 AND
categorization:10102641 AND categorization:10102645 AND
categorization:10102649 AND categorization:10102653 AND
categorization:10102657 AND categorization:10102661 AND
categorization:10102665 AND categorization:10102669 AND
categorization:10102673 AND categorization:10102677 AND
categorization:10102681 AND categorization:10102685 AND
categorization:10102689 AND categorization:10102693 AND
categorization:10102697 AND categorization:10102701 AND
categorization:10102705 AND categorization:10102709 AND
categorization:10102713 AND categorization:10102717 AND
categorization:10102721 AND categorization:10102725 AND
categorization:10102729 AND categorization:10102733 AND
categorization:10102737 AND categorization:10102741 AND
categorization:10102745) AND ...

2nd - Error
The 33 required clauses above with the addition of a required
clause that is 3 OR'd clauses
============================================
((categorization:10102405 OR categorization:10102409 OR
categorization:10102413) AND categorization:10102617 AND
categorization:10102621 AND categorization:10102625 AND
categorization:10102629 AND categorization:10102633 AND
categorization:10102637 AND categorization:10102641 AND
categorization:10102645 AND categorization:10102649 AND
categorization:10102653 AND categorization:10102657 AND
categorization:10102661 AND categorization:10102665 AND
categorization:10102669 AND categorization:10102673 AND
categorization:10102677 AND categorization:10102681 AND
categorization:10102685 AND categorization:10102689 AND
categorization:10102693 AND categorization:10102697 AND
categorization:10102701 AND categorization:10102705 AND
categorization:10102709 AND categorization:10102713 AND
categorization:10102717 AND categorization:10102721 AND
categorization:10102725 AND categorization:10102729 AND
categorization:10102733 AND categorization:10102737 AND
categorization:10102741 AND categorization:10102745) AND ...

I can add additional required clauses to the 1st scenario without any
problems. So why am I seeing the error in the second scenario and not the
first? After discovering the error, I expected to see it in the first
scenario also. Is there any way around this error?

As a side note, it is very unlikely that this will be encountered in the
real world, but b/c we are dealing with content categorization it is still
possible.

Thanks in advance,

Kevin

Re: Too many required clauses for a BooleanQuery

Posted by Paul Elschot <pa...@xs4all.nl>.
On Thursday 09 February 2006 15:25, Kevin Dutcher wrote:
> > I don't know a lot about the error you're encountering (or not encountering
> > as the case may be) but please for the love of all that is sane use a
> > Filter instead of putting all those categories in your Query.
> >
> > Your search performance and your scores will thank you.
> 
> 
> I need all the documents returned from the search and am manipulating the
> results with a custom HitCollector, therefore I can't use filters.

The development version does not have the "More than 32
required/prohibited clauses in query" exception. You might give it a try.

Regards,
Paul Elschot

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Too many required clauses for a BooleanQuery

Posted by Kevin Dutcher <kd...@gmail.com>.
Thanks Hoss... You're absolutely right!

Kevin


On 2/9/06, Chris Hostetter <ho...@fucit.org> wrote:
>
>
> : I need all the documents returned from the search and am manipulating the
> : results with a custom HitCollector, therefore I can't use filters.
>
> I don't understand this comment.  There are certainly methods in the
> Searchable interface that allow you to use both a Filter and a HitCollector
> together -- as for "need all the documents returned from the search" ...
> I'm not suggesting you filter out any docs your query doesn't already
> restrict out because of the required clauses.  I'm just saying that
> instead of a few dozen required clauses, you use a Filter like the one
> previously posted in this thread.  If you need to combine those "required"
> filters with other optional conditions, you can do that using a
> ChainedFilter (or writing your own custom Filter that unions the BitSets
> yourself).
>
>
> -Hoss

Re: Too many required clauses for a BooleanQuery

Posted by Chris Hostetter <ho...@fucit.org>.
: I need all the documents returned from the search and am manipulating the
: results with a custom HitCollector, therefore I can't use filters.

I don't understand this comment.  There are certainly methods in the
Searchable interface that allow you to use both a Filter and a HitCollector
together -- as for "need all the documents returned from the search" ...
I'm not suggesting you filter out any docs your query doesn't already
restrict out because of the required clauses.  I'm just saying that
instead of a few dozen required clauses, you use a Filter like the one
previously posted in this thread.  If you need to combine those "required"
filters with other optional conditions, you can do that using a
ChainedFilter (or writing your own custom Filter that unions the BitSets
yourself).
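
For illustration, here is a minimal sketch of the kind of BitSet combining
described above. It uses only java.util.BitSet so it runs without Lucene.
The class CategoryBitSets and its method names are hypothetical (they are not
part of Lucene), and in a real Filter each input set would presumably come
from a bits(IndexReader) call like the one in the TermsFilter posted in this
thread.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper, not part of Lucene: combines per-category doc-id BitSets. */
final class CategoryBitSets {

    /**
     * Docs present in every input set (categories that are all required).
     * With an empty list this returns all docs in [0, maxDoc).
     */
    static BitSet intersectAll(List<BitSet> sets, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc);      // start from "every document matches"
        for (BitSet s : sets) {
            result.and(s);          // keep only docs also present in s
        }
        return result;
    }

    /** Docs present in at least one input set (categories that are OR'd). */
    static BitSet unionAll(List<BitSet> sets, int maxDoc) {
        BitSet result = new BitSet(maxDoc);
        for (BitSet s : sets) {
            result.or(s);           // add every doc present in s
        }
        return result;
    }
}
```

How the intersected and unioned results are then combined (and(), or(),
andNot()) depends on what the original query was meant to express. Both
methods build a fresh BitSet and leave their inputs unmodified, so the
per-category sets could be cached and reused.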


-Hoss




Re: Too many required clauses for a BooleanQuery

Posted by Kevin Dutcher <kd...@gmail.com>.
> I don't know a lot about the error you're encountering (or not encountering
> as the case may be) but please for the love of all that is sane use a
> Filter instead of putting all those categories in your Query.
>
> Your search performance and your scores will thank you.


I need all the documents returned from the search and am manipulating the
results with a custom HitCollector, therefore I can't use filters.

Kevin

Re: Too many required clauses for a BooleanQuery

Posted by mark harwood <ma...@yahoo.co.uk>.
> for the love of all that is sane use a
> Filter instead of putting all those categories in your Query.

Try this one:




package org.apache.lucene.search;

import java.io.IOException;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Iterator;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;

/**
 * Constructs a filter for docs matching any of the terms added to this class
 * @author maharwood
 */
public class TermsFilter extends Filter
{
	ArrayList termsList=new ArrayList();
	
	public void addTerm(Term term)
	{
		termsList.add(term);
	}

	/* (non-Javadoc)
	 * @see org.apache.lucene.search.Filter#bits(org.apache.lucene.index.IndexReader)
	 */
	public BitSet bits(IndexReader reader) throws IOException
	{
		BitSet result=new BitSet(reader.maxDoc());
		for (Iterator iter = termsList.iterator(); iter.hasNext();)
		{
			Term term = (Term) iter.next();
			TermDocs td=reader.termDocs(term);
			try
			{
				// mark every document that contains this term
				while (td.next())
				{
					result.set(td.doc());
				}
			}
			finally
			{
				// release the enumerator even if next() throws
				td.close();
			}
		}
		return result;
	}
}





		


Re: Too many required clauses for a BooleanQuery

Posted by Chris Hostetter <ho...@fucit.org>.
I don't know a lot about the error you're encountering (or not encountering
as the case may be) but please for the love of all that is sane use a
Filter instead of putting all those categories in your Query.

Your search performance and your scores will thank you.

: Date: Wed, 8 Feb 2006 18:52:22 -0500
: From: Kevin Dutcher <kd...@gmail.com>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Too many required clauses for a BooleanQuery
:
: I can add additional required clauses to the 1st scenario without any
: problems. So why am I seeing the error in the second scenario and not the
: first? After discovering the error, I expected to see it in the first
: scenario also. Is there any way around this error?
:
: As a side note, it is very unlikely that this will be encountered in the
: real world, but b/c we are dealing with content categorization it is still
: possible.
:
: Thanks in advance,
:
: Kevin
:



-Hoss




Re: Too many required clauses for a BooleanQuery

Posted by Kevin Dutcher <kd...@gmail.com>.
>
> One more thing: in case these queries are generated, you might
> consider building the corresponding (nested) BooleanQuery yourself
> instead of using the QueryParser.
>
> Regards,
> Paul Elschot



I'll give that a try.  Thanks Paul.

Re: Too many required clauses for a BooleanQuery

Posted by Paul Elschot <pa...@xs4all.nl>.
On Thursday 09 February 2006 00:52, Kevin Dutcher wrote:
> I can add additional required clauses to the 1st scenario without any
> problems. So why am I seeing the error in the second scenario and not the
> first? After discovering the error, I expected to see it in the first
> scenario also. Is there any way around this error?
> 
> As a side note, it is very unlikely that this will be encountered in the
> real world, but b/c we are dealing with content categorization it is still
> possible.

One more thing: in case these queries are generated, you might
consider building the corresponding (nested) BooleanQuery yourself
instead of using the QueryParser.

Regards,
Paul Elschot
