Posted to java-user@lucene.apache.org by Michael Dodson <mg...@mac.com> on 2006/04/06 14:47:40 UTC

nested phrase queries

Can phrase queries be nested the same way boolean queries can be nested?

I want a user query to be translated into a boolean query (say, x AND  
(y OR z)), and I want  those terms to be within a certain distance of  
each other (approximately within the same sentence, so the slop would  
be about 7).   I then want to find documents where that phrase is  
within a certain distance of another term (in this case an image  
name).  So in pseudo code I would have something like

PhraseQuery query1
Set slop to 7

PhraseQuery query2
Set slop to 50

Add boolean terms to query1
Add query1 and imageName to query2

Search

I think I could break down my initial boolean query and turn query1  
into two phrase queries which are OR'ed together (so have query1a  
with "x AND y" with slop 7, and query1b with "x AND z" and slop 7,  
both added to a BooleanQuery) but I still need to combine that query  
with the image name in query2.

Thanks for the help.

Mike D

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: nested phrase queries

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Seeing this, I worry that we'll see users creating XML strings and
then parsing them to get the desired query.  I've seen this a lot with
QueryParser, but it would be even more gross to see folks do this
with the XML syntax.  So, here's my community service message for the
day: if you're creating a query in code, don't use a parser; use
the Query subclasses' API directly :)

The XMLQueryParser is a great thing for machine-machine  
communication, though!

	Erik

On Apr 6, 2006, at 10:08 AM, mark harwood wrote:

> The XMLQueryParser in the contrib section also handles
> Spans (as well as a few other Lucene queries/filters
> not represented by the standard QueryParser).
>
> Here's an example of a complex query from the JUnit
> test ....
>
> <?xml version="1.0" encoding="UTF-8"?>
> <SpanOr fieldName="contents">
>  <SpanNear slop="8" inOrder="false">
>   <SpanOr>
>    <SpanTerm>killed</SpanTerm>
>    <SpanTerm>died</SpanTerm>
>    <SpanTerm>dead</SpanTerm>
>   </SpanOr>
>   <SpanOr>
>    <!-- a less verbose way of declaring ORed SpanTerms -->
>    <SpanOrTerms>miner miners</SpanOrTerms>
>    <SpanNear slop="6" inOrder="false">
>     <SpanTerm>mine</SpanTerm>
>     <SpanOrTerms>worker workers</SpanOrTerms>
>    </SpanNear>
>   </SpanOr>
>  </SpanNear>
>  <SpanFirst end="10">
>   <SpanOrTerms>fire burn</SpanOrTerms>
>  </SpanFirst>
> </SpanOr>
>


Re: nested phrase queries

Posted by mark harwood <ma...@yahoo.co.uk>.
The XMLQueryParser in the contrib section also handles
Spans (as well as a few other Lucene queries/filters
not represented by the standard QueryParser).

Here's an example of a complex query from the JUnit
test ....

<?xml version="1.0" encoding="UTF-8"?>
<SpanOr fieldName="contents">
 <SpanNear slop="8" inOrder="false">
  <SpanOr>
   <SpanTerm>killed</SpanTerm>
   <SpanTerm>died</SpanTerm>
   <SpanTerm>dead</SpanTerm>
  </SpanOr>
  <SpanOr>
   <!-- a less verbose way of declaring ORed SpanTerms -->
   <SpanOrTerms>miner miners</SpanOrTerms>
   <SpanNear slop="6" inOrder="false">
    <SpanTerm>mine</SpanTerm>
    <SpanOrTerms>worker workers</SpanOrTerms>
   </SpanNear>
  </SpanOr>
 </SpanNear>
 <SpanFirst end="10">
  <SpanOrTerms>fire burn</SpanOrTerms>
 </SpanFirst>
</SpanOr>
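
To make the nesting in an XML query like the one above easier to read, here is a small sketch that walks the element tree with the JDK's own DOM parser and renders it as a one-line expression. It is not the contrib XMLQueryParser and builds no Lucene queries; the class name SpanXmlOutline, its methods and the output notation ("~" for slop, "<" for end) are made up for illustration, and attributes such as fieldName and inOrder are simply ignored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: renders span-query XML as a compact outline.
// It only reads the XML structure; it does not create Lucene queries.
class SpanXmlOutline {

    // Render one element as Name(children...) or Name(text) for a leaf.
    static String render(Element e) {
        StringBuilder label = new StringBuilder(e.getTagName());
        if (e.hasAttribute("slop")) label.append("~").append(e.getAttribute("slop"));
        if (e.hasAttribute("end")) label.append("<").append(e.getAttribute("end"));

        // Collect child elements only; comments and whitespace are skipped.
        List<String> children = new ArrayList<>();
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            if (kid.getNodeType() == Node.ELEMENT_NODE) {
                children.add(render((Element) kid));
            }
        }
        String body = children.isEmpty()
                ? e.getTextContent().trim()
                : String.join(", ", children);
        return label + "(" + body + ")";
    }

    // Parse an XML string and render its root element.
    static String outline(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes))
                    .getDocumentElement();
            return render(root);
        } catch (Exception ex) {
            throw new IllegalArgumentException("not well-formed XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<SpanNear slop=\"6\" inOrder=\"false\">"
                + "<SpanTerm>mine</SpanTerm>"
                + "<SpanOrTerms>worker workers</SpanOrTerms>"
                + "</SpanNear>";
        System.out.println(outline(xml));
    }
}
```

Run on the inner SpanNear fragment, this prints the clause and its two children on one line, which shows that each SpanNear or SpanOr element is just another clause of its parent.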






Re: nested phrase queries

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Apr 6, 2006, at 8:47 AM, Michael Dodson wrote:
> Can phrase queries be nested the same way boolean queries can be  
> nested?

Yes... using SpanNearQuery instead of PhraseQuery.

> I want a user query to be translated into a boolean query (say, x  
> AND (y OR z)), and I want  those terms to be within a certain  
> distance of each other (approximately within the same sentence, so  
> the slop would be about 7).   I then want to find documents where  
> that phrase is within a certain distance of another term (in this  
> case an image name).  So in pseudo code I would have something like
>
> PhraseQuery query1
> Set slop to 7
>
> PhraseQuery query2
> Set slop to 50
>
> Add boolean terms to query1
> Add query1 and imageName to query2
>
> Search
>
> I think I could break down my initial boolean query and turn query1  
> into two phrase queries which are OR'ed together (so have query1a  
> with "x AND y" with slop 7, and query1b with "x AND z" and slop 7,  
> both added to a BooleanQuery) but I still need to combine that  
> query with the image name in query2.

QueryParser itself does not support the SpanQuery family, but the
surround query parser (see contrib/surround in the codebase) does,
using an alternate syntax.  So, depending on your needs, you may need
to create some sort of parser to allow humans to enter such queries.

	Erik
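
To illustrate why span-style queries can nest where PhraseQuery cannot, here is a toy model in plain Java. It is not Lucene code and not how Lucene implements spans; every name in it (NestedSpanSketch, SpanQ, term, or, near, userQuery) is hypothetical, and the slop arithmetic is a simplification. The point is only that a "near" clause yields spans, so its result can itself be a clause of an outer "near". In Lucene the corresponding roles are played by SpanTermQuery, SpanOrQuery and SpanNearQuery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of nested proximity matching over one tokenized field.
// This is NOT Lucene code; every name below is hypothetical.
class NestedSpanSketch {

    // A span covers token positions [start, end).
    static final class Span {
        final int start, end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    // A span query maps a token list to the spans it matches.
    interface SpanQ {
        List<Span> spans(List<String> tokens);
    }

    // Matches each occurrence of a single term.
    static SpanQ term(String text) {
        return tokens -> {
            List<Span> out = new ArrayList<>();
            for (int i = 0; i < tokens.size(); i++) {
                if (tokens.get(i).equals(text)) out.add(new Span(i, i + 1));
            }
            return out;
        };
    }

    // Matches wherever any of the clauses matches.
    static SpanQ or(SpanQ... clauses) {
        return tokens -> {
            List<Span> out = new ArrayList<>();
            for (SpanQ c : clauses) out.addAll(c.spans(tokens));
            return out;
        };
    }

    // Unordered proximity of two clauses: at most `slop` tokens may lie
    // between the two sub-spans (simplified; overlapping spans are skipped).
    // The result is again a span, which is what makes nesting possible.
    static SpanQ near(int slop, SpanQ a, SpanQ b) {
        return tokens -> {
            List<Span> out = new ArrayList<>();
            for (Span x : a.spans(tokens)) {
                for (Span y : b.spans(tokens)) {
                    int start = Math.min(x.start, y.start);
                    int end = Math.max(x.end, y.end);
                    int gap = (end - start) - (x.end - x.start) - (y.end - y.start);
                    if (gap >= 0 && gap <= slop) out.add(new Span(start, end));
                }
            }
            return out;
        };
    }

    // The query from the question: x AND (y OR z) within slop 7,
    // and that whole match within slop 50 of the image name.
    static SpanQ userQuery(String imageName) {
        SpanQ sentence = near(7, term("x"), or(term("y"), term("z")));
        return near(50, sentence, term(imageName));
    }

    // True if the query matches anywhere in the whitespace-tokenized text.
    static boolean matches(SpanQ q, String text) {
        return !q.spans(Arrays.asList(text.split("\\s+"))).isEmpty();
    }

    public static void main(String[] args) {
        SpanQ q = userQuery("img42.png");
        System.out.println(matches(q, "the x value rises before z falls see img42.png"));
    }
}
```

The AND in "x AND (y OR z)" needs no separate operator here, because a near clause only matches when both of its sub-clauses do.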

