Posted to java-user@lucene.apache.org by Benoit Mercier <be...@member.fsf.org> on 2010/03/23 05:58:16 UTC

BooleanQuery and SpanQuery : how to get « combined » spans?

Hi,

I would like to build a query that combines a BooleanQuery (several
clauses) with a SpanQuery (a SpanNearQuery), where both are mandatory.
That sounds simple, but I also need to work with the spans returned by
this combined query.

I know that I could use a Filter, but my goal is to get the spans from
the « combined » query: BooleanQuery + SpanQuery. Even if I filter my
BooleanQuery with the SpanQuery, the spans returned by
SpanQuery.getSpans(reader) are not « filtered » by the BooleanQuery.
I understand this behaviour, since getting spans from a SpanQuery does
not require executing the query.

My current implementation first runs the BooleanQuery filtered by the
SpanQuery. I then get the spans from the SpanQuery and discard every
span whose doc is not among the score docs returned by the filtered
BooleanQuery. Is there a more efficient, simpler, or cleverer way to
reach the same goal?

Thank you very much in advance for your advice.

Best regards,

mercibe

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: BooleanQuery and SpanQuery : how to get « combined » spans?

Posted by Benoit Mercier <be...@member.fsf.org>.
Thank you, Grant. I will try the approach you suggested. It reassures me
that I wasn't too far off ;-)
mercibe



Re: BooleanQuery and SpanQuery : how to get « combined » spans?

Posted by Grant Ingersoll <gs...@apache.org>.
On Mar 23, 2010, at 12:58 AM, Benoit Mercier wrote:

> Hi,
> 
> I would like to write a query composed of a BooleanQuery (several clauses) and a SpanQuery (SpanNearQuery),  where both are mandatory.  Sounds simple but I have to work on spans returned by this query.
> 
> I know that I could use a Filter, but my goal is to get the spans from the « combined » query : BooleanQuery + SpanQuery.  Even if I filter my BooleanQuery with the SpanQuery, spans returned by the SpanQuery.getSpans(reader) are not « filtered » by the BooleanQuery.  Since executing the query is not needed to get spans from a SpanQuery I understand this behaviour.
> 
> My current implementation first runs the BooleanQuery filtered by the SpanQuery.  I then get the spans from the SpanQuery and remove from them all docs that are not in the score docs returned by the filtered BooleanQuery. Is there a more efficient, simple or clever way to reach the same goal?
> 
> Thank you very much in advance for your advice.


If you are on 3.x:

I think you could turn this around. Get a filter from your BooleanQuery, get its DocIdSet, and then advance through the Spans and the DocIdSetIterator together, as both only move forward. For each span, check whether its doc is in the filter or not.
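
The lockstep iteration described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene code: a sorted int array stands in for the DocIdSetIterator, a list of hypothetical SpanHit objects stands in for the Spans, and the class and method names are made up for the example. In real 3.x code the two cursors would presumably be driven by the Spans and DocIdSetIterator objects themselves.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in sketch: plain Java types model Lucene's Spans and DocIdSetIterator. */
class SpanDocIdIntersection {

    /** One span match: doc id plus start/end positions (hypothetical holder type). */
    static final class SpanHit {
        final int doc, start, end;

        SpanHit(int doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "doc=" + doc + " start=" + start + " end=" + end;
        }
    }

    /**
     * Keeps only the spans whose doc also appears in allowedDocs.
     * Both inputs must be sorted by ascending doc id, so each cursor only moves forward.
     */
    static List<SpanHit> filterSpans(List<SpanHit> spans, int[] allowedDocs) {
        List<SpanHit> kept = new ArrayList<>();
        int i = 0; // cursor into allowedDocs; plays the role of the doc id iterator
        for (SpanHit span : spans) {
            // Advance the doc cursor to the first allowed doc >= the span's doc.
            while (i < allowedDocs.length && allowedDocs[i] < span.doc) {
                i++;
            }
            if (i == allowedDocs.length) {
                break; // no allowed docs left, so no later span can match
            }
            if (allowedDocs[i] == span.doc) {
                kept.add(span); // several spans may share a doc, so i stays put
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<SpanHit> spans = new ArrayList<>();
        spans.add(new SpanHit(1, 0, 3));
        spans.add(new SpanHit(4, 2, 5));
        spans.add(new SpanHit(4, 10, 12));
        spans.add(new SpanHit(7, 1, 4));
        int[] allowedDocs = {2, 4, 9}; // docs matched by the BooleanQuery (made-up data)
        for (SpanHit hit : filterSpans(spans, allowedDocs)) {
            System.out.println(hit);
        }
    }
}
```

With the made-up data in main, only the two spans in doc 4 survive, and neither cursor ever moves backwards.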

In 2.x, I think you can get the BitSet from the filter and then look up directly whether the current span's doc is set in it.
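
A minimal sketch of that lookup, assuming the filter's bits are available as a java.util.BitSet indexed by doc id. The spans are modelled as plain {doc, start, end} int arrays, and the class and method names are hypothetical, chosen only for this example:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Stand-in sketch: int[] triples model span matches; no Lucene types are used. */
class SpanBitSetLookup {

    /** Returns the spans ({doc, start, end} triples) whose doc bit is set in allowed. */
    static List<int[]> filterSpans(List<int[]> spans, BitSet allowed) {
        List<int[]> kept = new ArrayList<>();
        for (int[] span : spans) {
            if (allowed.get(span[0])) { // random-access lookup by doc id
                kept.add(span);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        BitSet allowed = new BitSet();
        allowed.set(4); // docs matched by the BooleanQuery (made-up data)
        allowed.set(9);

        List<int[]> spans = new ArrayList<>();
        spans.add(new int[] {1, 0, 3});
        spans.add(new int[] {4, 2, 5});
        spans.add(new int[] {9, 7, 8});

        for (int[] span : filterSpans(spans, allowed)) {
            System.out.println("doc=" + span[0] + " start=" + span[1] + " end=" + span[2]);
        }
    }
}
```

Because the bit set supports random access, the spans do not need a second cursor here; the check is a single get per span.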

In either case, I don't think this will be much of a performance hit, as it is all forward-only iteration.

-Grant