Posted to java-user@lucene.apache.org by Carsten Schnober <sc...@ids-mannheim.de> on 2012/12/06 10:54:55 UTC

SpanQuery and Bits

Hi,
I have a problem understanding and applying the BitSets concept in
Lucene 4.0. Unfortunately, there does not seem to be a lot of
documentation about the topic.

The general task is to extract the Spans matching a SpanQuery, which works
with the following snippet:

for (AtomicReaderContext atomic : reader.getContext().leaves()) {
  Spans spans = query.getSpans(atomic, new Bits.MatchAllBits(0), termContexts);
  while (spans.next()) {
    // extract payloads etc.
  }
}

I understand that the acceptDocs parameter in SpanQuery.getSpans()
restricts the search to a set of documents. In the example given above,
it searches all documents (Bits.MatchAllBits), right?
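
To make my reading of acceptDocs concrete, here is a minimal sketch of the
idea. It deliberately does not use Lucene classes: DocBits, matchAll,
fromBitSet and restrict are hypothetical stand-in names, and the real
getSpans() implementation may well apply acceptDocs differently. It only
shows a match-all set letting every candidate document through, while a
BitSet-backed set lets through only the documents whose bit is set.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Stand-in types only: none of these are Lucene classes.
class AcceptDocsSketch {

    /** Hypothetical stand-in for a read-only bit set such as Lucene's Bits. */
    interface DocBits {
        boolean get(int docId);
        int length();
    }

    /** Accepts every document ID (the role Bits.MatchAllBits plays above). */
    static DocBits matchAll(final int maxDoc) {
        return new DocBits() {
            public boolean get(int docId) { return true; }
            public int length() { return maxDoc; }
        };
    }

    /** Accepts only the document IDs that are set in the given BitSet. */
    static DocBits fromBitSet(final BitSet accepted, final int maxDoc) {
        return new DocBits() {
            public boolean get(int docId) { return accepted.get(docId); }
            public int length() { return maxDoc; }
        };
    }

    /** Returns the candidate document IDs that pass the acceptDocs check. */
    static List<Integer> restrict(int[] candidates, DocBits acceptDocs) {
        List<Integer> hits = new ArrayList<Integer>();
        for (int docId : candidates) {
            if (acceptDocs.get(docId)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Documents a span query would match without any restriction.
        int[] candidates = {0, 2, 3, 5};

        // Documents that some other query (e.g. a BooleanQuery) matched.
        BitSet accepted = new BitSet(6);
        accepted.set(2);
        accepted.set(5);

        System.out.println(restrict(candidates, matchAll(6)));             // [0, 2, 3, 5]
        System.out.println(restrict(candidates, fromBitSet(accepted, 6))); // [2, 5]
    }
}
```

In this model the document IDs are local to one segment, which is why I
assume a separate set of bits would be needed for each AtomicReaderContext.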

What I would like to do is generate a Bits object that is based on a
BooleanQuery beforehand in order to restrict the search through
getSpans() to a set of documents that contain certain terms.
I also have a MultiReader object that handles multiple indexes.
My intuitive approach would be to apply a QueryWrapperFilter like this:

MultiReader reader = ...
BooleanQuery bq = ...
DocIdSet bitset = ???;
Filter filter = new QueryWrapperFilter(bq);
for (AtomicReaderContext context : reader.getContext().leaves()) {
  filter.getDocIdSet(context, new Bits.MatchAllBits(0));
}

The obvious question is: how do I handle the context bitsets returned by
getDocIdSet() correctly so that I can pass the 'bitset' variable to the
getSpans() call?

Or am I on the wrong path for this kind of problem?
Thanks!
Carsten


-- 
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP                 | http://korap.ids-mannheim.de
Tel. +49-(0)621-43740789      | schnober@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
