Posted to java-user@lucene.apache.org by "Allison, Timothy B." <ta...@mitre.org> on 2016/04/12 16:07:20 UTC

migrating to 6.0 -- how to apply filter to getSpans

On the living GitHub version of LUCENE-5317, I'm trying to migrate to 6.0, and most of it is fairly clear.

However, how do I modify the following code to return spans only from documents that match the -Filter- Query?

For each LeafReaderContext, I used to get a DocIdSet, call the iterator on that, and then iterate through the DocIdSetIterator along with the spans to retrieve spans in documents matching the Filter.


for (LeafReaderContext ctx : searcher.getIndexReader().leaves()) {
  DocIdSet filterSet = filter.getDocIdSet(ctx, ctx.reader().getLiveDocs());
  if (filterSet == null) {
    // no matches in this leaf; move on to the next one
    continue;
  }

  Spans spans = w.getSpans(ctx, SpanWeight.Postings.POSITIONS);
  if (spans == null) {
    continue;
  }
  DocIdSetIterator filterItr = filterSet.iterator();

.....
}

And then iterate through the filterItr and spans like so...


while (true) {
  if (spansDoc == DocIdSetIterator.NO_MORE_DOCS) {
    break;
  }
  filterDoc = filterItr.advance(spansDoc);
  if (filterDoc == DocIdSetIterator.NO_MORE_DOCS) {
    break;
  } else if (filterDoc > spansDoc) {
    while (spansDoc <= filterDoc) {
      spansDoc = spans.nextDoc();
      if (spansDoc == filterDoc) {
        boolean cont = visit(leafCtx, spans, visitor);
        if (! cont) {
          return false;
        }

      } else {
        continue;
      }
    }
  } else if (filterDoc == spansDoc) {
    boolean cont = visit(leafCtx, spans, visitor);
    if (! cont) {
      return false;
    }
    //then iterate spans
    spansDoc = spans.nextDoc();
  } else if (filterDoc < spansDoc) {
    throw new IllegalArgumentException("FILTER doc is < spansdoc!!!");
  } else {
    throw new IllegalArgumentException("Something horrible happened");
  }
}



RE: migrating to 6.0 -- how to apply filter to getSpans

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Solution (I think): create a Weight for the filter query from the searcher, and then call scorer() on that Weight for each LeafReaderContext:

Weight searcherWeight = searcher.createWeight(filter, false);
for (LeafReaderContext ctx : searcher.getIndexReader().leaves()) {
  Scorer leafReaderContextScorer = searcherWeight.scorer(ctx);
  // scorer() returns null when no document in this leaf matches the filter
  if (leafReaderContextScorer == null) {
    continue;
  }

  Spans spans = w.getSpans(ctx, SpanWeight.Postings.POSITIONS);
  if (spans == null) {
    continue;
  }
  DocIdSetIterator filterItr = leafReaderContextScorer.iterator();
  if (filterItr == null) {
    continue;
  }
  boolean cont = visitLeafReader(ctx, spans, filterItr, visitor);
  ...
}
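
For anyone following along, the per-leaf walk over the spans and the filter iterator boils down to a leapfrog intersection of two sorted doc-id streams. Below is a minimal, self-contained sketch of that idea. It deliberately does not use Lucene classes: SortedDocs and LeapfrogSketch are hypothetical stand-ins I made up for illustration, with nextDoc()/advance() only loosely modeled on DocIdSetIterator. It is not the actual visitLeafReader code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a doc-id iterator over a sorted array; not a Lucene class. */
final class SortedDocs {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs; // strictly increasing doc ids
    private int idx = -1;     // -1 means "not positioned yet"

    SortedDocs(int... docs) {
        this.docs = docs;
    }

    /** Current doc id: -1 before the first call to nextDoc(), NO_MORE_DOCS when exhausted. */
    int docID() {
        if (idx < 0) {
            return -1;
        }
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    /** Moves to the next doc id, or returns NO_MORE_DOCS when exhausted. */
    int nextDoc() {
        if (idx < docs.length) {
            idx++;
        }
        return docID();
    }

    /** Moves past the current doc to the first doc id >= target, or NO_MORE_DOCS. */
    int advance(int target) {
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }
}

final class LeapfrogSketch {
    /** Returns the doc ids present in both iterators, in increasing order. */
    static List<Integer> intersect(SortedDocs spans, SortedDocs filter) {
        List<Integer> matches = new ArrayList<>();
        int spansDoc = spans.nextDoc();
        int filterDoc = filter.nextDoc();
        while (spansDoc != SortedDocs.NO_MORE_DOCS && filterDoc != SortedDocs.NO_MORE_DOCS) {
            if (spansDoc < filterDoc) {
                // spans is behind: jump it forward to the filter's position
                spansDoc = spans.advance(filterDoc);
            } else if (filterDoc < spansDoc) {
                // filter is behind: jump it forward to the spans' position
                filterDoc = filter.advance(spansDoc);
            } else {
                // both sit on the same doc; this is where a visit(...) call would go
                matches.add(spansDoc);
                spansDoc = spans.nextDoc();
                filterDoc = filter.nextDoc();
            }
        }
        return matches;
    }
}
```

In the real code the "match" branch would walk the span positions for that document instead of collecting the doc id, and could stop early if the visitor asks to. Whichever iterator is behind is always the one that gets advanced, so neither side is ever asked to move backwards.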
