Posted to java-user@lucene.apache.org by Kenny Wong <kw...@proofpoint.com> on 2019/01/24 23:56:13 UTC

Nested SpanQuery issue

Hi,

We are unable to match the text

    "one two three four five six"

using the following query (a small reproducer is at the bottom):

    spanNear([spanNear([f:one, spanOr([f:two, f:three])], 1, true), f:five], 1, true)

In human-readable form this is "one W~1 (two OR three) W~1 five", that is: ("one" followed within a slop of 1 by "two" or "three"), followed in turn within a slop of 1 by "five".

We think it should match as "<b>one</b> two <b>three</b> four <b>five</b>": only "two" sits between "one" and "three", and only "four" sits between "three" and "five", so a slop of 1 looks sufficient for each spanNear. However, the inner spanNear seems to accept "one two" as satisfying its criteria and never considers "one ... three", which is what the overall match requires. If we increase the slops to 2, we do get a match.
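For reference, the position arithmetic behind this expectation can be sketched in plain Java without Lucene. This is only a toy model: the `Span` class and the helper names are made up for illustration, it assumes that an ordered slop counts the token positions between the end of one clause and the start of the next, and `matchesFirstInnerOnly` encodes our guess about the observed behavior, not Lucene's actual algorithm.

```java
import java.util.Arrays;
import java.util.List;

public class SlopSketch {
    // Token positions of the indexed text under whitespace tokenization.
    static final List<String> TOKENS =
        Arrays.asList("one", "two", "three", "four", "five", "six");

    /** A span over token positions [start, end); end is exclusive. */
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    /** Span covering the single occurrence of a term in TOKENS. */
    static Span term(String text) {
        int pos = TOKENS.indexOf(text);
        return new Span(pos, pos + 1);
    }

    /** Ordered two-clause near: b must start at most `slop` positions after a ends. */
    static Span near(Span a, Span b, int slop) {
        int gap = b.start - a.end;
        return (gap >= 0 && gap <= slop) ? new Span(a.start, b.end) : null;
    }

    /** Tries every alternative of (two OR three) for the inner clause. */
    static boolean matchesExhaustive(int slop) {
        for (String alt : new String[] {"two", "three"}) {
            Span inner = near(term("one"), term(alt), slop);
            if (inner != null && near(inner, term("five"), slop) != null) {
                return true;
            }
        }
        return false;
    }

    /** Hypothetical model of what we observe: only the inner match "one two" is tried. */
    static boolean matchesFirstInnerOnly(int slop) {
        Span inner = near(term("one"), term("two"), slop);
        return inner != null && near(inner, term("five"), slop) != null;
    }

    public static void main(String[] args) {
        System.out.println(matchesExhaustive(1));     // true: "one three" ends 1 position before "five"
        System.out.println(matchesFirstInnerOnly(1)); // false: "one two" ends 2 positions before "five"
        System.out.println(matchesFirstInnerOnly(2)); // true: consistent with a match at slop 2
    }
}
```

Under these assumptions the exhaustive variant matches at slop 1, while the variant that stops at "one two" only matches at slop 2, which is the behavior we see.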

Could this be a bug with SpanNearQuery?

Thank you,
Kenny Wong

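// Imports assume the Lucene 7.x package layout (the version is not stated above).
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.RAMDirectory;

public class LuceneTest {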

    public static void main(String[] args) throws Exception {
        RAMDirectory mem = new RAMDirectory();
        // Index a single document; try-with-resources closes the writer.
        try (IndexWriter writer = new IndexWriter(mem,
                new IndexWriterConfig(new WhitespaceAnalyzer()))) {
            Document doc = new Document();
            Field f = new TextField("f", "one two three four five six", Store.NO);
            doc.add(f);
            writer.addDocument(doc);
        }

        // Ordered, slop 1 at both levels: (one near (two OR three)) near five
        SpanQuery q = newSpanNear(1,
            newSpanNear(1, newSpanTerm("one"), newSpanOr(newSpanTerm("two"), newSpanTerm("three"))),
            newSpanTerm("five"));

        try (DirectoryReader reader = DirectoryReader.open(mem)) {
            TopDocs topDocs = new IndexSearcher(reader).search(q, 1);
            // We expect true here, but with a slop of 1 the document is not matched.
            System.out.println(1 == topDocs.totalHits);
        }
    }

    static SpanQuery newSpanTerm(String text) {
        return new SpanTermQuery(new Term("f", text));
    }

    static SpanQuery newSpanNear(int slop, SpanQuery... clauses) {
        return new SpanNearQuery(clauses, slop, true);
    }

    static SpanQuery newSpanOr(SpanQuery... clauses) {
        return new SpanOrQuery(clauses);
    }
}