Posted to dev@lucene.apache.org by "Andrés de la Peña (JIRA)" <ji...@apache.org> on 2016/04/26 11:27:12 UTC

[jira] [Created] (LUCENE-7255) Paging with SortingMergePolicy and EarlyTerminatingSortingCollector

Andrés de la Peña created LUCENE-7255:
-----------------------------------------

             Summary: Paging with SortingMergePolicy and EarlyTerminatingSortingCollector
                 Key: LUCENE-7255
                 URL: https://issues.apache.org/jira/browse/LUCENE-7255
             Project: Lucene - Core
          Issue Type: Bug
    Affects Versions: 5.5, 5.4, 5.3, 6.0
            Reporter: Andrés de la Peña


{{EarlyTerminatingSortingCollector}} does not seem to work when it is used with a {{TopDocsCollector}} that searches for documents after a certain {{FieldDoc}}. That is, it can't be used for paging. The following code reproduces the problem:
{code}
// Sort to be used both with merge policy and queries
Sort sort = new Sort(new SortedNumericSortField(FIELD_NAME, SortField.Type.INT));

// Create directory
RAMDirectory directory = new RAMDirectory();

// Setup merge policy
TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
SortingMergePolicy sortingMergePolicy = new SortingMergePolicy(tieredMergePolicy, sort);

// Setup index writer
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new SimpleAnalyzer());
indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
indexWriterConfig.setMergePolicy(sortingMergePolicy);
IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

// Index values
for (int i = 1; i <= 1000; i++) {
    Document document = new Document();
    document.add(new NumericDocValuesField(FIELD_NAME, i));
    indexWriter.addDocument(document);
}

// Force index merge to ensure early termination
indexWriter.forceMerge(1, true);
indexWriter.commit();

// Create index searcher
IndexReader reader = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);

// Paginated read
int pageSize = 10;
FieldDoc pageStart = null;
while (true) {

    logger.info("Collecting page starting at: {}", pageStart);

    Query query = new MatchAllDocsQuery();

    TopFieldCollector tfc = TopFieldCollector.create(sort, pageSize, pageStart, true, false, false);
    EarlyTerminatingSortingCollector collector = new EarlyTerminatingSortingCollector(tfc, sort, pageSize, sort);
    searcher.search(query, collector);
    ScoreDoc[] scoreDocs = tfc.topDocs().scoreDocs;
    for (ScoreDoc scoreDoc : scoreDocs) {
        pageStart = (FieldDoc) scoreDoc;
        logger.info("FOUND {}", scoreDoc);
    }

    logger.info("Terminated early: {}", collector.terminatedEarly());

    if (scoreDocs.length < pageSize) break;
}

// Close
reader.close();
indexWriter.close();
directory.close();
{code}
The query for the second page doesn't return any results. However, we get the expected results if we don't wrap the {{TopFieldCollector}} with the {{EarlyTerminatingSortingCollector}}.
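
One possible explanation, which is an assumption and not something confirmed in this report, is that the early-terminating wrapper counts every document it passes to the wrapped collector, while the paging collector silently discards documents that sort at or before the {{FieldDoc}} of the previous page. The toy model below is plain Java and does not use Lucene; the class {{PagingEarlyTerminationSketch}} and the method {{collectPage}} are hypothetical names that only illustrate how such an interaction would produce an empty second page on a single sorted segment:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT Lucene code: a single segment whose values are already
// stored in sort order, a paging filter, and an optional early-termination
// rule that counts visited documents instead of accepted hits.
public class PagingEarlyTerminationSketch {

    // Returns the page of values that sort strictly after `after`
    // (or the first page when `after` is null).
    static List<Integer> collectPage(int[] sortedValues, Integer after,
                                     int pageSize, boolean earlyTerminate) {
        List<Integer> page = new ArrayList<>();
        int visited = 0;
        for (int value : sortedValues) {
            visited++;
            // Paging filter: skip everything up to and including the last hit
            // of the previous page.
            boolean skipped = after != null && value <= after;
            if (!skipped && page.size() < pageSize) {
                page.add(value);
            }
            // Assumed termination rule: stop after `pageSize` visited documents,
            // whether or not any of them were accepted by the paging filter.
            if (earlyTerminate && visited >= pageSize) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1; // same values as the reproduction: 1..1000
        }
        System.out.println("Page 1, early termination: " + collectPage(values, null, 10, true));
        System.out.println("Page 2, early termination: " + collectPage(values, 10, 10, true));
        System.out.println("Page 2, no early termination: " + collectPage(values, 10, 10, false));
    }
}
```

In this model the second page is empty with early termination because the ten visited documents are exactly the ten that the paging filter skips. Whether the real collectors behave this way would have to be checked against the Lucene sources.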


