You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Andrés de la Peña (JIRA)" <ji...@apache.org> on 2016/04/26 11:27:12 UTC
[jira] [Created] (LUCENE-7255) Paging with SortingMergePolicy and
EarlyTerminatingSortingCollector
Andrés de la Peña created LUCENE-7255:
-----------------------------------------
Summary: Paging with SortingMergePolicy and EarlyTerminatingSortingCollector
Key: LUCENE-7255
URL: https://issues.apache.org/jira/browse/LUCENE-7255
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 5.5, 5.4, 5.3, 6.0
Reporter: Andrés de la Peña
{{EarlyTerminatingSortingCollector}} seems to don't work when used with a {{TopDocsCollector}} searching for documents after a certain {{FieldDoc}}. That is, it can't be used for paging. The following code allows to reproduce the problem:
{code}
// Sort to be used both with merge policy and queries
Sort sort = new Sort(new SortedNumericSortField(FIELD_NAME, SortField.Type.INT));
// Create directory
RAMDirectory directory = new RAMDirectory();
// Setup merge policy
TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
SortingMergePolicy sortingMergePolicy = new SortingMergePolicy(tieredMergePolicy, sort);
// Setup index writer
IndexWriterConfig indexWriterConfig = new IndexWriterConfig(new SimpleAnalyzer());
indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
indexWriterConfig.setMergePolicy(sortingMergePolicy);
IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
// Index values
for (int i = 1; i <= 1000; i++) {
Document document = new Document();
document.add(new NumericDocValuesField(FIELD_NAME, i));
indexWriter.addDocument(document);
}
// Force index merge to ensure early termination
indexWriter.forceMerge(1, true);
indexWriter.commit();
// Create index searcher
IndexReader reader = DirectoryReader.open(directory);
IndexSearcher searcher = new IndexSearcher(reader);
// Paginated read
int pageSize = 10;
FieldDoc pageStart = null;
while (true) {
logger.info("Collecting page starting at: {}", pageStart);
Query query = new MatchAllDocsQuery();
TopDocsCollector tfc = TopFieldCollector.create(sort, pageSize, pageStart, true, false, false);
EarlyTerminatingSortingCollector collector = new EarlyTerminatingSortingCollector(tfc, sort, pageSize, sort);
searcher.search(query, collector);
ScoreDoc[] scoreDocs = tfc.topDocs().scoreDocs;
for (ScoreDoc scoreDoc : scoreDocs) {
pageStart = (FieldDoc) scoreDoc;
logger.info("FOUND {}", scoreDoc);
}
logger.info("Terminated early: {}", collector.terminatedEarly());
if (scoreDocs.length < pageSize) break;
}
// Close
reader.close();
indexWriter.close();
directory.close();
{code}
The query for the second page doesn't return any results. However, it gets the expected results when if we don't wrap the {{TopFieldCollector}} with the {{EarlyTerminatingSortingCollector}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org