Posted to dev@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2019/10/30 23:43:59 UTC
questionable behavior of DirectoryReader.getIndexCommit() on NRT Readers
TL;DR: it seems like DirectoryReader.getIndexCommit() returns weird
results when using a "reopened" reader off of uncommitted IW changes.
Even though 2 different readers will expose different views of the index,
they will claim to refer to the same IndexCommit.
Slightly Longer Version...
The test below, if added to TestIndexWriterReader, will pass reliably
AFAICT -- but that seems really weird to me. Note in particular the
'nocommit' comments and the assertions that follow them -- these are very
different readers, exposing very different views of the index, yet they
claim to have the same "IndexCommit" & generation underpinning them, even
though some of the details of their IndexCommits differ.
This seems weird, and I'm wondering if it's a bug, or an underspecified
behavior (and we should beef up the docs to clarify what to expect), or if
it represents some "feature" whose value I'm not understanding?
public void testIndexCommitOfReaderAfterReopen() throws Exception {
  final Directory dir = newDirectory();
  final IndexWriterConfig iwc = new IndexWriterConfig(new MockAnalyzer(random()));
  final IndexWriter w = new IndexWriter(dir, iwc);
  try {
    final DirectoryReader r0 = DirectoryReader.open(w);
    try {
      assertEquals(0, r0.numDocs());
      final IndexCommit c0 = r0.getIndexCommit();
      assertEquals(0L, c0.getGeneration());
    } finally {
      r0.close();
    }

    w.addDocument(new Document());
    w.commit();

    final DirectoryReader r1 = DirectoryReader.open(w);
    try {
      assertEquals(1, r1.numDocs());
      final IndexCommit c1 = r1.getIndexCommit();
      assertEquals(1L, c1.getGeneration());
      assertEquals(1, c1.getSegmentCount());

      // add a second doc, but do NOT commit before reopening
      w.addDocument(new Document());

      final DirectoryReader r2 = DirectoryReader.openIfChanged(r1, w, true);
      try {
        assertNotNull(r2);
        assertFalse(r1.equals(r2));
        assertEquals(2, r2.numDocs());

        // nocommit: Why do the assertions pass?
        //
        // nocommit: If the readers are not the same, and refer to different
        // nocommit: "point in time" views of the index, then shouldn't they
        // nocommit: also return different 'getIndexCommit()' results?
        //
        // nocommit: Should the "realtime" reader return 'null' since its view
        // nocommit: of the index is not represented by a tangible commit?
        assertEquals(c1, r2.getIndexCommit());
        assertEquals(1L, r2.getIndexCommit().getGeneration());
        assertEquals(c1.getSegmentsFileName(), r2.getIndexCommit().getSegmentsFileName());

        // nocommit: particularly odd: even though the commits are ".equals()"
        // nocommit: and have the same generation & segments file,
        // nocommit: they have different segment counts and files
        assertEquals(2, r2.getIndexCommit().getSegmentCount());
        assertTrue(! c1.getFileNames().equals(r2.getIndexCommit().getFileNames()));
      } finally {
        r2.close();
      }
    } finally {
      r1.close();
    }
  } finally {
    w.close();
    dir.close();
  }
}
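For anyone who wants to poke at this outside the test framework, here is a rough sketch of one way to tell whether a reader's reported IndexCommit actually corresponds to a commit on disk. The class CommitCheck and the method matchesOnDiskCommit are hypothetical names made up for illustration, not existing Lucene API; the sketch only relies on DirectoryReader.listCommits(), DirectoryReader.directory() and the IndexCommit getters, and assumes a Lucene version that has them on the classpath. File names are compared as sets because Collection.equals() is not guaranteed to compare contents.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.List;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexNotFoundException;

/** Hypothetical helper for illustration only; not part of Lucene. */
public final class CommitCheck {
  private CommitCheck() {}

  /**
   * Returns true only if the directory holds a commit with the same generation
   * as the one the reader reports AND that commit lists the same files.
   */
  public static boolean matchesOnDiskCommit(DirectoryReader reader) throws IOException {
    final IndexCommit claimed = reader.getIndexCommit();
    final List<IndexCommit> onDisk;
    try {
      // Reads the commit points that physically exist in the directory.
      onDisk = DirectoryReader.listCommits(reader.directory());
    } catch (IndexNotFoundException e) {
      // Nothing has been committed yet, so no commit can back this reader.
      return false;
    }
    for (IndexCommit c : onDisk) {
      if (c.getGeneration() == claimed.getGeneration()) {
        // Compare contents as sets; Collection.equals() may be identity-based.
        return new HashSet<>(c.getFileNames()).equals(new HashSet<>(claimed.getFileNames()));
      }
    }
    return false;
  }
}
```

In the test above, I would expect a check like this to distinguish r1 (opened right after the commit) from r2 (reopened on top of uncommitted changes), even though their IndexCommits are ".equals()", but I have not verified that across Lucene versions.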
Backstory: I'm digging into the Solr snapshot/backup code with an
objective of fixing SOLR-13872, but before I move forward with changing
anything I want to make sure I backfill the existing snapshot/backup test
code with new test cases for all the code paths that seem underrepresented
in tests so I don't risk introducing new bugs. While trying to
understand/test/document how Solr behaves if a user tries to request a
snapshot/backup when using "softCommit" (at the Solr level), I noticed
this weird behavior of getting the same IndexCommit back from the
reader even after we'd re-opened a Reader/Searcher w/o committing.
-Hoss
http://www.lucidworks.com/