Posted to issues@lucene.apache.org by "Chris M. Hostetter (Jira)" <ji...@apache.org> on 2019/11/08 18:17:00 UTC
[jira] [Created] (SOLR-13908) Possible bugs when using
HdfsDirectoryFactory w/ softCommit=true + openSearcher=true
Chris M. Hostetter created SOLR-13908:
-----------------------------------------
Summary: Possible bugs when using HdfsDirectoryFactory w/ softCommit=true + openSearcher=true
Key: SOLR-13908
URL: https://issues.apache.org/jira/browse/SOLR-13908
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: hdfs
Reporter: Chris M. Hostetter
While working on SOLR-13872 something caught my eye that seems fishy....
*Background:*
SOLR-4916 introduced the API {{DirectoryFactory.searchersReserveCommitPoints()}} -- a method that {{SolrIndexSearcher}} uses to decide if it needs to explicitly save/release the {{IndexCommit}} point of its {{DirectoryReader}} with the {{IndexDeletionPolicyWrapper}}, for use on filesystems that don't in some way "protect" open files...
{code:title=SolrIndexSearcher}
    if (directoryFactory.searchersReserveCommitPoints()) {
      // reserve commit point for life of searcher
      core.getDeletionPolicy().saveCommitPoint(reader.getIndexCommit().getGeneration());
    }
{code}
{code:title=DirectoryFactory}
  /**
   * If your implementation can count on delete-on-last-close semantics
   * or throws an exception when trying to remove a file in use, return
   * false (eg NFS). Otherwise, return true. Defaults to returning false.
   *
   * @return true if factory impl requires that Searcher's explicitly
   * reserve commit points.
   */
  public boolean searchersReserveCommitPoints() {
    return false;
  }
{code}
{{HdfsDirectoryFactory}} is (still) the only {{DirectoryFactory}} implementation that returns {{true}}.
----
*Concern:*
As noted in LUCENE-9040, the behavior of {{DirectoryReader.getIndexCommit()}} is a little weird / underspecified when dealing with an "NRT" {{IndexReader}} (opened directly off of an {{IndexWriter}} using "un-committed" changes) ... which is exactly what {{SolrIndexSearcher}} is using in Solr setups that use {{softCommit=true&openSearcher=true}}.
In particular, the {{IndexCommit.getGeneration()}} value that will be used when {{SolrIndexSearcher}} executes {{core.getDeletionPolicy().saveCommitPoint(reader.getIndexCommit().getGeneration());}} will be (as of the current code) the {{generation}} of the last _hard_ commit -- meaning that new segment/data files since the last "hard commit" will not be protected from deletion if additional commits/merges happen on the index during the life of the {{SolrIndexSearcher}} -- either via concurrent rapid commits, or via {{commit=true&softCommit=false&openSearcher=false}}.
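As a rough illustration of the scenario described above, here is a toy model of the reasoning. It is not Solr or Lucene code: {{ToyDeletionPolicy}}, {{onCommit}}, {{protectedFiles}} and the segment file names are all hypothetical, and the model assumes that reserving a generation is the only thing keeping a searcher's files alive. It ignores any other protection the real {{IndexWriter}} may provide, so it shows the suspected gap, not a confirmed failure.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these types are Solr or Lucene classes.
class NrtReservationSketch {

    /** Hypothetical deletion policy: keeps the newest commit plus any reserved generation. */
    static class ToyDeletionPolicy {
        private final Map<Long, Set<String>> filesByGeneration = new HashMap<>();
        private final Set<Long> reserved = new HashSet<>();
        private long latestGeneration = -1;

        void onCommit(long generation, Set<String> files) {
            filesByGeneration.put(generation, files);
            latestGeneration = Math.max(latestGeneration, generation);
        }

        void saveCommitPoint(long generation) {
            reserved.add(generation);
        }

        /** Files that survive cleanup: those of the newest commit and of reserved commits. */
        Set<String> protectedFiles() {
            Set<String> keep = new HashSet<>();
            for (Map.Entry<Long, Set<String>> e : filesByGeneration.entrySet()) {
                if (e.getKey() == latestGeneration || reserved.contains(e.getKey())) {
                    keep.addAll(e.getValue());
                }
            }
            return keep;
        }
    }

    /** Returns the files the NRT searcher reads that the toy policy would not protect. */
    static Set<String> unprotectedSearcherFiles() {
        ToyDeletionPolicy policy = new ToyDeletionPolicy();

        // Hard commit, generation 1: one segment on disk.
        policy.onCommit(1L, Set.of("_0.cfs"));

        // A soft commit opens an NRT searcher that also reads the uncommitted segment _1,
        // but (per the concern above) it reserves the generation of the last hard commit.
        Set<String> searcherFiles = Set.of("_0.cfs", "_1.cfs");
        long generationSeenBySearcher = 1L;
        policy.saveCommitPoint(generationSeenBySearcher);

        // Later, a merge rewrites _0 and _1 into _2, followed by a hard commit (generation 2).
        policy.onCommit(2L, Set.of("_2.cfs"));

        // Whatever the searcher reads that is not protected is at risk of deletion.
        Set<String> atRisk = new TreeSet<>(searcherFiles);
        atRisk.removeAll(policy.protectedFiles());
        return atRisk;
    }

    public static void main(String[] args) {
        System.out.println("Unprotected files still in use: " + unprotectedSearcherFiles());
    }
}
```

In this model the reserved generation covers only {{_0.cfs}}, so the segment written after the last hard commit is the one left exposed while the searcher is still open.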
I have not investigated this in depth, but I believe there is risk here of unpredictable bugs when using HDFS in conjunction with {{softCommit=true&openSearcher=true}}.