Posted to issues@lucene.apache.org by "Michael McCandless (Jira)" <ji...@apache.org> on 2021/04/05 14:00:00 UTC

[jira] [Commented] (LUCENE-9889) Lucene (unexpected ) fsync on existing segments

    [ https://issues.apache.org/jira/browse/LUCENE-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314885#comment-17314885 ] 

Michael McCandless commented on LUCENE-9889:
--------------------------------------------

Thank you for opening this issue [~rahul196452@gmail.com]!

It is indeed weird that Lucene re-opens segment files that it wrote, closed, and fsync'd long ago, only to fsync them again.

There is some fun/exciting history here. Long ago, Lucene's {{IndexWriter}} kept track of which files were "dirty" (written recently and not yet fsync'd), but that tracking was complex and buggy and sometimes caused bad memory leaks. At one point we moved it from {{IndexWriter}} down into {{FSDirectory}}, and later we removed the dirty logic from {{FSDirectory}} entirely and switched to always fsync'ing every file. I agree this is odd and we should perhaps revisit that dirty logic.

Related issues: LUCENE-3237, LUCENE-5570, LUCENE-5588, LUCENE-6150 (this is where the dirty file tracking was removed from {{FSDirectory}}).
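
To make the idea of "dirty" tracking concrete, here is a minimal sketch in plain Java. It is not Lucene code and not the logic that was removed: the class {{DirtyFileTracker}} and its methods are hypothetical names invented for illustration, and only the {{FileChannel.open(..., StandardOpenOption.WRITE)}} call mirrors the line quoted in the issue description. The assumption is that a file is marked dirty once it has been written and closed, and that a commit fsyncs only the files still marked dirty.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collection;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not a Lucene class): remembers which files still need an fsync. */
final class DirtyFileTracker {
  private final Path dir;
  // Names of files that were written but not yet fsync'd.
  private final Set<String> dirty = ConcurrentHashMap.newKeySet();

  DirtyFileTracker(Path dir) {
    this.dir = dir;
  }

  /** Call once a file has been fully written and closed. */
  void markDirty(String name) {
    dirty.add(name);
  }

  /** Call when a file is deleted, so stale entries do not accumulate. */
  void onDelete(String name) {
    dirty.remove(name);
  }

  /** fsyncs only those of the given files that are still dirty; returns how many were synced. */
  int sync(Collection<String> names) throws IOException {
    int synced = 0;
    for (String name : names) {
      if (!dirty.contains(name)) {
        continue; // already fsync'd earlier, so the file is never re-opened
      }
      // Same open mode as the line quoted in the issue: WRITE for regular files.
      try (FileChannel channel = FileChannel.open(dir.resolve(name), StandardOpenOption.WRITE)) {
        channel.force(true); // flush file content and metadata to the storage device
      }
      dirty.remove(name);
      synced++;
    }
    return synced;
  }
}
```

With tracking like this, a second commit that references an already fsync'd segment file would skip it rather than re-open it in WRITE mode, which is the situation the reporter ran into. The cost is extra state that has to stay consistent with file creation and deletion, which is roughly where the earlier implementations ran into trouble.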

> Lucene (unexpected ) fsync on existing segments
> -----------------------------------------------
>
>                 Key: LUCENE-9889
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9889
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 7.7.2
>            Reporter: Rahul Goswami
>            Priority: Major
>
>  
> If one of the existing segment files is opened by another (say, a third-party) process, it can cause a parallel commit to fail with an error complaining that the index files are locked by another process. Upon debugging, I see that fsync is being called during commit on already existing segment files, and the failure to open the file in write mode causes this error. This should not be expected behavior, since there is no reason for a commit to open an existing segment file in WRITE mode to fsync it. Please note that in this case the index file was also part of a saved commit point, so there is all the more reason not to fsync it.
>  
> The line of code I am referring to is as below:
> try (final FileChannel file = FileChannel.open(fileToSync, isDir ? StandardOpenOption.READ : StandardOpenOption.WRITE))
>  
> in method fsync(Path fileToSync, boolean isDir) of the class file
>  
> lucene\core\src\java\org\apache\lucene\util\IOUtils.java
>  
>  
> Opening this Jira after discussion with Mike McCandless and Michael Sokolov on the dev mailing list here:
> [Lucene - Java Developer - Lucene (unexpected ) fsync on existing segments (nabble.com)|https://lucene.472066.n3.nabble.com/Lucene-unexpected-fsync-on-existing-segments-td4469731.html]
>  


