Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2023/01/02 09:27:31 UTC

[GitHub] [lucene] sebastiano1972 opened a new issue, #12059: Recurring index corruption

sebastiano1972 opened a new issue, #12059:
URL: https://github.com/apache/lucene/issues/12059

   We are experimenting with Elasticsearch deployed in Azure Container Instances (Debian + OpenJDK). The ES indices are stored on Azure file shares mounted via SMB (3.0). The Elasticsearch cluster is made up of 4 nodes, each with its own file share for storing its indices. We are experiencing recurring index corruption, specifically a "read past EOF" exception. I asked on the Elasticsearch forum, but the answer was generic and did not help beyond confirming that, from the ES point of view, ES should work on an SMB share as long as the share behaves like a local drive. As the underlying exception relates to a Lucene index, I was wondering if you could help. Specifically, can Lucene work on SMB? I can only find sparse information on this configuration and, while NFS seems to be a no-no, the situation for SMB is not that clear. Below is the exception we are getting.
   
   Many thanks.
   
   Seb
   
   ```
   java.io.IOException: read past EOF: NIOFSIndexInput(path="/bitnami/elasticsearch/data/indices/mS2bUbLtSeG0FSAMuKX7JQ/0/index/_ldsn_1.fnm") buffer: java.nio.HeapByteBuffer[pos=0 lim=1024 cap=1024] chunkLen: 1024 end: 2331: NIOFSIndexInput(path="/bitnami/elasticsearch/data/indices/mS2bUbLtSeG0FSAMuKX7JQ/0/index/_ldsn_1.fnm")
                 at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:200) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:291) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:55) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:39) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.codecs.CodecUtil.readBEInt(CodecUtil.java:667) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:184) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:253) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.codecs.lucene90.Lucene90FieldInfosFormat.read(Lucene90FieldInfosFormat.java:128) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.SegmentReader.initFieldInfos(SegmentReader.java:205) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:156) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.ReadersAndUpdates.createNewReaderWithLatestLiveDocs(ReadersAndUpdates.java:738) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.ReadersAndUpdates.swapNewReaderWithLatestLiveDocs(ReadersAndUpdates.java:754) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:678) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.ReaderPool.writeAllDocValuesUpdates(ReaderPool.java:251) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3743) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:591) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:381) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:355) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:345) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:170) ~[lucene-core-9.3.0.jar:?]
                 at org.elasticsearch.index.engine.ElasticsearchReaderManager.refreshIfNeeded(ElasticsearchReaderManager.java:48) ~[elasticsearch-8.4.1.jar:?]
                 at org.elasticsearch.index.engine.ElasticsearchReaderManager.refreshIfNeeded(ElasticsearchReaderManager.java:27) ~[elasticsearch-8.4.1.jar:?]
                 at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:167) ~[lucene-core-9.3.0.jar:?]
                 at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:240) ~[lucene-core-9.3.0.jar:?]
                 at org.elasticsearch.index.engine.InternalEngine$ExternalReaderManager.refreshIfNeeded(InternalEngine.java:355) ~[elasticsearch-8.4.1.jar:?]
                 at org.elasticsearch.index.engine.InternalEngine$ExternalReaderManager.refreshIfNeeded(InternalEngine.java:335) ~[elasticsearch-8.4.1.jar:?]
                 at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:167) ~[lucene-core-9.3.0.jar:?]
    
   ```
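
One way to read the message above: the reader believed the file was 2331 bytes long (`end: 2331`), yet the very first 1024-byte read (`pos=0`) hit end of file. The sketch below reproduces that symptom with plain `java.nio` on a local temp file. It is not Lucene code; the class and method names are hypothetical, and the truncation is only an assumed stand-in for whatever the storage layer did.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadPastEofSketch {

    /**
     * Writes a file of the given length, opens it for reading, optionally
     * truncates it behind the reader's back, and then reads the first chunk.
     * Returns the number of bytes read, or -1 if the channel reports EOF.
     */
    static int readFirstChunk(int length, boolean truncateBeforeRead) {
        try {
            Path file = Files.createTempFile("segment", ".fnm");
            try {
                Files.write(file, new byte[length]);
                try (FileChannel reader = FileChannel.open(file, StandardOpenOption.READ)) {
                    // The reader remembers this length, much like "end: 2331" in the trace.
                    long lengthSeenAtOpen = reader.size();
                    if (truncateBeforeRead) {
                        // Assumption for illustration: the data disappears after open.
                        try (FileChannel writer = FileChannel.open(file, StandardOpenOption.WRITE)) {
                            writer.truncate(0);
                        }
                    }
                    ByteBuffer buffer = ByteBuffer.allocate((int) Math.min(1024, lengthSeenAtOpen));
                    // A positional read returns -1 when the position is at or past the current size.
                    return reader.read(buffer, 0);
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int intact = readFirstChunk(2331, false);
        int truncated = readFirstChunk(2331, true);
        System.out.println("intact file: " + intact + " bytes read");
        System.out.println("truncated file: " + (truncated < 0 ? "read past EOF" : truncated + " bytes read"));
    }
}
```

If the real file on the share is shorter than the length recorded when it was opened, comparing the file size seen by each node with the size on the share itself may help narrow down where the bytes went missing.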


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler closed issue #12059: Recurring index corruption

Posted by GitBox <gi...@apache.org>.
uschindler closed issue #12059: Recurring index corruption
URL: https://github.com/apache/lucene/issues/12059




[GitHub] [lucene] sebastiano1972 commented on issue #12059: Recurring index corruption

Posted by GitBox <gi...@apache.org>.
sebastiano1972 commented on issue #12059:
URL: https://github.com/apache/lucene/issues/12059#issuecomment-1368804331

   Hi Uwe,
   
   thank you for your kind reply.
   
   To answer your question, we are experimenting with Azure Container Instances because of their relative simplicity, but they come with some limitations:
   
   - we cannot set the max_map_count value because we do not have access to the underlying host (https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html). Unfortunately, this is required to run an ES cluster, so we were forced to use NIOFS
   - ACIs only allow volume mappings using Azure File Shares, which only work with NFS or SMB.
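
For anyone wanting to confirm the first limitation from inside a container, here is a minimal sketch that reads the kernel value from `/proc/sys/vm/max_map_count` and compares it with a threshold. The class name is hypothetical, and the threshold of 262144 is an assumption taken from the Elasticsearch 8.x documentation; check the linked page for the version in use.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MaxMapCountCheck {

    // Assumed minimum, as documented for Elasticsearch 8.x; verify for your version.
    static final long REQUIRED = 262_144L;

    /** Parses the content of the proc file and compares it with the assumed minimum. */
    static boolean isSufficient(String procFileContent) {
        return Long.parseLong(procFileContent.trim()) >= REQUIRED;
    }

    public static void main(String[] args) throws IOException {
        Path procFile = Path.of("/proc/sys/vm/max_map_count");
        if (!Files.isReadable(procFile)) {
            System.out.println("Cannot read " + procFile + " (not Linux, or access is restricted)");
            return;
        }
        String content = Files.readString(procFile);
        System.out.println("vm.max_map_count = " + content.trim()
                + (isSufficient(content) ? " (sufficient)" : " (below " + REQUIRED + ")"));
    }
}
```

Reading the value needs no special privileges on a typical Linux host; changing it does, which is the part a managed container platform may not allow.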
   
   I will move this to the suggested mailing list.
   
   Thank you again.
   
   Seb
   
   




[GitHub] [lucene] uschindler commented on issue #12059: Recurring index corruption

Posted by GitBox <gi...@apache.org>.
uschindler commented on issue #12059:
URL: https://github.com/apache/lucene/issues/12059#issuecomment-1368787343

   Hi,
   Samba/CIFS generally works as a file store, in contrast to NFS, which should never be used as the file system for an index; but the problems you describe here are surely caused by the shared file system. Whenever possible, please put the store on a local disk and do not use shared/network file systems to store Lucene indexes. We can't give you any recommendations here; there are no known bugs in Lucene that could create the index corruption shown above.
   
   In short: please avoid NFS (under all circumstances) and avoid CIFS (where possible, especially under high load). Shared/network file systems should only be used for read-only indexes.
   
   Related question: we are wondering why you use NIOFSDirectory instead of MMapDirectory?
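
As background to that question, the two directory implementations differ mainly in how bytes reach the JVM: positional channel reads into a heap buffer (the `HeapByteBuffer` visible in the trace) versus a memory-mapped view of the file. The sketch below contrasts the two access styles using only `java.nio`. It does not use Lucene's classes, and every name in it is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadStrategiesSketch {

    /** Positional read through the channel, copying into a heap buffer. */
    static byte readByChannel(Path file, int position) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer one = ByteBuffer.allocate(1);
            if (channel.read(one, position) != 1) {
                throw new IOException("read past EOF at position " + position);
            }
            return one.get(0);
        }
    }

    /** Read through a memory mapping; each mapping counts against the kernel's map limit. */
    static byte readByMapping(Path file, int position) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.get(position);
        }
    }

    /** Writes the content to a temp file and reads one byte back both ways. */
    static byte[] readBothWays(byte[] content, int position) {
        try {
            Path file = Files.createTempFile("strategies", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, content);
            return new byte[] { readByChannel(file, position), readByMapping(file, position) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = new byte[100];
        for (int i = 0; i < content.length; i++) {
            content[i] = (byte) i;
        }
        byte[] result = readBothWays(content, 42);
        System.out.println("channel read: " + result[0] + ", mapped read: " + result[1]);
    }
}
```

Both paths return the same bytes on a healthy local file. The mapped path depends on virtual memory mappings, which is why a restriction on `max_map_count` can push a deployment toward the channel-based path.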




[GitHub] [lucene] uschindler commented on issue #12059: Recurring index corruption

Posted by GitBox <gi...@apache.org>.
uschindler commented on issue #12059:
URL: https://github.com/apache/lucene/issues/12059#issuecomment-1368788012

   This issue would be better discussed on the mailing list; it is not a bug/issue at all.

