Posted to issues@lucene.apache.org by "Adrien Grand (Jira)" <ji...@apache.org> on 2020/07/15 06:56:00 UTC

[jira] [Resolved] (LUCENE-9428) merge index failed with checksum failed (hardware problem?)

     [ https://issues.apache.org/jira/browse/LUCENE-9428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-9428.
----------------------------------
    Resolution: Invalid

When you see this error message, the chances that it is due to a bug in Lucene are very small. That is exactly why we added checksums to index files: so that we could distinguish bugs in Lucene from hardware issues or bugs beneath Lucene (JDK, kernel). Even if your disks don't report an error, silent corruption is always possible. I would suggest checking your RAM and making sure that you are on an up-to-date kernel and JDK too.
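
To illustrate the idea, here is a minimal sketch of a checksum footer, using only the JDK's java.util.zip.CRC32. It is not Lucene's actual CodecUtil code: the class name, the method names and the footer layout (a bare 8-byte CRC32 appended to the payload) are assumptions made for this example. The writer stores a checksum of the bytes it wrote, the reader recomputes it, and a single flipped bit anywhere between the two shows up as a mismatch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration; this is not Lucene's CodecUtil.
public class ChecksumFooterSketch {

    /** Appends an 8-byte big-endian CRC32 of the payload as a footer. */
    static byte[] writeWithFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Long.BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    /** Recomputes the CRC32 of the payload and compares it with the footer. */
    static void checkFooter(byte[] file) {
        int payloadLength = file.length - Long.BYTES;
        long expected = ByteBuffer.wrap(file, payloadLength, Long.BYTES).getLong();
        CRC32 crc = new CRC32();
        crc.update(file, 0, payloadLength);
        long actual = crc.getValue();
        if (expected != actual) {
            throw new IllegalStateException("checksum failed: expected="
                    + Long.toHexString(expected) + " actual=" + Long.toHexString(actual));
        }
    }

    public static void main(String[] args) {
        byte[] file = writeWithFooter("terms dictionary bytes".getBytes(StandardCharsets.UTF_8));
        checkFooter(file); // passes: the bytes are unchanged since they were written

        byte[] corrupted = Arrays.copyOf(file, file.length);
        corrupted[3] ^= 0x01; // simulate a single bit flipped by faulty RAM or disk
        try {
            checkFooter(corrupted);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // reports the expected/actual mismatch
        }
    }
}
```

Since the writer computed the checksum from the bytes it intended to write, a mismatch at read time means the bytes changed after that point, which is why this error usually points below Lucene rather than at it.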

> merge index failed with checksum failed (hardware problem?)
> -----------------------------------------------------------
>
>                 Key: LUCENE-9428
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9428
>             Project: Lucene - Core
>          Issue Type: Bug
>         Environment: lucene version: 5.5.4
> jdk version: jdk1.8-1.8.0_231-fcs
>            Reporter: AllenL
>            Priority: Major
>
> Recently, an application using Elasticsearch failed to merge an index, with the following exception:
>  
> {code:java}
> [2020-07-03 13:37:34,113][ERROR][index.engine             ] [Deathbird] [st-sess][4] failed to merge
> org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
> at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334)
> at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:451)
> at org.apache.lucene.codecs.blocktree.BlockTreeTermsReader.checkIntegrity(BlockTreeTermsReader.java:333) 
> at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.checkIntegrity(PerFieldPostingsFormat.java:317) 
> at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:96) 
> at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:193) 
> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:95) 
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4086) 
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3666) 
> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588) 
> at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94) 
> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
> [2020-07-03 13:37:34,203][WARN ][index.engine             ] [Deathbird] [st-sess][4] failed engine [merge failed]
> org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=31f090d9 actual=d9697caa (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/lib/elasticsearch/shterm-17412c54-f974-11e9-9eef-80615f029e06/nodes/0/indices/st-sess/4/index/_3jm_Lucene50_0.tim")))
> at org.elasticsearch.index.engine.InternalEngine$EngineMergeScheduler$1.doRun(InternalEngine.java:1237) 
> at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
> at java.lang.Thread.run(Thread.java:748){code}
>  
> The exception suggests a possible hardware problem, so I checked the hardware and found nothing abnormal. The checks were as follows:
>  # Checked devices /dev/sda and /dev/sdb; no hardware errors were found.
>      Command used: smartctl --xall /dev/sdx
>  # Checked the message log /var/log/messages; no hardware problems were reported.
>  # The system has a state-detection script; the system load it recorded is normal, and IOwait is very low.
>  


