Posted to issues@lucene.apache.org by "Thomas Hoffmann (Jira)" <ji...@apache.org> on 2022/05/20 18:18:00 UTC

[jira] [Created] (LUCENE-10583) Deadlock with MMapDirectory while waitForMerges

Thomas Hoffmann created LUCENE-10583:
----------------------------------------

             Summary: Deadlock with MMapDirectory while waitForMerges
                 Key: LUCENE-10583
                 URL: https://issues.apache.org/jira/browse/LUCENE-10583
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/index
    Affects Versions: 8.11.1
         Environment: Java 17, OS: Windows 2016
            Reporter: Thomas Hoffmann


Hello,

A deadlock occurred in our application. We are using MMapDirectory on Windows 2016 and got the following stack trace:
{code:java}
"https-openssl-nio-443-exec-30" #166 daemon prio=5 os_prio=0 cpu=78703.13ms elapsed=81248.18s tid=0x000000002860af10 nid=0x237c in Object.wait()  [0x00000000413fc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(java.base@17.0.2/Native Method)
    - waiting on <no object reference available>
    at org.apache.lucene.index.IndexWriter.doWait(IndexWriter.java:4983)
    - locked <0x00000006ef1fc020> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.waitForMerges(IndexWriter.java:2697)
    - locked <0x00000006ef1fc020> (a org.apache.lucene.index.IndexWriter)
    at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1236)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1278)
    at com.speed4trade.ebs.module.search.SearchService.updateSearchIndex(SearchService.java:1723)
    - locked <0x00000006d5c00208> (a org.apache.lucene.store.MMapDirectory)
    at com.speed4trade.ebs.module.businessrelations.ticket.TicketChangedListener.postUpdate(TicketChangedListener.java:142)
...{code}
All other threads were waiting to lock <0x00000006d5c00208>, which was never released.

A Lucene merge thread was also blocked; I don't know whether this is relevant:
{code:java}
"Lucene Merge Thread #0" #18466 daemon prio=5 os_prio=0 cpu=15.63ms elapsed=3499.07s tid=0x00000000459453e0 nid=0x1f8 waiting for monitor entry  [0x000000005da9e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.lucene.store.FSDirectory.deletePendingFiles(FSDirectory.java:346)
    - waiting to lock <0x00000006d5c00208> (a org.apache.lucene.store.MMapDirectory)
    at org.apache.lucene.store.FSDirectory.maybeDeletePendingFiles(FSDirectory.java:363)
    at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:248)
    at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
    at org.apache.lucene.index.ConcurrentMergeScheduler$1.createOutput(ConcurrentMergeScheduler.java:289)
    at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:121)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:130)
    at org.apache.lucene.codecs.lucene87.Lucene87StoredFieldsFormat.fieldsWriter(Lucene87StoredFieldsFormat.java:141)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:227)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:105)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684){code}
It looks like the merge operation never finished, so the lock was never released.

Is there any way to prevent this deadlock, or to investigate it further?
Unfortunately, a load test did not reproduce the problem.
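
One possible reading of the two traces, offered as a sketch rather than a diagnosis: the application frame SearchService.updateSearchIndex holds the monitor of the MMapDirectory (<0x00000006d5c00208>) while IndexWriter.close() waits for merges, and the merge thread needs that same monitor in FSDirectory.deletePendingFiles. The stand-alone Java example below reproduces this pattern using JDK classes only. FakeDirectory, the method names, and the timeouts are hypothetical stand-ins, not Lucene APIs, and a bounded wait replaces the repeated wait seen in the real trace so that the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class DirectoryLockSketch {

    /** Hypothetical stand-in for a directory whose methods synchronize on "this". */
    static final class FakeDirectory {
        synchronized void deletePendingFiles() {
            // No work needed; entering the method already requires this object's monitor.
        }
    }

    /**
     * Simulates "close the writer and wait for merges": starts a fake merge thread
     * that must enter a synchronized method of the directory, then waits up to
     * timeoutMs for it. Returns true if the fake merge finished in time.
     */
    static boolean closeAndWaitForMerge(FakeDirectory dir, long timeoutMs) {
        CountDownLatch mergeDone = new CountDownLatch(1);
        Thread merge = new Thread(() -> {
            dir.deletePendingFiles(); // blocks while another thread holds dir's monitor
            mergeDone.countDown();
        }, "fake-merge-thread");
        merge.setDaemon(true);
        merge.start();
        try {
            return mergeDone.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Pattern suggested by the trace: wait for the merge while holding the directory's monitor. */
    static boolean updateHoldingDirectoryMonitor(FakeDirectory dir) {
        synchronized (dir) {
            // The fake merge cannot enter deletePendingFiles() until this block exits.
            return closeAndWaitForMerge(dir, 200);
        }
    }

    /** Dedicated lock object (an assumption, not taken from the issue). */
    private static final Object UPDATE_LOCK = new Object();

    /** Alternative: serialize updates on a private lock so the directory's monitor stays free. */
    static boolean updateHoldingPrivateLock(FakeDirectory dir) {
        synchronized (UPDATE_LOCK) {
            return closeAndWaitForMerge(dir, 2000);
        }
    }

    public static void main(String[] args) {
        System.out.println("holding directory monitor, merge finished: "
                + updateHoldingDirectoryMonitor(new FakeDirectory()));
        System.out.println("holding private lock, merge finished: "
                + updateHoldingPrivateLock(new FakeDirectory()));
    }
}
```

If this reading applies, guarding the index update with a dedicated lock object instead of the Directory instance might let the merge thread proceed. Whether it matches the real application code would have to be checked against SearchService.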


