Posted to oak-issues@jackrabbit.apache.org by "Miroslav Smiljanic (Jira)" <ji...@apache.org> on 2020/03/30 15:19:00 UTC

[jira] [Created] (OAK-8986) Segment flush thread can remain in TIMED_WAITING state even when segment queue is empty

Miroslav Smiljanic created OAK-8986:
---------------------------------------

             Summary: Segment flush thread can remain in TIMED_WAITING state even when segment queue is empty
                 Key: OAK-8986
                 URL: https://issues.apache.org/jira/browse/OAK-8986
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: segment-azure
    Affects Versions: 1.26.0, 1.24.0
            Reporter: Miroslav Smiljanic
         Attachments: test.patch

If a thread is in the interrupted state during execution of [SegmentWriteQueue.addToQueue|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L166], an InterruptedException will be thrown and wrapped in an IOException.

Right before calling queue.offer, the element is added to the segmentsByUUID map, and in this case it is never removed.
Normally the removal happens in the thread that reads from the [queue|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L100] and invokes [consume(SegmentWriteAction segment)|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L117].

Since the item is not removed from the segmentsByUUID map, the [flusher|https://github.com/apache/jackrabbit-oak/blob/jackrabbit-oak-1.26.0/oak-segment-azure/src/main/java/org/apache/jackrabbit/oak/segment/azure/queue/SegmentWriteQueue.java#L183] thread will remain in the TIMED_WAITING state, because it keeps waiting for the map to become empty.
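
The failure mode described above, and one possible way to avoid it, can be sketched as follows. This is a simplified illustration and not the Oak code: the class SketchWriteQueue, the method pendingCount, the byte[] payload, the queue capacity and the one-second timeout are all assumptions made for the example; only the names segmentsByUUID, queue and addToQueue mirror the description. The idea is to roll back the map entry whenever the element did not make it into the queue, including when offer throws InterruptedException.

```java
import java.io.IOException;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for a segment write queue (not the Oak class).
class SketchWriteQueue {
    // Entries a flusher would wait on until the map is empty.
    private final Map<UUID, byte[]> segmentsByUUID = new ConcurrentHashMap<>();
    private final BlockingQueue<UUID> queue;

    SketchWriteQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    void addToQueue(UUID id, byte[] data) throws IOException {
        // The entry is registered before the element is offered to the queue.
        segmentsByUUID.put(id, data);
        boolean queued = false;
        try {
            // Throws InterruptedException if the calling thread is interrupted.
            queued = queue.offer(id, 1, TimeUnit.SECONDS);
            if (!queued) {
                throw new IOException("Can't add segment to the queue");
            }
        } catch (InterruptedException e) {
            // Keep the interrupt visible to callers, then report the failure.
            Thread.currentThread().interrupt();
            throw new IOException(e);
        } finally {
            if (!queued) {
                // Roll back, so no entry is left behind without a queued element.
                segmentsByUUID.remove(id);
            }
        }
    }

    // Number of entries a flusher would still be waiting for.
    int pendingCount() {
        return segmentsByUUID.size();
    }
}
```

With this arrangement, calling addToQueue from an already interrupted thread still fails with an IOException, but pendingCount() returns 0 afterwards, so a flusher that waits for an empty map would not be held up by the orphaned entry.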

The TarMK flush thread exclusively holds a monitor needed by a number of other threads, causing the repository to be blocked.


{noformat}
"TarMK flush [/opt/aem/launcher/repository/segmentstore-composite-global]" #82 daemon prio=5 os_prio=0 cpu=83628.24ms elapsed=291420.48s tid=0x00007fce902f3000 nid=0x1c2b in Object.wait()  [0x00007fce00aa5000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.3/Native Method)
	- waiting on <no object reference available>
	at org.apache.jackrabbit.oak.segment.azure.queue.SegmentWriteQueue.flush(SegmentWriteQueue.java:183)
	- waiting to re-lock in wait() <0x00000006b4911830> (a java.util.concurrent.ConcurrentHashMap)
	at org.apache.jackrabbit.oak.segment.azure.AzureSegmentArchiveWriter.flush(AzureSegmentArchiveWriter.java:187)
	at org.apache.jackrabbit.oak.segment.file.tar.TarWriter.flush(TarWriter.java:186)
	- locked <0x00000006b4911960> (a java.lang.Object)
	at org.apache.jackrabbit.oak.segment.file.tar.TarFiles.flush(TarFiles.java:535)
	at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$tryFlush$9(FileStore.java:359)
	at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$232/0x000000080067ac40.flush(Unknown Source)
	at org.apache.jackrabbit.oak.segment.file.TarRevisions.doFlush(TarRevisions.java:236)
	at org.apache.jackrabbit.oak.segment.file.TarRevisions.tryFlush(TarRevisions.java:216)
	at org.apache.jackrabbit.oak.segment.file.FileStore.tryFlush(FileStore.java:357)
	at org.apache.jackrabbit.oak.segment.file.FileStore.lambda$new$5(FileStore.java:212)
	at org.apache.jackrabbit.oak.segment.file.FileStore$$Lambda$203/0x000000080064b440.run(Unknown Source)
	at org.apache.jackrabbit.oak.segment.file.SafeRunnable.run(SafeRunnable.java:67)
	at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
	at java.util.concurrent.FutureTask.runAndReset(java.base@11.0.3/FutureTask.java:305)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11.0.3/ScheduledThreadPoolExecutor.java:305)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)

{noformat}

Here is the test case that demonstrates the problem. 

[^test.patch]


