You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-issues@jackrabbit.apache.org by "Chetan Mehrotra (JIRA)" <ji...@apache.org> on 2015/03/04 07:06:05 UTC

[jira] [Updated] (OAK-2570) Open indexes in IndexTracker non blocking way

     [ https://issues.apache.org/jira/browse/OAK-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-2570:
---------------------------------
    Attachment: OAK-2570.patch

[Attached patch|^OAK-2570.patch] wraps the {{LuceneIndexProvider}} with {{BackgroundObserver}}

[~mduerig] Can you review the usage around {{BackgroundObserver}}

[~alex.parvulescu] This would make writing test bit difficult as it would not be easy to determine when the index has catched up. There are two options
* Disable the usage of {{BackgroundObserver}} via config. So on test setup we set {{enableOpenIndexAsync}} to false
* Use a different way. Currently we record the time at which the index writer is closed. We can surface that information as part of JMX stats for Lucene 

[~mreutegg] [~mduerig] [~alex.parvulescu] [~mmarth] Given that such operation might cause delay in some critical code path I would like to use {{BackgroundObserver}} by default. Or for safety purpose we expose the OSGi config and only enable this for system where such slowness is observed. Later we can enable it by default. On trunk we would keep it enabled

> Open indexes in IndexTracker non blocking way
> ---------------------------------------------
>
>                 Key: OAK-2570
>                 URL: https://issues.apache.org/jira/browse/OAK-2570
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: oak-lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.2, 1.0.12
>
>         Attachments: OAK-2570.patch
>
>
> Currently {{IndexTracker}} opens the indexes in same thread where the commit happens as it acts as an {{Observer}}. If opening the index takes long time it would block the commit thread and might introduce delay in other operation on same thread. 
> For example {{DocumentNodeStore}} invokes the {{Observer}} in {{backgroundRead}} and that call would not complete unless the {{Observer}} returns back. For a system to be responsive the {{backgroundRead}} has to be quick as it holds the write lock. Due to current implementation at time system turn unresponsive. 
> Looking at the threaddump from one such unresponsive system it appears that slowness is due to Lucene index taking time to reopen. 
> Thread which hold the {{backgroundOperationLock}} for {{read}} is 
> {noformat}
> "pool-9-thread-1" Id=101 in RUNNABLE
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:146)
>     at org.apache.jackrabbit.core.data.LazyFileInputStream.open(LazyFileInputStream.java:104)
>     at org.apache.jackrabbit.core.data.LazyFileInputStream.read(LazyFileInputStream.java:163)
>     at com.google.common.io.ByteStreams.read(ByteStreams.java:828)
>     at com.google.common.io.ByteStreams.readFully(ByteStreams.java:695)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:218)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
>     at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
>     at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
>     at org.apache.lucene.util.fst.FST.<init>(FST.java:318)
>     at org.apache.lucene.util.fst.FST.<init>(FST.java:304)
>     at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader.<init>(BlockTreeTermsReader.java:484)
>     at org.apache.lucene.codecs.BlockTreeTermsReader.<init>(BlockTreeTermsReader.java:176)
>     at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:437)
>     at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
>     at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
>     at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:62)
>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
>     at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
>     at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.<init>(IndexNode.java:94)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.open(IndexNode.java:62)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker$1.leave(IndexTracker.java:96)
>     at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
>     at org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.dispatch(DocumentNodeState.java:433)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.compareAgainstBaseState(DocumentNodeState.java:260)
>     at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
>     at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:105)
>       - locked org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker@6243264e
>     at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:61)
>     at org.apache.jackrabbit.oak.spi.commit.CompositeObserver.contentChanged(CompositeObserver.java:53)
>       - locked org.apache.jackrabbit.oak.spi.commit.CompositeObserver@2687fd48
>     at org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher.contentChanged(ChangeDispatcher.java:79)
>       - locked org.apache.jackrabbit.oak.spi.commit.ChangeDispatcher@2a0595bb
>     at org.apache.jackrabbit.oak.plugins.document.CommitQueue.afterTrunkCommit(CommitQueue.java:131)
>       - locked org.apache.jackrabbit.oak.plugins.document.CommitQueue@54a93d8a
>     at org.apache.jackrabbit.oak.plugins.document.CommitQueue.done(CommitQueue.java:83)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.done(DocumentNodeStore.java:586)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:187)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:91)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:39)
>     at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$InMemory.merge(AbstractNodeStoreBranch.java:525)
>     at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.merge(AbstractNodeStoreBranch.java:318)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.merge(DocumentNodeStoreBranch.java:135)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.merge(DocumentRootBuilder.java:159)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.merge(DocumentNodeStore.java:1339)
>     at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.mergeWithConcurrencyCheck(AsyncIndexUpdate.java:420)
>     at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:389)
>     at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:300)
>       - locked org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@401daa3f
>     at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
>     at org.quartz.core.JobRunShell.run(JobRunShell.java:207)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Further {{DocumentNodeStore background thread}} is also waiting to acquire the write lock
> {noformat}
> "DocumentNodeStore background thread" Id=82 in WAITING on lock=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@6990d994
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>     at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>     at org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:156)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:1662)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundOperations(DocumentNodeStore.java:1456)
>       - locked org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore@350b4c49
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundOperation.execute(DocumentNodeStore.java:2064)
>     at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:2039)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Now as javadoc of {{ReentrantReadWriteLock}} the readLock would not be granted if there is a waiter for write lock. Hence in such a case no further reads would be possible and system would not respond well



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)