You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (Jira)" <ji...@apache.org> on 2020/08/25 08:24:00 UTC

[jira] [Commented] (OAK-9184) Very slow, potential endless loop in LucenePropertyIndex.loadDocs()

    [ https://issues.apache.org/jira/browse/OAK-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183860#comment-17183860 ] 

Thomas Mueller commented on OAK-9184:
-------------------------------------

I want to ensure this loop stops after a configurable time, by default after 2 minutes. That is to ensure very large indexes, combined with slow queries, can not cause issues.

> Very slow, potential endless loop in LucenePropertyIndex.loadDocs()
> -------------------------------------------------------------------
>
>                 Key: OAK-9184
>                 URL: https://issues.apache.org/jira/browse/OAK-9184
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>    Affects Versions: 1.22.4, 1.32.0, 1.8.23
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>            Priority: Major
>
> We found one case of a very slow, possibly endless loop in this method. 
> {noformat}
> java.lang.Thread.State: RUNNABLE at org.apache.lucene.search.TopScoreDocCollector$InOrderPagingScoreDocCollector.collect(TopScoreDocCollector.java:85) at org.apache.lucene.search.Scorer.score(Scorer.java:65) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448) at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:243) at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:434) at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:353)
> {noformat}
> Background: Many threads are waiting to lock here:
> {noformat}
> 	- waiting to lock <0x00007fbd804a9448> (a org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.acquireIndexNode(IndexTracker.java:202)
>     public IndexNode acquireIndexNode(String path) {
>         IndexNodeManager index = indices.get(path);
>         IndexNode indexNode = index != null ? index.acquire() : null;
>         if (indexNode != null) {
>             return indexNode;
>         } else {
>             return findIndexNode(path); <<= synchronized method
>         }
>     }
> {noformat}
> And the thread that is holding that (synchronized) lock on IndexTracker is also waiting to get a write lock here:
> {noformat}
> "oak-lucene-992" #21559 daemon prio=1 os_prio=0 tid=0x00007ed0280ad000 nid=0x1be8 waiting on condition [0x00007fb96c352000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00007fbedad77210> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> 	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNodeManager.close(IndexNodeManager.java:174)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.close(IndexTracker.java:104)
> 	- locked <0x00007fbd804a9448> (a org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:114)
> 	- locked <0x00007fbd804a9448> (a org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker)
>   void close() throws IOException {
>         lock.writeLock().lock(); <<== waiting here
>         try {
>             checkState(!closed);
>             closed = true;
>         } finally {
>             lock.writeLock().unlock();
>         }
> {noformat}
> It looks like a read lock is not released. The places where a read lock is acquired:
> {noformat}
>     IndexNode acquire() {
>         lock.readLock().lock();
>   (released if the method returns null or throws an exception)
>     private void release() {
>         lock.readLock().unlock();
>     }
> {noformat}
> acquire() is called in acquireIndexNode:
> {noformat}
>     public IndexNode acquireIndexNode(String path) {
>         IndexNodeManager index = indices.get(path);
>         IndexNode indexNode = index != null ? index.acquire() : null; <<== here
>         if (indexNode != null) {
>             return indexNode;
>         } else {
>             return findIndexNode(path);
>         }
>     }
>     private synchronized IndexNode findIndexNode(String path) {
>         // Retry the lookup from acquireIndexNode now that we're
>         // synchronized. The acquire() call is guaranteed to succeed
>         // since the close() method is also synchronized.
>         IndexNodeManager index = indices.get(path);
>         if (index != null) {
>             IndexNode indexNode = index.acquire(); <<== here
>             return checkNotNull(indexNode);
>         }
> {noformat}
> This method is called in multiple places (getPlanDescription, getPlans, loadDocs, getSize, getOldFullTextIndexPath, dumpIndexContent, getFieldInfo, getFieldTermPrefixInfo, getIndexedPaths, getIndexStats) but always released in "finally". 
> There is one thread that seems to hold this read lock:
> {noformat}
> "10.13.0.73 [1597750655085] GET /mnt/overlay/granite/ui/content/shell/omnisearch/searchresults.html HTTP/1.1" #10953 prio=5 os_prio=0 tid=0x00007fb97000f000 nid=0x8b59 runnable [0x00007fb96c44c000]
>    java.lang.Thread.State: RUNNABLE
> 	at org.apache.lucene.search.TopScoreDocCollector$InOrderPagingScoreDocCollector.collect(TopScoreDocCollector.java:85)
> 	at org.apache.lucene.search.Scorer.score(Scorer.java:65)
> 	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
> 	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
> 	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
> 	at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:243)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:434)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:353)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:339)
> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor$1.hasNext(LucenePropertyIndex.java:1718)
> 	at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)
> 	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> 	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> 	at org.apache.jackrabbit.oak.plugins.index.Cursors$PathCursor.hasNext(Cursors.java:216)
> 	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor.hasNext(LucenePropertyIndex.java:1751)
> 	at org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:432)
> 	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:824)
> 	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:851)
> 	at com.google.common.collect.Iterators$5.hasNext(Iterators.java:542)
> 	at org.apache.jackrabbit.oak.query.FilterIterators$DistinctIterator.fetchNext(FilterIterators.java:141)
> 	at org.apache.jackrabbit.oak.query.FilterIterators$DistinctIterator.hasNext(FilterIterators.java:155)
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:98)
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:123)
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:78)
> 	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate$SynchronizedIterator.next(SessionDelegate.java:702)
> 	at org.apache.jackrabbit.oak.jcr.query.PrefetchIterator.size(PrefetchIterator.java:135)
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$3.getSize(QueryResultImpl.java:143)
> 	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl.getRows(QueryResultImpl.java:140)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)