Posted to issues@solr.apache.org by "Torsten Bøgh Köster (Jira)" <ji...@apache.org> on 2022/10/26 16:24:00 UTC

[jira] [Commented] (SOLR-16497) Allow finer grained locking in SolrCores

    [ https://issues.apache.org/jira/browse/SOLR-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17624586#comment-17624586 ] 

Torsten Bøgh Köster commented on SOLR-16497:
--------------------------------------------

The GitHub PR is still pending. We have the changes running in production on Solr 8.11.2, but adapting them to Solr 9.x is taking longer than expected. We'll keep you updated!

> Allow finer grained locking in SolrCores
> ----------------------------------------
>
>                 Key: SOLR-16497
>                 URL: https://issues.apache.org/jira/browse/SOLR-16497
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: 9.0, 8.11.2
>            Reporter: Torsten Bøgh Köster
>            Priority: Major
>         Attachments: solrcores_locking.png, solrcores_locking_fixed.png
>
>
> Access to loaded SolrCore instances, for reads and writes alike, is synchronized in SolrCores#getCoreFromAnyList. Every request passes through this method, because every HTTP request is assigned the SolrCore it operates on.
> h3. Background
> Under heavy load, we found that application halts inside Solr become a serious problem in high-traffic environments. Using Java Flight Recorder recordings, we found high accumulated application halt times on the modifyLock in SolrCores. In our case, this means that we can only utilize our machines up to 25% CPU usage. With the fix applied, utilization of up to 80% is achievable.
> In our case, this specific locking problem was masked by another locking problem in the SlowCompositeReaderWrapper. We'll submit our fix for the SlowCompositeReaderWrapper locking problem in a follow-up issue.
> h3. Problem
> Our Solr instances use the collapse component heavily. The instances run with 32 cores and 32 GB of Java heap on a rather small index (4 GB). The instances scale out at 50% CPU load. We take 60-second Java Flight Recorder snapshots as soon as CPU usage exceeds 50%.
>  !solrcores_locking.png|height=1024px! 
> During our 60s Java Flight Recorder snapshot, the ~2k Jetty acceptor threads accumulated more than 12h of locking time inside SolrCores on the modifyLock instance, which is used as a synchronized lock (see screenshot). With this fix, exclusive locking is reduced to write accesses only. We validated this using another JFR snapshot:
>  !solrcores_locking_fixed.png|height=1024px! 
> We ran this code for a couple of weeks in our live environment.
> h3. Implementation
> The synchronized modifyLock is replaced by a ReentrantReadWriteLock. This allows concurrent reads from the internal SolrCore instance list (cores) but grants exclusive access to write operations.
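
To illustrate the pattern described above, here is a minimal sketch of a registry guarded by a ReentrantReadWriteLock. The class CoreRegistry and its methods are hypothetical stand-ins invented for this example; they are not the actual SolrCores code or the code in the pending PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a core registry; NOT the real SolrCores class.
final class CoreRegistry<C> {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, C> cores = new HashMap<>(); // guarded by lock

  // Lookups take the shared read lock, so readers do not block each other.
  C get(String name) {
    lock.readLock().lock();
    try {
      return cores.get(name);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Modifications take the exclusive write lock and block readers and writers.
  C put(String name, C core) {
    lock.writeLock().lock();
    try {
      return cores.put(name, core);
    } finally {
      lock.writeLock().unlock();
    }
  }

  C remove(String name) {
    lock.writeLock().lock();
    try {
      return cores.remove(name);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

With a plain synchronized block, every get() would serialize on the same monitor. Here only put() and remove() are exclusive, which matches a workload where lookups happen on every request and modifications are rare.
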
> We need to ensure that only a single transientSolrCoreCache is created inside TransientSolrCoreCacheFactoryDefault. As we now allow multiple concurrent reader threads, we make the initial call to getTransientCacheHandler() while holding the write lock inside the load() method. This ensures that a single instance of transientSolrCoreCache is created.
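
The single-instance guarantee can be sketched as follows, assuming the cache is created once under the write lock and only read afterwards. LazyCacheHolder, its load() method and the Supplier-based factory are hypothetical names chosen for this example and do not mirror the real TransientSolrCoreCacheFactoryDefault API.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical holder; the real Solr classes are structured differently.
final class LazyCacheHolder<T> {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Supplier<T> factory;
  private T cache; // guarded by lock

  LazyCacheHolder(Supplier<T> factory) {
    this.factory = factory;
  }

  // Creates the cache at most once. The write lock is exclusive, so two
  // threads can never run the factory concurrently.
  void load() {
    lock.writeLock().lock();
    try {
      if (cache == null) {
        cache = factory.get();
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Readers only take the shared read lock and never create the instance.
  T get() {
    lock.readLock().lock();
    try {
      return cache;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Creating the instance up front avoids having to upgrade from a read lock to a write lock later, which ReentrantReadWriteLock does not support.
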
> The lock signaling between SolrCore and CoreContainer gets replaced by a Condition that is tied to the write lock.
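
As a rough sketch of what signaling through a Condition tied to the write lock can look like: the class PendingCoreTracker and its methods are hypothetical and only illustrate the mechanism, not the actual signaling between SolrCore and CoreContainer. Note that only the write lock of a ReentrantReadWriteLock supports newCondition(); the read lock throws UnsupportedOperationException.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical tracker for cores with pending operations; illustration only.
final class PendingCoreTracker {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Conditions are only available on the write lock.
  private final Condition changed = lock.writeLock().newCondition();
  private final Set<String> pending = new HashSet<>(); // guarded by lock

  void markPending(String name) {
    lock.writeLock().lock();
    try {
      pending.add(name);
    } finally {
      lock.writeLock().unlock();
    }
  }

  void markDone(String name) {
    lock.writeLock().lock();
    try {
      pending.remove(name);
      changed.signalAll(); // replaces notifyAll() on a monitor object
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Waits until the core is no longer pending; returns false on timeout
  // or interruption. awaitNanos() releases the write lock while waiting.
  boolean awaitDone(String name, long timeoutMillis) {
    long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    lock.writeLock().lock();
    try {
      while (pending.contains(name)) {
        if (nanos <= 0L) {
          return false;
        }
        nanos = changed.awaitNanos(nanos);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

The waiting loop re-checks its predicate after every wake-up, which guards against spurious wake-ups in the same way a wait()/notifyAll() loop on a synchronized monitor would.
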
> h3. Summary
> This change allows finer-grained access to the list of open SolrCores. The reduced blocking on read access shows up as shorter blocking times in the Solr application (see screenshot).
> This change was authored jointly by Dennis Berger, Torsten Bøgh Köster and Marco Petris.


